Abhishek/metabrainz GSoC2016Proposal

<br>


<u>'''Phase II : Vector Space Model (VSM)'''</u>
<p> This is a basic yet effective model. The idea is to represent every audio entity as a vector in the feature space; similarity between two audio items then reduces to a distance or angle computation between their vectors.</p>
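<p> As a rough illustration (not part of the proposal), two audio items represented as vectors of hypothetical low-level descriptors can be compared with cosine similarity; real descriptors live on very different scales and would need normalization first:</p>
<source lang="python">
import numpy as np

def cosine_similarity(a, b):
    """Angle-based similarity of two feature vectors, in [-1, 1]."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# hypothetical feature vectors: [loudness, bpm, spectral centroid]
song_a = np.array([0.42, 118.0, 2250.0])
song_b = np.array([0.40, 121.0, 2300.0])

print(cosine_similarity(song_a, song_b))  # close to 1.0 => similar content
</source>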


''Reference Paper'': http://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf <br>
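<p> The referenced paper builds fingerprints by picking peaks ("landmarks") in the spectrogram and hashing pairs of nearby peaks. A minimal sketch of that idea, assuming scipy and numpy are available; the peak-picking and hashing parameters here are illustrative, not the paper's exact choices:</p>
<source lang="python">
import numpy as np
from scipy.io import wavfile
from scipy.ndimage import maximum_filter
from scipy.signal import spectrogram

def fingerprint(path, fan_out=5):
    """Hash pairs of spectrogram peaks, constellation-style."""
    rate, samples = wavfile.read(path)
    if samples.ndim > 1:                 # mix stereo down to mono
        samples = samples.mean(axis=1)
    f, t, sxx = spectrogram(samples, fs=rate, nperseg=1024)
    # local maxima of the magnitude spectrogram act as landmarks
    peaks = (sxx == maximum_filter(sxx, size=20)) & (sxx > sxx.mean())
    fbins, tbins = np.nonzero(peaks)
    order = np.argsort(tbins)            # process peaks in time order
    fbins, tbins = fbins[order], tbins[order]
    hashes = []
    for i in range(len(tbins)):
        # pair each peak with a few peaks that follow it in time
        for j in range(i + 1, min(i + 1 + fan_out, len(tbins))):
            dt = tbins[j] - tbins[i]
            hashes.append((hash((fbins[i], fbins[j], dt)), tbins[i]))
    return hashes
</source>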


<u>'''Phase IV : Deep Neural Network Model (Deep Learning Approach)'''</u>
<p> This method learns features of an audio item automatically; it is an unsupervised approach to learning important features. In a deep neural network, each layer extracts a more abstract representation of the audio: the initial set of inputs is passed through the input layer, and computations are done at the hidden layers. After passing through some hidden layers, we obtain a new feature representation of the initial input features. These learned feature vectors become the representations of the audio items. Machine learning algorithms can then be applied to these features for various classification and clustering jobs, which would help in dataset building and model-set creation.</p>


<p>[[File:Deepnets.png|center|600px]] </p>
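<p> As a hedged sketch of this idea (the framework choice is an assumption; the proposal does not name one), a small autoencoder can be trained on precomputed feature vectors, and its bottleneck activations used as the learned representation:</p>
<source lang="python">
import numpy as np
from keras.layers import Dense, Input
from keras.models import Model

n_features = 128   # size of the precomputed feature vector (assumed)
encoding_dim = 32  # size of the learned representation

inp = Input(shape=(n_features,))
hidden = Dense(64, activation="relu")(inp)
code = Dense(encoding_dim, activation="relu")(hidden)  # bottleneck
hidden_dec = Dense(64, activation="relu")(code)
out = Dense(n_features, activation="linear")(hidden_dec)

autoencoder = Model(inp, out)   # trained to reconstruct its input
encoder = Model(inp, code)      # exposes the learned representation
autoencoder.compile(optimizer="adam", loss="mse")

# placeholder data; in practice, AcousticBrainz feature vectors
X = np.random.rand(1000, n_features)
autoencoder.fit(X, X, epochs=10, batch_size=32, verbose=0)

embeddings = encoder.predict(X)  # one vector per audio item
</source>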
<u>''Advantages''</u>
<ul>
<li> Saves the effort of hand-engineering audio features (metadata extraction and manual labeling). </li>
<li> It can be used to find similar audio/music, which is useful for content-based search and, combined with user information, for recommending songs.</li>
<li> It can be used to detect duplicate songs: it often happens that two songs end up with different IDs but are in fact the same recording (see the sketch after this list).</li>
</ul>
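<p> As a rough illustration (not part of the proposal), duplicate detection can be built on top of the learned representations: two items whose vectors are nearly identical get flagged. The <code>is_duplicate</code> helper and its threshold below are hypothetical.</p>
<source lang="python">
import numpy as np

def is_duplicate(vec_a, vec_b, threshold=0.99):
    """Flag two recordings as duplicates when their learned
    representations are almost identical (threshold is illustrative)."""
    sim = np.dot(vec_a, vec_b) / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b))
    return sim >= threshold

# toy vectors standing in for learned audio representations
print(is_duplicate(np.array([0.11, 0.82, 0.33]), np.array([0.11, 0.82, 0.34])))
</source>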
<h3> Deliverables </h3>
<p> An audio retrieval system that supports the following features:</p>
<ul>
<li> Client API(s) for searching.</li>
<li> Support for text-based search. Users can search for relevant audio by providing a tag/label or keywords.</li>
<li> Support for content-based search. Similar audio is returned based on content provided by the user; a query can also be made by supplying a piece of audio.</li>
<li> Fingerprinting and similarity features will help in duplicate detection.</li>
<li> Support for advanced queries. Users will be able to filter and group results.</li>
<li> Visualization support for the data powered by Kibana.</li>
<li> Proper documentation of the work for users and developers.</li>
</ul>
<p> Sample code giving an estimate of the API(s) that would be provided at the end.</p>
<source lang="python">
"""
A sample to show some of the API(s) provided by
AB search system and its usage.
"""
from AB import search
# create a search client
c = search.SearchClient()
# set the API key for the client; authentication may not be part
# of the final design, depending on the AcousticBrainz community
c.set_token("<your_api_key>", "token")
# get sound based on id
sound = c.get_sound(108)
# retrieve relevant results for a query
results = c.text_search(query="hip hop")
# the fields parameter lets you specify the information
# you want in the results list
results = c.text_search(query="hip hop", fields="id,name,previews")
# applying filter
sounds = c.text_search(query="hip hop", filter="tag:loop", fields="id,name,url")
# search sounds using a content-based descriptor target and/or filter
sounds = c.content_based_search(target="lowlevel.pitch.mean:220",
                                descriptors_filter="lowlevel.pitch_instantaneous_confidence.mean:[0.8 TO 1]",
                                fields="id,name,url")
# content-based search using a given piece of audio
sounds = c.content_based_search(audiofile="sample.mp3")
# Combine both text and content-based queries.
sounds = c.combined_search(target="lowlevel.pitch.mean:220", filter="single-note")
# returns a confidence score for the possibility of two
# audio files being exactly the same
audiofile1, audiofile2 = "first.mp3", "second.mp3"  # paths to local files
dup_score = c.get_duplicate_score(audiofile1, audiofile2)
# returns True if duplicate, else False
check = c.is_duplicate(audiofile1, audiofile2)
</source>


<h3> Time </h3>