Rohandalvi (talk | contribs) No edit summary
(11 intermediate revisions by 3 users not shown)
<h1> Introduction</h1><br />
<p> This project is an extension of the GSoC Speech project. It adds support for voice commands inside the Firefox browser and has led to extensions of both the SpeechRecognition API and the text-to-speech API.</p>
<br />
<h1> Initial contributors:</h1>
<h1> Technical Stuff </h1><br />
<h2> Speech Input API </h2>
The speech input API aims to provide an alternative input method for web applications, without using a keyboard or other physical device. This API can be used to input commands, fill input elements, give directions, etc. It is based on SpeechRequest[[http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0023/speechrequest.xml.html]].
*The developer should be able to start and stop requests, handle errors, and manage multiple requests as required.
<br>
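The requirement above (start, stop, error handling, multiple requests) could be sketched roughly as follows. This is only an illustration under stated assumptions: <tt>SpeechRequestManager</tt> and the <tt>recognizer</tt> interface (<tt>begin</tt>/<tt>end</tt>) are hypothetical names invented here, not the SpeechRequest proposal's API and not the project's code. The sketch queues requests so that only one listens at a time.

```javascript
// Hypothetical sketch: none of these names come from the SpeechRequest proposal.
class SpeechRequestManager {
  constructor(recognizer) {
    // Assumed interface: recognizer.begin(onResult, onError) and recognizer.end()
    this.recognizer = recognizer;
    this.active = null; // request currently listening, if any
    this.pending = [];  // requests waiting for their turn
  }

  // Start a request now, or queue it if another one is already listening.
  start(request) {
    if (this.active) {
      this.pending.push(request);
      return;
    }
    this.active = request;
    // Ignore late callbacks from a request that is no longer the active one.
    const settle = (notify) => {
      if (this.active === request) this._finish(notify);
    };
    this.recognizer.begin(
      (text) => settle(() => request.onresult(text)),
      (err) => settle(() => request.onerror(err))
    );
  }

  // Stop the active request without delivering a result.
  stop() {
    if (!this.active) return;
    this.recognizer.end();
    this._finish(() => {});
  }

  // Clear the active request, notify its owner, then start the next queued one.
  _finish(notify) {
    this.active = null;
    notify();
    const next = this.pending.shift();
    if (next) this.start(next);
  }
}

// Usage with a stand-in recognizer that we drive by hand.
const fakeRecognizer = {
  begin(onResult, onError) {
    this.onResult = onResult;
    this.onError = onError;
  },
  end() {},
};
const requestLog = [];
const manager = new SpeechRequestManager(fakeRecognizer);
manager.start({
  onresult: (t) => requestLog.push("first: " + t),
  onerror: (e) => requestLog.push("first failed: " + e),
});
manager.start({
  onresult: (t) => requestLog.push("second: " + t),
  onerror: (e) => requestLog.push("second failed: " + e),
});
fakeRecognizer.onResult("go back");  // first request completes, second begins
fakeRecognizer.onError("no-speech"); // second request reports an error
```

Queuing is just one possible policy for multiple requests; an implementation could equally cancel the active request when a new one starts.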
== Text To Speech API ==
The text to speech API will be based on Google's proposal ([http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0022/htmltts-draft.html]). This API can be used for speech translation, turn-by-turn navigation, dialog systems, etc.
*Which speech engines to use is yet to be decided.
<br>
== Hacking Firefox UI ==
To implement speech-to-text and text-to-speech in practice, we had to hack the UI of the development version of Mozilla Firefox (Nightly). This meant modifying the files browser.xul, browser.css and browser.js. To address security and privacy concerns, we added two separate buttons to the UI: one for initiating speech input from the user and the other for converting the selected text to speech. We also added functions to the JavaScript file (browser.js) to implement the behaviour of those two buttons.
<h1> Tentative Schedule </h1><br />
<ul>
<li>28<sup>th</sup> January - 4<sup>th</sup> February - SpeechRequest + endpointer code compiled.</li>
<li>5<sup>th</sup> February - 11<sup>th</sup> February - Fixes to microphone handling on Linux, some other small fixes, getting familiar with the code.</li>
<li>12<sup>th</sup> February - 18<sup>th</sup> February - Continue with small fixes (for example, simplify thread handling).</li>
<li>19<sup>th</sup> February - 25<sup>th</sup> February - Christmas etc., not much progress, but continue fixing the SpeechRequest API, possibly adding some new features.</li>
<li>26<sup>th</sup> February - 1<sup>st</sup> March - Holiday season, not much progress, but continue with the SpeechRequest implementation.</li>
<li>2<sup>nd</sup> March - 8<sup>th</sup> March - Get TTS working.</li>
<li>9<sup>th</sup> March - 15<sup>th</sup> March - Enhancements to the TTS implementation.</li>
<li>16<sup>th</sup> March - 22<sup>nd</sup> March - First speech commands: for example <tt>browser go back</tt> &amp; <tt>go forward</tt> etc.</li>
<li>23<sup>rd</sup> March - 29<sup>th</sup> March - More speech commands, maybe <tt>read entire text</tt>.</li>
</ul>
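<p>The speech commands planned in the schedule could be wired up with a small phrase-to-action table. The following is a minimal sketch, assuming the recognizer hands back a plain text transcript; <tt>createCommandDispatcher</tt>, <tt>normalize</tt> and the <tt>actions</tt> object are hypothetical names used only for illustration and are not taken from the project's code or from any Firefox API.</p>

```javascript
// Hypothetical sketch: map recognized phrases to browser actions.
// The phrases are the examples from the schedule; everything else is illustrative.

// Lowercase, trim, and collapse whitespace so small variations still match.
function normalize(transcript) {
  return transcript.toLowerCase().trim().replace(/\s+/g, " ");
}

// Build a dispatcher from an object of action callbacks supplied by the caller.
function createCommandDispatcher(actions) {
  const commands = new Map([
    ["browser go back", actions.goBack],
    ["go forward", actions.goForward],
  ]);
  // Returns true if the transcript matched a known command and it was run.
  return function dispatch(transcript) {
    const handler = commands.get(normalize(transcript));
    if (!handler) return false;
    handler();
    return true;
  };
}

// Usage with stand-in actions that only record what was called.
const calls = [];
const dispatch = createCommandDispatcher({
  goBack: () => calls.push("back"),
  goForward: () => calls.push("forward"),
});
dispatch("  Browser Go Back ");
dispatch("go forward");
const unknownHandled = dispatch("open a new window"); // not in the table
```

<p>An exact-match table keeps the sketch simple; a real command set would probably need looser matching, since recognizers rarely return the exact phrase.</p>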
<h1> Updates </h1>
<p>You can check out the first update [https://github.com/Harshank/speechAPI.git here].</p>
<p> We have recently released the second update at the same GitHub link; you can clone it, make changes and send a pull request.</p>