Changes

Web Speech API - Speech Recognition

1 byte added, 01:35, 20 November 2019
Where are our servers and who manages them?
* Store-Transcription: determines whether the user allows '''''Mozilla''''' to store the '''transcription''' on our own servers for further use (for example, to train our own models)
* Product-Tag: determines which product is making use of the API. It can be: vf for voicefill, fxr for Firefox Reality, wsa for Web Speech API, and so on.
<li>Once the proxy receives the request with the audio sample, it reads the headers that were set. Nothing is saved besides what the user requested, plus a timestamp and the user-agent. You can check it here: [https://github.com/mozilla/speech-proxy/blob/master/server.js#L324] </li>
<li>The proxy then detects the format of the file and decodes it to raw PCM. </li>
<li>A request is made to the STT provider set in the proxy's configuration file containing '''just the audio file'''. </li>
<li>Once the STT provider returns a response containing a transcription and a confidence score, the proxy forwards it to the client, which is then responsible for taking action according to the user's request.</li>
</ol>
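The steps above can be sketched roughly as follows. This is only an illustration in Python, not the actual proxy code (which lives in the linked server.js). The function names, the <code>ProxyResult</code> type, the pass-through decoder, and the assumption that an opt-in is sent as the header value <code>"1"</code> are all hypothetical. Handling of Product-Tag is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, Optional, Tuple

# Hypothetical provider interface: raw PCM bytes in,
# (transcription, confidence) out.
SttProvider = Callable[[bytes], Tuple[str, float]]


@dataclass
class ProxyResult:
    transcription: str
    confidence: float
    saved_record: Dict[str, str]  # what this sketch would persist


def decode_to_pcm(audio: bytes) -> bytes:
    """Placeholder for format detection and decoding.

    A real proxy would inspect the container/codec and decode it;
    this sketch simply returns the bytes unchanged.
    """
    return audio


def handle_request(
    headers: Dict[str, str],
    audio: bytes,
    stt: SttProvider,
    now: Optional[datetime] = None,
) -> ProxyResult:
    now = now or datetime.now(timezone.utc)

    # Always kept: a timestamp and the user-agent, nothing else by default.
    record = {
        "timestamp": now.isoformat(),
        "user_agent": headers.get("User-Agent", ""),
    }

    # Decode the upload to raw PCM.
    pcm = decode_to_pcm(audio)

    # The provider receives just the audio, no headers or user data.
    transcription, confidence = stt(pcm)

    # Keep the transcription only if the user opted in.
    # Assumption: "1" means opt-in; the real header values may differ.
    if headers.get("Store-Transcription") == "1":
        record["transcription"] = transcription

    # The result goes back to the client, which decides what to do with it.
    return ProxyResult(transcription, confidence, record)
```

With a stand-in provider such as <code>lambda pcm: ("hello world", 0.9)</code>, a request without the opt-in header yields a saved record containing only the timestamp and the user-agent.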
===== How does your proxy server work? Why do we have it? =====