| Line 1: | Line 1: | ||
Audio issues in getUserMedia and WebRTC: (bug numbers and details to be added) | Audio issues in getUserMedia and WebRTC: (bug numbers and details to be added) | ||
* Sampling Rate issues | * '''Sampling Rate issues''' | ||
** 44/44.1kHz mismatch -- {{bug| | ** 44/44.1kHz mismatch -- {{bug|886886}}<br>This causes a 0.23% drift in audio and a buffer buildup (delay) in MediaStreamGraph. This is only an issue when a sampling frequency of 44100Hz is used, in particular on Windows when the user or the system default has selected it. Note that some laptops and other devices come configured this way, or may have been reconfigured by the user for 44100Hz. | ||
*** We believe this issue is also affecting B2G | *** We believe this issue is also affecting B2G | ||
*** Since we're requesting 16000 currently (and perhaps/probably 32000 or 48000 in the future), it makes sense to do the resample at the 44/44.1->16000 point rather than adding a second resample | *** Since we're requesting 16000 currently (and perhaps/probably 32000 or 48000 in the future), it makes sense to do the resample at the 44/44.1->16000 point rather than adding a second resample. Bug 886886 has a patch to handle this using the Speex resampler in our tree. There are also patches to port upcoming work from webrtc.org using their sinc resampler; however, those are far more extensive and not upliftable. | ||
** Long-term clock-rate mismatches and drift<br>Because input, MediaStreamGraph (MSG) and output clocks may all be mismatched and/or change slightly over time, the code has to handle controlling delay and possible underflow. This normally isn't a problem when fed to a PeerConnection, as the other side will compensate, but for getUserMedia uses we have to care. | ** Long-term clock-rate mismatches and drift -- {{bug|884365}}<br>Because input, MediaStreamGraph (MSG) and output clocks may all be mismatched and/or change slightly over time, the code has to handle controlling delay and possible underflow. This normally isn't a problem when fed to a PeerConnection, as the other side will compensate, but for getUserMedia uses we have to care. | ||
*** Inputs need to either sync to the MSG frequency (system now, to be output clock), or they need to have optional resampling (see above about long-term mismatches) | *** Inputs need to either sync to the MSG frequency (system now, to be output clock), or they need to have optional resampling (see above about long-term mismatches) | ||
*** If the 44/44.1 issue is dealt with, any adjustments to the resampling ratio for handling this (mismatch/buffering/drift) would likely be small enough and slow enough not to need to pass raw data (though perhaps there's a tiny quality loss for the far-end listener). | *** If the 44/44.1 issue is dealt with, any adjustments to the resampling ratio for handling this (mismatch/buffering/drift) would likely be small enough and slow enough not to need to pass raw data (though perhaps there's a tiny quality loss for the far-end listener). | ||
*** Google tells us they have not yet dealt with this issue in Chrome. | |||
** Extra clock domains in MediaStreamGraph<br>MediaStreamGraph currently is clocked on the system clock; ongoing work is moving it to be clocked on the audio output clock. This will reduce total delay in MSG. Note that the output clock frequency may both drift and also suddenly change when the output is re-routed, and the code needs to adapt smoothly to this. | ** Extra clock domains in MediaStreamGraph<br>MediaStreamGraph currently is clocked on the system clock; ongoing work is moving it to be clocked on the audio output clock. This will reduce total delay in MSG. Note that the output clock frequency may both drift and also suddenly change when the output is re-routed, and the code needs to adapt smoothly to this. | ||
** Basic sample rate low<br>Right now, we're clocking everything at 16000Hz; we should be using higher clock rates. | ** Basic sample rate low<br>Right now, we're clocking everything at 16000Hz; we should be using higher clock rates. | ||
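The 0.23% figure and the resulting buffer buildup in the sampling-rate items above can be checked with a little arithmetic. The following is a minimal sketch, not code from the tree: it assumes the device really delivers 44100 samples per second while the consumer treats the stream as 44000Hz, and every function name, the gain and the clamp are hypothetical values chosen for illustration. The last function shows one simple way a resampling ratio could be nudged to drain a growing buffer, in the spirit of the long-term drift item.

```python
def drift_fraction(actual_hz, assumed_hz):
    """Fractional rate error when audio produced at actual_hz is treated as assumed_hz."""
    return (actual_hz - assumed_hz) / actual_hz


def buildup_ms(actual_hz, assumed_hz, elapsed_s):
    """Extra audio queued after elapsed_s seconds, expressed in milliseconds.

    Assumption: the producer writes actual_hz samples per second and the
    consumer only drains assumed_hz samples per second.
    """
    surplus_samples = (actual_hz - assumed_hz) * elapsed_s
    return 1000.0 * surplus_samples / assumed_hz


def resample_ratio(source_hz, target_hz):
    """Output samples produced per input sample for a single resampling step."""
    return target_hz / source_hz


def adjusted_ratio(nominal_ratio, buffered_ms, target_ms, gain=1e-5, max_adjust=0.005):
    """Proportional correction of the resampling ratio (hypothetical controller).

    If more input is buffered than wanted, the ratio is lowered slightly so
    that each output sample consumes a little more input and the queue drains.
    The correction is clamped so the pitch change stays small.
    """
    error_ms = buffered_ms - target_ms
    adjust = max(-max_adjust, min(max_adjust, gain * error_ms))
    return nominal_ratio * (1.0 - adjust)


if __name__ == "__main__":
    print("drift: %.2f%%" % (100 * drift_fraction(44100, 44000)))
    print("buildup after 60 s: %.1f ms" % buildup_ms(44100, 44000, 60))
    nominal = resample_ratio(44100, 16000)
    print("nominal 44100->16000 ratio:", nominal)
    print("ratio with 120 ms queued, 20 ms wanted:", adjusted_ratio(nominal, 120.0, 20.0))
```

Under these assumptions the mismatch adds roughly 136 ms of delay per minute, which is why fixing the 44/44.1 ratio at the single resampling point matters far more than the slow residual drift a controller like the one above would have to absorb.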
| Line 15: | Line 16: | ||
* Need to add a TrackUnion to streams output from PeerConnection | * Need to add a TrackUnion to streams output from PeerConnection | ||
** We can get persistent delay if the output of a PeerConnection gets blocked. The patch for this has been r-'d and needs a re-design. | ** We can get persistent delay if the output of a PeerConnection gets blocked. The patch for this has been r-'d and needs a re-design. | ||
* AEC location & quality | * '''AEC location & quality''' | ||
** The AEC should be in getUserMedia() to have the option of cancelling audio from multiple PeerConnections (so A doesn't hear the echo of B (and vice versa) when both are talking to C). Also this will allow other audio from the browser to be cancelled. Currently the AEC only cancels audio in the same PeerConnection. | ** The AEC should be in getUserMedia() to have the option of cancelling audio from multiple PeerConnections (so A doesn't hear the echo of B (and vice versa) when both are talking to C). Also this will allow other audio from the browser to be cancelled. Currently the AEC only cancels audio in the same PeerConnection. | ||
*** Google apparently has not moved the AEC yet either. | |||
** The pre-AEC resampler could be higher quality (currently it's linear) | ** The pre-AEC resampler could be higher quality (currently it's linear) | ||
* Dynamic input/output changes | *** Google tells us the pre-AEC resampler quality doesn't matter much as it works only on the far-end sound, and in practice causes little cancellation-quality loss. | ||
* '''Dynamic input/output changes''' | |||
** We need to support hot-(un)plugging headsets, preferably without requiring the headset to be unplugged to send audio to speakers/etc | ** We need to support hot-(un)plugging headsets, preferably without requiring the headset to be unplugged to send audio to speakers/etc | ||
** We need to support audio output routing (at least to support "ringing" from main speakers while in-call/video/etc audio goes to headset). | ** We need to support audio output routing (at least to support "ringing" from main speakers while in-call/video/etc audio goes to headset). | ||
* Investigate any remaining latency issues | * '''Investigate any remaining latency issues''' | ||
** Identifying any additional issues ASAP is critical | ** Identifying any additional issues ASAP is critical | ||
* Find some way to test audio quality and delay in automated testing | * '''Find some way to test audio quality and delay in automated testing''' | ||
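One possible approach to the automated delay test in the last item is to play a known signal, capture what comes back, and find the lag that maximizes the cross-correlation between the two. The sketch below only illustrates that idea in plain Python on synthetic data; it is not an existing test harness, and the function names, the chirp parameters and the 16000Hz rate are assumptions made for the example.

```python
import math


def make_chirp(num_samples, rate_hz, f0=300.0, f1=3000.0):
    """Linear chirp from f0 to f1 Hz; its autocorrelation has a sharp single peak."""
    duration = num_samples / rate_hz
    sweep = (f1 - f0) / duration
    samples = []
    for i in range(num_samples):
        t = i / rate_hz
        samples.append(math.sin(2 * math.pi * (f0 * t + 0.5 * sweep * t * t)))
    return samples


def estimate_delay_samples(reference, captured, max_lag):
    """Return the lag (in samples) at which captured best matches reference.

    Brute-force cross-correlation: for each candidate lag, sum the products of
    the reference with the captured signal shifted by that lag.
    """
    best_lag = 0
    best_score = float("-inf")
    for lag in range(max_lag + 1):
        overlap = min(len(reference), len(captured) - lag)
        if overlap <= 0:
            break
        score = sum(reference[i] * captured[lag + i] for i in range(overlap))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


if __name__ == "__main__":
    rate_hz = 16000
    reference = make_chirp(400, rate_hz)
    # Simulated capture: 37 samples of silence, an attenuated copy, then silence.
    captured = [0.0] * 37 + [0.5 * s for s in reference] + [0.0] * 100
    lag = estimate_delay_samples(reference, captured, max_lag=100)
    print("estimated delay: %d samples (%.4f ms)" % (lag, 1000.0 * lag / rate_hz))
```

A real test would need to tolerate noise, resampling and echo-cancellation artifacts, so the peak would be less clean than in this synthetic case; a quality metric would need a separate measurement.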