| Line 2: | Line 2: | ||
== Introduction == | == Introduction == | ||
This effort aims at building tools and | This effort aims at building tools and a framework for analyzing audio and video | ||
quality for the Firefox WebRTC implementation. WebRTC involves peer-to-peer rich multimedia | quality for the Firefox WebRTC implementation. WebRTC involves peer-to-peer rich multimedia | ||
communications across a variety of end-user devices and network conditions. The | communications across a variety of end-user devices and network conditions. The | ||
| Line 11: | Line 11: | ||
== Background == | == Background == | ||
=== Typical WebRTC Media Pipeline=== | === Typical WebRTC Media Pipeline=== | ||
The picture below captures the various components involved in the flow of | <p>The picture below captures the various components involved in the flow of | ||
media captured from the mic/camera until it is transported. The reverse direction | media captured from the mic/camera until it is transported. The reverse direction | ||
follows a similar path back until the RTP packets are delivered as raw media for | follows a similar path back until the RTP packets are delivered as raw media for | ||
rendering. | rendering.</p> | ||
[[File:Firefox_WebRTC_Pipeline.png|650px]] | [[File:Firefox_WebRTC_Pipeline.png|650px]] | ||
With several moving components in the pipeline, it becomes necessary to analyze | <p>With several moving components in the pipeline, it becomes necessary to analyze | ||
the impact these might have on the overall quality of the media being transmitted | the impact these might have on the overall quality of the media being transmitted | ||
or rendered. For instance, the highlighted parts of the pipeline (marked with a star) | or rendered. For instance, the highlighted parts of the pipeline (marked with a star) | ||
have the potential to induce latency and impact the quality of the encoded media. Thus, | have the potential to induce latency and impact the quality of the encoded media. Thus, | ||
being able to measure, analyze and account for these | being able to measure, analyze and account for these impacts can help | ||
improve the performance of the Firefox WebRTC implementation.</p> | |||
Note that the pipeline doesn't capture impacts | <p>Note that the pipeline in the diagram doesn't capture impacts induced by | ||
network bandwidth, latency and congestion scenarios. | the network bandwidth, latency and congestion scenarios. </p> | ||
=== Scope === | === Scope === | ||
The following is a wishlist of functionality that the framework must be able to support: | The following is a wishlist of functionality that the framework must be able to support: | ||
# Audio and Video Quality Analysis | # Audio and Video Quality Analysis | ||
| Line 40: | Line 39: | ||
#Codec Configuration Variability Analysis | #Codec Configuration Variability Analysis | ||
## Sample Rate, Input and Output Channels, Reverse Channels, Echo Cancellation, Gain Control, Noise Suppression, Voice Activity Detection, Level Metrics, Delay, Drift compensation, Echo Metrics | ## Sample Rate, Input and Output Channels, Reverse Channels, Echo Cancellation, Gain Control, Noise Suppression, Voice Activity Detection, Level Metrics, Delay, Drift compensation, Echo Metrics | ||
## Video Frame-rate, | ## Video Frame-rate, Bitrate, Resolution and more | ||
#Hardware and Platform Variability Analysis | #Hardware and Platform Variability Analysis | ||
| Line 55: | Line 54: | ||
[[File:AudioPerf-Setup.png|650px]] | [[File:AudioPerf-Setup.png|650px]] | ||
<p>The idea here is to set up 2 Peer Connections within a single instance of the browser tab to establish a one-way audio call | <p>The idea here is to set up 2 Peer Connections within a single instance of the browser tab to establish a one-way audio call. Finally, PESQ scores are computed between the input audio file fed into the local Peer Connection and | ||
. Finally, PESQ scores are computed between the input audio file fed into the local Peer Connection and | the output audio file recorded at the play-out of the remote Peer Connection in an automated fashion.</p> | ||
the output audio file recorded at the play-out of the remote Peer Connection in a | |||
The following sub-sections explain in detail the steps involved: | <p>The following sub-sections explain in detail the steps involved: </p> | ||
==== Running Browser Based Media Test Automatically ==== | ==== Running Browser Based Media Test Automatically ==== | ||
<p>[[Talos]] is Mozilla's Python performance testing framework that is usable on Windows, Mac and Linux. | <p>[[Talos]] is Mozilla's Python performance testing framework that is usable on Windows, Mac and Linux. | ||
| Line 65: | Line 63: | ||
<p>Talos is used in our setup to run media tests along with other start-up and page-loader performance tests.</p> | <p>Talos is used in our setup to run media tests along with other start-up and page-loader performance tests.</p> | ||
==== Feeding output of <audio> to the Peer Connection ==== | ==== Feeding output of <audio> to the Peer Connection ==== | ||
<p>Once we have the framework figured out to run automated media-tests, the next step is deciding how to insert | <p>Once we have the framework figured out to run automated media-tests, the next step is deciding how to insert the input audio file into the WebRTC Peer Connection.</p> | ||
the input audio file into the WebRTC Peer Connection.</p> | <p>For this purpose, Mozilla's MediaStreamProcessing API '''mozCaptureStreamUntilEnded''' enables the <audio> element to produce a MediaStream that consists of whatever the <audio> element is playing. The stream produced by the <audio> element in this fashion thus takes the place of a media stream obtained via the WebRTC GetUserMedia API.</p> | ||
<p>For this purpose, Mozilla's MediaStreamProcessing API '''mozCaptureStreamUntilEnded''' enables the <audio> element to produce a MediaStream that consists of whatever the <audio> element is playing. The stream produced by the <audio> element in this fashion thus takes the place of a media stream obtained via the WebRTC GetUserMedia API. | <p>Finally, the generated MediaStream is added to the local Peer Connection element via the addStream() API as shown below.</p> | ||
the generated MediaStream is added to the local Peer Connection element via the addStream() API as shown | |||
below.</p> | |||
<code> | <code> | ||
// localAudio is an <audio> , localPC is a PeerConnection Object | // localAudio is an <audio> , localPC is a PeerConnection Object | ||
| Line 83: | Line 79: | ||
*A way of feeding audio from an input audio file to the Peer Connection without having to use the GetUserMedia() API | *A way of feeding audio from an input audio file to the Peer Connection without having to use the GetUserMedia() API | ||
</p> | </p> | ||
<p> This sub-section explains various tools used to choose the right audio sink, record the audio played out and compute | <p> This sub-section explains various tools used to choose the right audio sink, record the audio played out and compute the PESQ scores </p> | ||
PESQ scores </p> | |||
#PulseAudio's pactl tool is used to find the right sink to play out the audio produced by the remote Peer Connection. | #PulseAudio's pactl tool is used to find the right sink to play out the audio produced by the remote Peer Connection. | ||
#PulseAudio's parec tool records mono channel audio played out at the sink in signed little-endian format at 16000 samples/sec. | #PulseAudio's parec tool records mono channel audio played out at the sink in signed little-endian format at 16000 samples/sec. | ||
#The output from the parec tool is fed into the SoX tool to generate a .WAV version of the recorded audio and to trim silence | #The output from the parec tool is fed into the SoX tool to generate a .WAV version of the recorded audio and to trim silence at the beginning and end of the recorded audio file. | ||
<br> <br> | |||
<code> | <code> | ||
Command: | Command: | ||
| Line 95: | Line 90: | ||
</code> | </code> | ||
<br> | <br> | ||
The recording is timed to match the length of the input audio file using SoX's trim effect. This enables the recorder | The recording is timed to match the length of the input audio file using SoX's trim effect. This enables the recorder process to complete automatically when the SoX trim timeout, determined by the <record-duration>, expires. | ||
process to complete when the SoX trim timeout, determined by the <record-duration>, expires. | |||
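The recording step described above could be driven from Python, the language of the Talos harness. The following is only a minimal sketch under stated assumptions, not the command line used by this setup: the helper names `wav_duration_seconds` and `build_record_commands` are hypothetical, a 16-bit sample width is assumed (the text only specifies signed little-endian at 16000 samples/sec), the sink monitor name must come from pactl, and the silence-trimming effects are left out.

```python
import wave


def wav_duration_seconds(path):
    """Return the length of a WAV file in seconds (frames / sample rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / float(wav.getframerate())


def build_record_commands(sink_monitor, duration_s, out_wav, rate=16000):
    """Build argv lists for a `parec | sox` pipeline; nothing is executed here."""
    parec_cmd = [
        "parec",
        "--device=%s" % sink_monitor,  # monitor source of the chosen sink
        "--format=s16le",              # assumption: signed 16-bit little-endian
        "--rate=%d" % rate,
        "--channels=1",                # mono
    ]
    sox_cmd = [
        "sox",
        # Describe the raw PCM arriving on stdin ("-").
        "-t", "raw", "-r", str(rate), "-e", "signed-integer",
        "-b", "16", "-c", "1", "-L", "-",
        out_wav,
        # Stop recording once the input file's length has elapsed.
        "trim", "0", "%.2f" % duration_s,
    ]
    return parec_cmd, sox_cmd
```

To actually record, the two lists could be connected with `subprocess.Popen(parec_cmd, stdout=subprocess.PIPE)` feeding `subprocess.call(sox_cmd, stdin=parec.stdout)`, with `duration_s` taken from `wav_duration_seconds()` on the input audio file.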
#Finally PESQ is used to compute the quality scores between the input audio file (original) and the output audio file (recorded) <br> <br> | #Finally PESQ is used to compute the quality scores between the input audio file (original) and the output audio file (recorded) <br> <br> | ||
<code> | <code> | ||
| Line 104: | Line 98: | ||
==== TODO Feature List ==== | ==== TODO Feature List ==== | ||
Some near-term items to implement/support include: | |||
#Support different audio formats and lengths | #Support different audio formats and lengths | ||
#Provide tool support across platforms. Currently only Linux is supported | #Provide tool support across platforms. Currently only Linux is supported | ||
#Discuss the results generated and | #Discuss the results generated and Datazilla, GraphServer integration | ||
#Allow configuration options to specify sample rates, number of channels and encoding. | #Allow configuration options to specify sample rates, number of channels and encoding. | ||
==Video | ==Next Steps == | ||
# Discuss and fix open issues related to using the Talos Setup/Cleanup framework to better control the audio recording process.
# Discuss what additional tests are required for audio quality and latency analysis.
# Discuss what needs to be done for Video Quality Analysis.
# Extend the framework to simulate constrained network conditions.