Networking/Archive/DASH/Implementation: Difference between revisions

Add info on VP8/Video and Vorbis/Audio decoders' ability to adapt


'''Question: Are these approaches possible? Can a single MediaCoder handle multiple streams? Can a single MediaStream handle changes in bitstream coming from DASH?'''
; VP8/Video
: Chris Pearce: "For VP8, basically yes. In bug 626979 (and a follow-up fix in bug 661456) we implemented support for WebM's track metadata DisplayWidth/DisplayHeight elements. We scale whatever contained video frames we encounter to DisplayWidth x DisplayHeight pixels, so you can change the dimensions of video frames at will while encoding a single track."
: Tim Terriberry: "Resolution is not the same thing as bitrate, but in general yes, you can change both in VP8 without re-initializing the decoder. The one caveat is that if you ''do'' want to switch resolution, the first frame has to be a keyframe. You should also start with a keyframe when changing between streams encoded at different bitrates, or you'll get artifacts caused by prediction mis-matches."
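The caveat above — that a resolution or bitrate switch must land on a keyframe — can be illustrated with a toy sketch. This is not Gecko code; the frame representation and function name are hypothetical, purely to show the switch-point rule.

```python
# Hypothetical sketch: given a list of frames (each a dict with a
# "keyframe" flag), find the earliest safe point at or after the
# desired switch index to change VP8 representations. Per the caveat
# above, switching on a non-keyframe risks prediction-mismatch
# artifacts, so we defer the switch until the next keyframe.

def earliest_safe_switch(frames, desired_index):
    """Return the first index >= desired_index whose frame is a
    keyframe, or None if no keyframe follows."""
    for i in range(desired_index, len(frames)):
        if frames[i]["keyframe"]:
            return i
    return None
```

For example, if the player decides mid-GOP that bandwidth has dropped, it would keep decoding the current representation until `earliest_safe_switch` reports the next keyframe, then switch.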
'''Question: Does the DASH Media Segment definition or media encoding process require that each new segment start with a keyframe?'''
; Vorbis/Audio
: From Chris and Tim, paraphrased: Vorbis is more complicated because of the way it is encoded. It uses two different block sizes, 'long' and 'short', and the choice between them is made by the encoder based on what it judges best for the signal. So if we had two streams encoded at different rates, there is no guarantee that their blocks would line up. This matters because decoding the first half of a block requires the last half of the previous block; mismatched block sizes from disparate streams break that dependency. If we were to change streams, either a pause in audio would occur (due to the decoder being flushed), or we would have to use some kind of extrapolation (e.g. LPC extrapolation).
: To avoid this, we could require that each segment include extra packets to allow correct decoding. Alternatively, we could start by supporting no audio adaptation at all (similar to Apple HLS). Rob O'Callahan is working on a Media Streams API which includes cross-fading; that may be a longer-term solution after starting with non-adaptive audio.
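The overlap dependency described above can be shown with a toy overlap-add model. This is a drastic simplification, not real Vorbis decoding (real Vorbis uses lapped MDCT windows with special long/short transition shapes); the function name and data layout are hypothetical, and it exists only to show why a block cannot be reconstructed without the matching tail of its predecessor.

```python
# Hypothetical sketch of the overlap-add constraint: each decoded
# block's output is formed by adding the previous block's tail to the
# current block's head. If the tail we carry over came from a stream
# with a different block size, the halves don't line up and the block
# cannot be correctly reconstructed -- modelling why switching Vorbis
# streams mid-decode produces a gap or requires extrapolation.

def overlap_add(prev_tail, cur_block):
    """Combine the previous block's second half with the current
    block's first half; return (output samples, new tail)."""
    half = len(cur_block) // 2
    if prev_tail is None or len(prev_tail) != half:
        # Blocks from disparate streams don't line up: decoding fails.
        raise ValueError("block sizes do not line up; cannot overlap-add")
    output = [a + b for a, b in zip(prev_tail, cur_block[:half])]
    return output, cur_block[half:]
```

Feeding the function a tail whose length matches the next block's half-size succeeds; a tail carried over from a differently-sized block raises, which is the point at which a real player would have to flush (audible pause) or extrapolate.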


== MPD ==