Accessibility/Caption Formats

From MozillaWiki

The requirements list calls for the following formats ("text codecs"):

  • a closed caption format
  • a closed subtitle format
  • a textual audio description format
  • potentially a karaoke format
  • a metadata / semantic annotations format
  • a transcript / script / lyrics format

These formats should specify:

  • in format header: the type of text codec they represent
  • in format header: the primary language
  • in format header: default display mechanism
  • in format header: open/closed by default
  • in format body: temporal structure
  • in format body: text & text styling
  • in format body: allow outgoing hyperlinks
  • in format body: allow naming of cue points / sections
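
As a rough illustration of the header and body fields listed above, the following Python sketch models a text codec as a small data structure. It is not taken from any existing format: the class names `TextTrack` and `Cue`, the field names, and the default value `"overlay"` are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cue:
    """One timed entry in the format body (times in seconds)."""
    start: float
    end: float
    text: str
    style: Optional[str] = None  # hypothetical: name of a styling rule
    href: Optional[str] = None   # outgoing hyperlink, if any
    name: Optional[str] = None   # cue point / section name, if any


@dataclass
class TextTrack:
    """Header fields plus the list of cues that form the body."""
    codec_type: str                # e.g. "caption", "subtitle", "karaoke"
    language: str                  # primary language, e.g. "en"
    display: str = "overlay"       # default display mechanism (assumed value)
    open_by_default: bool = False  # closed unless stated otherwise
    cues: List[Cue] = field(default_factory=list)

    def active_at(self, t: float) -> List[Cue]:
        """Return the cues whose interval [start, end) contains time t."""
        return [c for c in self.cues if c.start <= t < c.end]


track = TextTrack(codec_type="caption", language="en")
track.cues.append(Cue(0.0, 2.5, "Hello.", name="intro"))
track.cues.append(Cue(2.5, 5.0, "Welcome back.", href="http://example.com/"))
```

A player could call `track.active_at(t)` on each time update to decide which text to show; whether the text is shown at all would depend on `open_by_default` and the user's settings.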

Caption & Subtitle Formats

Icon for closed captions: CC.jpg
Icon for subtitles (3-letter language): martingay_subtitles.jpg

This is an (incomplete) list of existing caption and subtitle formats:

Previous comparative analysis:


Many subtitle and caption formats are just a sequence of triples: start time, end time, text (e.g. MicroDVD, SubRip). This is the most basic way of providing subtitles, and it lacks important further information such as styling, title and language.
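
To make the "sequence of triples" model concrete, here is a minimal Python sketch that reads SubRip-style text into (start, end, text) tuples. It assumes the common SubRip layout of a counter line, a timing line of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`, and one or more text lines, with blank lines between cues. The function names and the sample data are made up for this example, and real files may need more tolerant handling.

```python
import re
from typing import List, Tuple

# Timing line, e.g. 00:00:01,000 --> 00:00:04,000
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_seconds(h: str, m: str, s: str, ms: str) -> float:
    """Convert the four captured timestamp fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(data: str) -> List[Tuple[float, float, str]]:
    """Return a list of (start, end, text) triples, times in seconds."""
    triples = []
    data = data.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", data):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                text = "\n".join(lines[i + 1:])
                triples.append((to_seconds(*g[:4]), to_seconds(*g[4:]), text))
                break
    return triples


SAMPLE = """1
00:00:01,000 --> 00:00:04,000
Hello, world.

2
00:00:05,500 --> 00:00:07,250
Second line
continues here.
"""

cues = parse_srt(SAMPLE)
```

Note that nothing in the parsed result says what language the text is in or how it should be styled, which is exactly the limitation described above.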

Some are more informative, e.g. Substation Alpha, USF, and SAMI, which introduce styling, metadata, and events with dynamic effects. SAMI and USF are in XML, which has been rejected as too verbose a language for creating subtitles.

TimedText has the most complex model, in particular for styling and layout.

Karaoke Formats

Icon for karaoke:

Comparison: Different Karaoke formats


Many karaoke formats are built around non-text formats, such as MIDI or graphics. The simplest karaoke format is the lyrics format, which attaches timing to text. Advanced Substation Alpha extends captioning for karaoke, but offers nothing fancy. The only format that can carry text and animated images is Kate, which can even draw arbitrary shapes.

Textual Audio Description Formats

Icon for audio descriptions (sound): audiodes.gif

Interestingly, both use the file extension .smi. :-)


There are not many formats that are explicitly targeted at creating textual audio descriptions. These audio descriptions have to be synchronised with quiet sections in the video stream, which makes them somewhat different from captions, i.e. less sparse. SAMI has wide uptake as a caption format in Korea.

Metadata / Semantic Annotations Formats

Icon for metadata / annotations: icon_Subtitles.gif

Dead Web 2.0 sites:


W3C format analysis


There are no cases of RDF annotations attached to video in the wild yet. MPEG-7 as well as RDF for video are mostly research projects. Existing (or formerly existing) sites store user-contributed video annotations in proprietary formats that are not exposed. MPEG-7 and AAF are too complex to serve as a generic annotation language for multimedia.

Transcript / Script / Lyrics Formats

Icon for transcript: Icon transcript.jpg


Transcripts are currently found mostly as web pages or as transcripts with time codes, which makes them caption-like. Where they are caption-like, there are no extra requirements. As for related text files, links to them are already easily presented on Web pages. The lyrics file consists of caption-like text preceded by some core metadata fields, which are header-like information.
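
The "metadata fields first, then caption-like text" layout of a lyrics file can be sketched as follows. This example assumes an LRC-style syntax, with header tags such as `[ti:...]` and timed lines such as `[mm:ss.xx]text`; the page does not spell out the syntax, so that layout, the function name `parse_lyrics`, and the sample data are assumptions for illustration only.

```python
import re
from typing import Dict, List, Tuple

TAG = re.compile(r"^\[([a-zA-Z]+):(.*)\]$")               # e.g. [ti:Song title]
TIMED = re.compile(r"^\[(\d+):(\d{2}(?:\.\d+)?)\](.*)$")  # e.g. [00:12.50]text


def parse_lyrics(data: str) -> Tuple[Dict[str, str], List[Tuple[float, str]]]:
    """Split a lyrics file into header metadata and (time, text) lines."""
    header: Dict[str, str] = {}
    lines: List[Tuple[float, str]] = []
    for raw in data.splitlines():
        raw = raw.strip()
        timed = TIMED.match(raw)
        if timed:
            minutes, seconds, text = timed.groups()
            lines.append((int(minutes) * 60 + float(seconds), text.strip()))
            continue
        tag = TAG.match(raw)
        if tag:
            header[tag.group(1).lower()] = tag.group(2).strip()
    return header, lines


SAMPLE = """[ti:Example Song]
[ar:Example Artist]
[00:01.00]First line
[00:12.50]Second line
"""

header, lines = parse_lyrics(SAMPLE)
```

Each timed line carries only a start time, so a display would typically show a line until the next one begins.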