Accessibility/HTML5 captions


Specification of the itext element (First Version)

The itext element

Categories

   Metadata content.
   Flow content.
   Phrasing content.

Contexts in which this element may be used:

   In a video or audio element that is a child of a body element.

Content model:

   Empty.

Content attributes:

   Global attributes (including id and style)
   category
   src
   lang
   type
   charset
   display

DOM interface:

   [Callable=namedItem]
   interface HTMLItextElement : HTMLElement {
              attribute DOMString category;
              attribute DOMString src;
              attribute DOMString lang;
              attribute DOMString type;
              attribute DOMString charset;
              attribute DOMString display;  // "yes", "no" or "auto"
     readonly attribute boolean fetched;
     readonly attribute boolean enabled;
     readonly attribute ItextError error;
     readonly attribute float delay;
     readonly attribute HTMLCollection allText;
     readonly attribute DOMString langName;
     void fetch();
     DOMString currentText(in float currentTime);
     void enable();
     void disable();
     void delay(in float seconds);
   };


The itext element allows authors to link to an external file that contains informative text about the parent audio or video resource. The external resource is expected to consist of a sequence of time intervals with associated text and, potentially, layout, styling, and animation information for the text. Each text is displayed while the playback position of the parent audio or video element lies within the text's time interval, i.e. the parent's currentTime has reached the start time of the interval but has not yet reached its end time (a semi-open interval: [start,end) ).
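
The semi-open interval rule can be sketched in JavaScript as follows. This is only an illustration: the cue object shape ({ start, end, text }) and the function name isActive are assumptions made for the example and are not part of the itext proposal.

```javascript
// Hypothetical cue shape: { start, end, text }, with times in seconds.
function isActive(cue, currentTime) {
  // Semi-open interval [start, end): the start time is included,
  // the end time is excluded.
  return currentTime >= cue.start && currentTime < cue.end;
}

const exampleCue = { start: 1.0, end: 4.0, text: "Hello" };
console.log(isActive(exampleCue, 1.0)); // true
console.log(isActive(exampleCue, 4.0)); // false
```

Excluding the end time means that two back-to-back intervals, such as [1,4) and [4,6), never show both texts at once.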

The category attribute describes the function of the informative text and can be one of the following:

  • CC: closed captions (for the deaf)
  • SUB: subtitles (for i18n)
  • TAD: textual audio descriptions (for the blind; to be used as braille or through TTS)
  • KTV: karaoke
  • TIK: ticker text
  • AR: active regions
  • NB: semantic annotations, including speech bubbles and director comments
  • META: metadata, mostly machine-readable
  • TRX: transcripts / scripts
  • LRC: lyrics
  • LIN: linguistic markup
  • CUE: cue points, DVD-style chapter markers and similar navigational landmarks

The src attribute gives the address of the external itext resource to use. The value of the attribute must be a valid URL identifying a text resource of the type given by the type attribute, if the attribute is present, or of the type "text/srt", if the attribute is absent.

NOTE: text/srt will need to be registered as a MIME type, and the format itself will need to be standardised.

The type attribute gives the format of the data as a MIME type. If the attribute is present, its value must be a valid MIME type, optionally with parameters. The charset parameter must not be specified. The default, which is used if the attribute is absent, is "text/srt". [RFC2046]
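
A minimal sketch of how the type rules might be applied in script is given below. The helper name resolveItextType is hypothetical, and throwing an error when a charset parameter is present is a choice made for this example; the proposal only states that the parameter must not be specified.

```javascript
// Hypothetical helper: resolve the effective MIME type of an itext element.
// typeAttr is the raw value of the type attribute, or null if it is absent.
function resolveItextType(typeAttr) {
  if (typeAttr === null || typeAttr === undefined) {
    return "text/srt"; // default used when the attribute is absent
  }
  const [essence, ...params] = typeAttr.split(";").map((part) => part.trim());
  // A charset parameter is not allowed on the type attribute.
  if (params.some((p) => p.toLowerCase().startsWith("charset="))) {
    throw new Error("type attribute must not carry a charset parameter");
  }
  return params.length > 0 ? `${essence}; ${params.join("; ")}` : essence;
}

console.log(resolveItextType(null));       // text/srt
console.log(resolveItextType("text/srt")); // text/srt
```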

The lang attribute, if present, gives the language of the linked resource. The value must be a valid RFC 3066 language code. [RFC3066] User agents use this attribute to select between itext elements, e.g. between all itext elements of a video or audio element that belong to the same category but represent different languages. A user agent that discovers, upon fetching the resource, that the language information associated with the resource differs from the given lang sets an error code on the element.
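
Language-based selection within one category could look roughly like the sketch below. Plain objects stand in for itext elements, the file names are invented, and the matching rule (exact match first, then a match on the primary language subtag) is an assumption; the proposal does not define how languages are compared.

```javascript
// Hypothetical selection among itext-like objects of the same category.
function selectByLang(itexts, category, preferredLang) {
  const wanted = preferredLang.toLowerCase();
  const primary = wanted.split("-")[0];
  const candidates = itexts.filter(
    (t) => t.category === category && typeof t.lang === "string"
  );
  // Assumed rule 1: prefer an exact, case-insensitive match.
  const exact = candidates.find((t) => t.lang.toLowerCase() === wanted);
  if (exact) return exact;
  // Assumed rule 2: fall back to a match on the primary subtag ("en" in "en-GB").
  const loose = candidates.find(
    (t) => t.lang.toLowerCase().split("-")[0] === primary
  );
  return loose || null;
}

const tracks = [
  { category: "SUB", lang: "de", src: "subs.de.srt" },
  { category: "SUB", lang: "en-GB", src: "subs.en.srt" },
  { category: "CC", lang: "en", src: "cc.en.srt" },
];
console.log(selectByLang(tracks, "SUB", "en-US").src); // subs.en.srt
```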

The charset attribute gives the character encoding of the external text resource. If the attribute is set, its value must be a valid character encoding name, must be the preferred name for that encoding, and must match the encoding given in the charset parameter of the Content-Type metadata of the external file, if any. [IANACHARSET]

The display attribute lets an author specify whether an itext element is displayed by default, not displayed by default, or displayed automatically with its parent audio or video element when a language condition is met. The allowed values are "yes", "no", and "auto"; the default value is "no". A user agent that encounters an itext element with display set to "auto" must activate it by default if its lang attribute matches the browser's default language setting.


Further itext functionality:

1. Itext fetching

An itext resource is not automatically fetched as the element is parsed, since there may be a sizeable number of external resources to retrieve for an individual video or audio element. It is only fetched under the following circumstances:

  • if the display attribute is set to "yes", or
  • if the display attribute is set to "auto" and the lang attribute matches the browser's default language setting, or
  • if the fetch() function has been called on the itext element.

Fetching an itext resource means following the src URL and retrieving the resource. Fetching the external resource must not delay the load event of the element's document. The user agent will work with the fetched itext resource as soon as it is retrieved.
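
The three fetch conditions listed above can be summarised as a small decision function. This is a sketch under stated assumptions: the function name shouldFetch, the fetchCalled flag, and the case-insensitive exact comparison of language codes are all invented for the example.

```javascript
// Hypothetical decision function for the fetch conditions.
// itext: object with display ("yes" | "no" | "auto") and lang properties.
// browserLang: the browser's default language setting.
// fetchCalled: whether fetch() has been called on the element.
function shouldFetch(itext, browserLang, fetchCalled) {
  if (fetchCalled) return true;
  const display = itext.display || "no"; // "no" is the default value
  if (display === "yes") return true;
  if (display === "auto") {
    // Assumption: languages match when they are equal, ignoring case.
    return (
      typeof itext.lang === "string" &&
      itext.lang.toLowerCase() === browserLang.toLowerCase()
    );
  }
  return false;
}

console.log(shouldFetch({ display: "auto", lang: "en" }, "en", false)); // true
console.log(shouldFetch({ display: "auto", lang: "de" }, "en", false)); // false
```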

2. Itext display

An enabled itext resource displays its content on screen.

An itext resource that has been fetched because of the display attribute is enabled by default.

An itext resource that has been fetched using the fetch() function is disabled.

An itext resource that is currently enabled can be disabled using the disable() function. An itext resource that is currently disabled can be enabled using the enable() function.
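
The enabled and disabled states described above can be modelled with a small state object. The class name ItextState and the method fetchForDisplay are hypothetical. The sketch also assumes that a repeated fetch() leaves the state unchanged and that enable() has no effect before the resource is fetched; the proposal does not cover either case.

```javascript
// Hypothetical model of the fetched/enabled flags of an itext element.
class ItextState {
  constructor() {
    this.fetched = false;
    this.enabled = false;
  }

  // Fetch triggered by the display attribute: enabled by default.
  fetchForDisplay() {
    this.fetched = true;
    this.enabled = true;
  }

  // Fetch triggered by an explicit fetch() call: disabled by default.
  fetch() {
    if (this.fetched) return; // assumption: a second fetch changes nothing
    this.fetched = true;
    this.enabled = false;
  }

  enable() {
    if (this.fetched) this.enabled = true; // assumption: needs fetched data
  }

  disable() {
    this.enabled = false;
  }
}

const state = new ItextState();
state.fetch();
console.log(state.enabled); // false
state.enable();
console.log(state.enabled); // true
```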

3. Itext delay

The itext resource is synchronised to its parent audio or video element through the parent's currentTime attribute. Sometimes, synchronisation can be off.

The delay(seconds) function provides a method to offset the currentTime by a positive or negative float value to fix synchronisation. The readonly delay attribute defaults to 0 and is updated through calls to the delay(seconds) function.
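
One possible reading of the delay mechanism is sketched below. The sign convention (the delay is added to currentTime before the text lookup) and the choice that each call replaces the previous value, instead of accumulating, are assumptions; the proposal leaves both open. The class name ItextClock and its method names are hypothetical.

```javascript
// Hypothetical helper that applies a delay to the parent's currentTime.
class ItextClock {
  constructor() {
    this.delay = 0; // defaults to 0
  }

  // Stands in for delay(seconds); assumption: the new value replaces the old one.
  setDelay(seconds) {
    this.delay = seconds;
  }

  // Assumption: the time used for text lookup is currentTime + delay.
  lookupTime(currentTime) {
    return currentTime + this.delay;
  }
}

const clock = new ItextClock();
console.log(clock.lookupTime(9.5)); // 9.5
clock.setDelay(0.5);
console.log(clock.lookupTime(9.5)); // 10
```

Under this convention a positive delay makes each text appear earlier relative to the media, and a negative delay makes it appear later.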

4. Itext text extraction

The currentText(currentTime) function returns the current text segment from the itext resource, i.e. the text that is active at the parent's currentTime attribute value.

The allText attribute allows access to all the text segments as extracted from the itext resource.

The langName attribute exposes the full language name for the itext resource, so that a JavaScript developer can display it in a menu.
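
To illustrate text extraction, the sketch below parses a resource in the default text/srt format into a list of segments (a stand-in for allText) and looks up the segment that is active at a given time (a stand-in for currentText). The function names parseSrt and currentText, the segment shape, and the choice to return an empty string when no text is active are assumptions made for this example.

```javascript
// Convert an SRT timestamp such as "00:00:05,500" to seconds.
function parseTimestamp(ts) {
  const m = /^(\d+):(\d{2}):(\d{2})[,.](\d{3})$/.exec(ts.trim());
  if (!m) throw new Error("bad timestamp: " + ts);
  return Number(m[1]) * 3600 + Number(m[2]) * 60 + Number(m[3]) + Number(m[4]) / 1000;
}

// Parse SRT source into an array of { start, end, text } segments.
function parseSrt(source) {
  return source
    .replace(/\r\n?/g, "\n") // normalise line endings
    .trim()
    .split(/\n{2,}/)         // segments are separated by blank lines
    .map((block) => {
      const lines = block.split("\n");
      // lines[0] is the counter, lines[1] the timing line, the rest is text.
      if (lines.length < 2 || !lines[1].includes("-->")) {
        throw new Error("malformed segment: " + block);
      }
      const [startTs, endTs] = lines[1].split("-->");
      return {
        start: parseTimestamp(startTs),
        end: parseTimestamp(endTs),
        text: lines.slice(2).join("\n"),
      };
    });
}

// Return the text active at the given time, using the [start, end) rule.
function currentText(segments, time) {
  const hit = segments.find((s) => time >= s.start && time < s.end);
  return hit ? hit.text : ""; // assumption: empty string when nothing is active
}

const srt = [
  "1",
  "00:00:01,000 --> 00:00:04,000",
  "Hello",
  "",
  "2",
  "00:00:05,500 --> 00:00:07,000",
  "Goodbye",
].join("\n");

const allText = parseSrt(srt);
console.log(allText.length);          // 2
console.log(currentText(allText, 2)); // Hello
console.log(currentText(allText, 4)); // (empty string)
```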

5. Itext errors

The error attribute contains the most recent error that has occurred in relation to the itext resource.

interface ItextError {

 const unsigned short ITEXT_ERR_ABORTED = 1; // fetching aborted
 const unsigned short ITEXT_ERR_NETWORK = 2; // network error
 const unsigned short ITEXT_ERR_PARSE = 3;   // parsing error of itext resource
 const unsigned short ITEXT_ERR_SRC_NOT_SUPPORTED = 4; // unsuitable itext resource
 const unsigned short ITEXT_ERR_LANG = 5;    // language mismatch
 readonly attribute unsigned short code;

};
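
A script could turn these error codes into readable messages along the lines of the sketch below. The constants mirror the interface above; the function name describeItextError is hypothetical, and treating a null error attribute as "no error" is an assumption, since the proposal does not say what the attribute holds before any error occurs.

```javascript
// Constants mirroring the ItextError interface.
const ItextError = Object.freeze({
  ITEXT_ERR_ABORTED: 1,
  ITEXT_ERR_NETWORK: 2,
  ITEXT_ERR_PARSE: 3,
  ITEXT_ERR_SRC_NOT_SUPPORTED: 4,
  ITEXT_ERR_LANG: 5,
});

// Hypothetical helper: map an error object to a short description.
function describeItextError(error) {
  if (!error) return "no error"; // assumption: null means no error so far
  switch (error.code) {
    case ItextError.ITEXT_ERR_ABORTED:
      return "fetching aborted";
    case ItextError.ITEXT_ERR_NETWORK:
      return "network error";
    case ItextError.ITEXT_ERR_PARSE:
      return "parsing error of itext resource";
    case ItextError.ITEXT_ERR_SRC_NOT_SUPPORTED:
      return "unsuitable itext resource";
    case ItextError.ITEXT_ERR_LANG:
      return "language mismatch";
    default:
      return "unknown error code " + error.code;
  }
}

console.log(describeItextError({ code: 5 })); // language mismatch
console.log(describeItextError(null));        // no error
```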