Boot2Gecko Accessibility, AKA B2AW (Boot 2 Accessible Web)

This is a collection of ideas and requirements for the B2G project from the accessibility standpoint.

Output via Speech

Speech output could be implemented in one of three ways: use a freely redistributable TTS engine library directly, calling its API and exposing a reasonable abstraction to JavaScript through ctypes/XPCOM; write an engine ourselves in JavaScript; or compile an existing library with Emscripten.
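Whichever engine we pick, the JS-facing abstraction could look something like the sketch below. This is a minimal sketch only, assuming a hypothetical native backend (the `nativeSpeak`/`nativeStop` names are illustrative placeholders, not a real ctypes/XPCOM interface); the point is that utterances are queued and spoken one at a time, with a per-utterance language tag for multilingual support.

```javascript
// Minimal sketch of a speech-output abstraction for JS consumers.
// `backend` stands in for whatever wraps the native TTS library.
class SpeechService {
  constructor(backend) {
    this.backend = backend;
    this.queue = [];
    this.speaking = false;
  }

  // Queue an utterance; the language tag supports multilingual engines.
  speak(text, { lang = "en" } = {}) {
    this.queue.push({ text, lang });
    this.pump();
  }

  // Speak the next queued utterance, if the engine is idle.
  pump() {
    if (this.speaking || this.queue.length === 0) return;
    this.speaking = true;
    const { text, lang } = this.queue.shift();
    // The backend invokes the callback when the utterance finishes.
    this.backend.nativeSpeak(text, lang, () => {
      this.speaking = false;
      this.pump();
    });
  }

  // Flush the queue and silence the engine immediately.
  stop() {
    this.queue.length = 0;
    this.speaking = false;
    this.backend.nativeStop();
  }
}
```

The queue matters for a screen reader: rapid navigation generates many utterances, and a `stop()` that flushes everything is what lets a new gesture interrupt stale speech.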

The things to consider are:

  • Multilingual support. We don't want something that speaks only English.
  • Speech quality: We don't want something that sounds like it was built on a Votrax chip in the early 1980s.

See also bug 525444.

Speech recognition

One option is a solution that, like Nuance's products, sends audio to a server to be recognized, since speech recognition requires processing power that is not really available on mobile devices today. Speech recognition consumes considerably more CPU power than TTS.

What are the computational demands? The current top-of-the-line mobile chipsets offer on the order of 10 Gflops of compute power and 1 GB/s of memory bandwidth. That's pretty serious. By the time B2G is further along, those numbers will be higher. There is of course a power-consumption tradeoff to be made, but we shouldn't rule out "client side" recognition a priori. If this is a research problem, we have contacts in academia who would almost certainly be interested.
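For the server-based option, the client-side work reduces to capturing PCM audio and streaming it up in frames. A minimal sketch of the framing step, with an assumed 16 kHz mono capture and an illustrative one-second frame size (the real protocol and endpoint would depend on the recognition service):

```javascript
// Assumed capture format: 16 kHz mono PCM. Frame size is illustrative.
const FRAME_SAMPLES = 16000; // one second of audio per frame

// Split a captured PCM buffer into fixed-size frames; each frame would
// then be POSTed (or sent over a socket) to the recognition server.
function chunkAudio(samples, frameSize = FRAME_SAMPLES) {
  const frames = [];
  for (let i = 0; i < samples.length; i += frameSize) {
    frames.push(samples.subarray(i, Math.min(i + frameSize, samples.length)));
  }
  return frames;
}
```

Streaming fixed frames rather than the whole utterance keeps memory bounded on the device and lets the server start recognizing before the user stops speaking.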


Touchscreen Gestures

Implement a sophisticated gesture model that allows blind users to do virtually everything without requiring a physical keyboard:

  • Touch something: Speak what's under the finger.
  • Double tap the last touched item: Activate it if possible (e.g. follow a link, set focus to an input, etc.)
  • Swipe left and right: Move from item to item. Not just tabbable ones, but everything, so the page can be explored element by element.
  • Two finger dial gesture: Set the rotor (see next gesture) to a certain element type. Element types can be links, headings, landmarks, form elements, lists, graphics, tables, and others we think are useful. Turning the dial counterclockwise moves the selection in the opposite direction.
  • Swipe up and down: Move to the previous or next element of the given type, respectively.
  • Two finger swipe up: Read from beginning of page to currently touched element.
  • Two finger swipe down: Read continuously from the current location to the end of the page.
  • Three finger swipe down/up: Move to the next/previous visible part of the page (i.e. what fits on the touchscreen).
  • Tap with three fingers: Say which page, of how many, one is on. This gives the user an idea of how long the page actually is.
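The gesture set above boils down to dispatching on finger count, tap count, and swipe direction. A sketch of that dispatch, with action names and the raw input shape as illustrative assumptions (a real recognizer would also need thresholds and timing):

```javascript
// Classify a raw gesture into one of the actions listed above.
// Input shape is an assumption: finger count, total movement deltas
// (dx > 0 is rightward, dy > 0 is downward), and tap count.
function classifyGesture({ fingers, dx = 0, dy = 0, taps = 0 }) {
  const moved = dx !== 0 || dy !== 0;
  const horizontal = Math.abs(dx) > Math.abs(dy);

  if (fingers === 1) {
    if (!moved) return taps === 2 ? "activate" : "speakUnderFinger";
    if (horizontal) return dx > 0 ? "nextItem" : "previousItem";
    return dy > 0 ? "nextOfType" : "previousOfType"; // rotor-selected type
  }
  if (fingers === 2 && moved && !horizontal) {
    return dy > 0 ? "readToEnd" : "readFromTop";
  }
  if (fingers === 3) {
    if (!moved) return "announcePosition"; // which page of how many
    if (!horizontal) return dy > 0 ? "nextScreen" : "previousScreen";
  }
  return null; // unrecognized; let the event through
}
```

The two-finger dial/rotor gesture is deliberately omitted here, since it needs rotation tracking rather than a single delta.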

Physical Keyboard input

This could be tailored to what NVDA does: in browse situations, allow the user to jump to headings, landmarks, links, form fields, graphics, and other element types. When inside an editable field, keystrokes type into the field. Make it transparent to users whether they're browsing or filling out a form (the Orca/VoiceOver transparency model).
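The transparency model can be sketched as a single routing decision per keypress: if focus is in an editable field, let the key through; otherwise treat it as a quick-navigation command. The key bindings below are illustrative (loosely NVDA-like), not a fixed spec:

```javascript
// Illustrative NVDA-style quick-navigation bindings.
const QUICK_NAV_KEYS = {
  h: "heading",
  d: "landmark",
  k: "link",
  f: "formField",
  g: "graphic",
};

// Decide what a keypress does in browse mode. Returns the element type
// to jump to, or null to let the key through unmodified (i.e. the user
// is typing into an editable field, or the key is unbound).
function quickNavTarget(key, focusIsEditable) {
  if (focusIsEditable) return null; // transparent form filling
  return QUICK_NAV_KEYS[key] || null;
}
```

The point of keeping this a pure decision function is that the user never has to toggle a mode by hand: the same keystroke does the right thing based on where focus is.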

Alternative Input

We'll need to somehow enable the creation of mouse and keyboard events from alternative input devices such as switches. We'd want to enable something like Tekla switch input (aside: davidb knows the Tekla people).
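One way to frame this: switch hardware delivers abstract activations, and a small mapping layer turns each activation into a synthetic key event descriptor to dispatch into the page. The switch-to-key mapping below is an assumption for illustration; a real implementation would dispatch a `KeyboardEvent` on the focused element.

```javascript
// Illustrative mapping from switch activations to keys: one switch
// scans (Tab), the other selects (Enter).
const SWITCH_MAP = { switch1: "Tab", switch2: "Enter" };

// Build a synthetic key event descriptor for a switch activation, or
// null for an unknown switch. In Gecko this descriptor would be turned
// into a real KeyboardEvent and dispatched on the focused element.
function keyEventForSwitch(switchId) {
  const key = SWITCH_MAP[switchId];
  if (!key) return null;
  return { type: "keydown", key };
}
```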


Zoom

Provide zoom features for low-vision users.
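One plausible approach is a content zoom built on CSS transforms; the clamping limits here are illustrative assumptions, not requirements:

```javascript
// Illustrative zoom bounds.
const MIN_ZOOM = 1.0;
const MAX_ZOOM = 5.0;

// Produce the inline style for a given zoom factor, clamped to bounds.
// Anchoring the origin at the top-left keeps panning math simple.
function zoomStyle(factor) {
  const f = Math.min(MAX_ZOOM, Math.max(MIN_ZOOM, factor));
  return `transform: scale(${f}); transform-origin: 0 0;`;
}
```

Whatever mechanism is chosen, zoom would need to compose with the gesture model above, e.g. reserving distinct gestures for pan and magnify.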