Accessibility/RichContentKeyboardBehaviour
Summary
What this is all about
This article proposes a model of keyboard navigation through web page content. Keyboard navigation has special meaning for screen reader users, since the keyboard is the primary tool they use to navigate the page. It is therefore very important to ensure that all content present on the page is accessible by keyboard.
Technically, keyboard navigation can be enabled explicitly (caret navigation mode), or it is turned on automatically when the focus moves into an editable area.
A typical web page or web application may contain rich text, structural elements (such as HTML tables) and UI elements, so the content the user deals with can be very complex. Keyboard navigation rules therefore become tricky. Unfortunately, there is no specification that defines the behavior. Browser implementations vary and none are perfect.
As a result, some portions of a web page are not accessible by keyboard alone. For example, it might not be possible to focus UI elements placed between text content when the user navigates through the page. There are no defined rules for navigating the content when floating or absolutely positioned content is encountered on the way. All of this reduces the appeal of keyboard navigation and leads screen readers to disable browser-provided navigation and implement their own.
The purpose of the article is to provide guidelines for how keyboard navigation should work when the user navigates through mixed content on the page.
The rich element term
The keyboard navigation rules are standard when the navigable content is rich text: when the user navigates through it, rich text behaves the same as plain text. However, the behavior is not well defined when the web page contains other content types. In short, these elements can be classified into the following groups.
- integral elements, i.e. elements that don't contain navigable text, or whose text doesn't behave as part of the surrounding text and which can have their own keyboard input processing
  - native form controls like HTML button, HTML select or HTML input
  - static elements like HTML image
  - ARIA controls like HTML:div with @role="button"
- compound elements, i.e. elements whose navigable text is part of the surrounding text
  - native focusable elements (e.g. HTML:a)
  - user-defined focusable elements (e.g. HTML:div with tabindex="0")
  - ARIA controls like @role="link" or @role="textbox"
These elements are referred to as rich elements. All other elements are text or text container elements; this includes plain text, styled text, paragraphs and structural elements (e.g. HTML tables or HTML lists).
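The classification above can be illustrated with a small sketch. It is not part of the proposal: the `ElementInfo` shape, the tag and role sets, and the rule that an explicit role wins over the tag name are all assumptions made for illustration, and the sets only cover the examples named in the list.

```typescript
// Hypothetical, simplified description of an element (not a DOM type).
interface ElementInfo {
  tag: string;        // lower-case tag name, e.g. "button"
  role?: string;      // value of the ARIA role attribute, if any
  tabindex?: number;  // value of the tabindex attribute, if any
}

type ContentKind = "integral" | "compound" | "text";

// Assumed, non-exhaustive sets taken from the examples in the list above.
const INTEGRAL_TAGS = new Set(["button", "select", "input", "img"]);
const INTEGRAL_ROLES = new Set(["button"]);
const COMPOUND_TAGS = new Set(["a"]);
const COMPOUND_ROLES = new Set(["link", "textbox"]);

function classifyElement(el: ElementInfo): ContentKind {
  // Assumption: an explicit ARIA role takes precedence over the tag name.
  if (el.role !== undefined) {
    if (INTEGRAL_ROLES.has(el.role)) return "integral";
    if (COMPOUND_ROLES.has(el.role)) return "compound";
  }
  if (INTEGRAL_TAGS.has(el.tag)) return "integral";
  if (COMPOUND_TAGS.has(el.tag)) return "compound";
  // User-defined focusable element, e.g. a div with tabindex="0".
  if (el.tabindex !== undefined && el.tabindex >= 0) return "compound";
  // Everything else is text or a text container (p, table, ul, ...).
  return "text";
}
```

With these assumptions, `classifyElement({ tag: "div", role: "button", tabindex: 0 })` yields `"integral"`, while a plain `{ tag: "p" }` is classified as `"text"`.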
Overview
Everything should be achievable
The primary idea is to put rich elements into the navigation order together with the rich text, so that the user can reach anything on the page using the keyboard only. This applies to form controls and to generated content (e.g. content of the :after and :before pseudo-elements). The user should be able to select all visible content and copy it to the clipboard.
Navigation order is defined within navigation blocks, which are logical units of navigable content. If the caret is inside a navigation block, the user should navigate the block entirely before the caret moves to the next navigation block in the layout. An editable area is an example of a navigation block: the user should traverse it before the caret moves to the next block. Navigation blocks can be nested.
Details
Navigation order should be defined by navigation blocks. The user should navigate a block entirely before the caret moves to the next block. For example, if two blocks are visually placed next to each other and the user reaches the end of a line in the first block, the caret should move to the next line of the first block.
Navigation blocks are defined by the layout flow. All content in the normal flow is contained in one navigation block, which may itself contain nested blocks. In contrast to normal flow content, floated or absolutely positioned content is represented by its own blocks, i.e. each container where the flow changes is represented by its own navigation block.
Navigation blocks can be nested. For example, if the content contains an editable area, the area is represented by its own block, which is contained in the container's block.
The next block is determined by the page layout. If two blocks overlap, the closer block is used first and the farther one after it. If two blocks occupy the same place, z-index is used: the content underneath is excluded from the navigation order.
Navigation blocks are organized in a tree structure.
For example
<body>
<p>normal multiline paragraph</p>
<p style="float: right; color: blue;">floating multiline paragraph</p>
</body>
which can be presented visually as
| normal multiline | floating multiline |
| paragraph | paragraph |
If the user navigates through the normal paragraph by characters, it should be navigated entirely; the caret should move to the floating paragraph only when the user reaches the end of the normal paragraph. Note that the first line of the floating paragraph ("floating multiline") is visually "next" to the first line of the normal paragraph ("normal multiline").
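The block-by-block order can be sketched as a depth-first walk over the tree of navigation blocks. This is only an illustration: the `NavBlock` type and the choice to store nested blocks inline in the order in which they should be visited are assumptions, since the proposal derives that order from the page layout.

```typescript
// Hypothetical model: a block holds text runs and nested blocks,
// already listed in the order in which they should be visited.
type NavItem = string | NavBlock;

interface NavBlock {
  name: string;
  items: NavItem[];
}

// Returns the text runs in navigation order: every block is
// traversed entirely before the walk continues with what follows it.
function navigationOrder(block: NavBlock): string[] {
  const order: string[] = [];
  for (const item of block.items) {
    if (typeof item === "string") {
      order.push(item);
    } else {
      order.push(...navigationOrder(item));
    }
  }
  return order;
}

// The float example above: the floating paragraph is its own block.
const page: NavBlock = {
  name: "body (normal flow)",
  items: [
    "normal multiline paragraph",
    { name: "float", items: ["floating multiline paragraph"] },
  ],
};
```

Calling `navigationOrder(page)` lists the whole normal paragraph before the floating one, even though their first lines sit side by side on screen.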
Rich element as a lexical unit
A rich element should be treated differently from rich text when the user navigates through a navigable area. The primary idea is to allow the caret to be placed immediately before or after the rich element. This makes it possible to select the rich element even if it is not surrounded by rich text. This requirement also has special meaning for editable areas, where the user should be able to type text before or after the rich element.
To make this happen, special autogenerated empty characters are inserted before and after the rich element. If rich elements are placed one after another, each of them has its own empty characters, i.e. adjacent elements don't share empty characters.
The empty characters are not rendered visually, but they affect keyboard navigation. They designate the element boundaries when the element is treated as a lexical unit; however, their behavior differs from that of word delimiters such as the space character.
Integral element is a word
An integral element should be treated as a word. The term rich word is used to emphasize that the word stands for a rich element.
If the element doesn't contain any navigable text, its word has no characters and is referred to as an empty word; otherwise its word consists of all contained characters and is referred to as a complex word.
For example, an HTML button is an empty word since it doesn't contain navigable text. An HTML input is a complex word since it allows navigable content. As another example, a non-editable container element within an editable area is treated as an empty word if caret navigation mode is turned off.
The integral element is surrounded by empty characters.
In the following example
text<button>btn</button><input value="value">text
the conditional notation can be written as "text|||value|text", where each empty character is marked by the '|' symbol. Both the empty word for the button and the complex word for the input are wrapped by empty characters: the first two '|' symbols belong to the button, and the third and fourth belong to the input.
Compound element as a sentence
Since a compound element behaves as part of the rich text, it should be treated as a sentence that consists of all the words of the contained text. At the same time, it should be possible to set the caret before or after the element. To meet this requirement, empty characters are added before the first word and after the last word of the sentence to designate the sentence boundaries, as is done for rich words. The term rich sentence is used to indicate that the sentence stands for a compound rich element.
For example, the following HTML anchor is treated as a rich sentence consisting of one word
Click <a>here</a> to see news
Conditional notation can be written as "Click |here| to see news".
Another example of a rich sentence is a non-editable container element within an editable area when caret navigation mode is on.
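The conditional notation used in the examples for rich words and rich sentences can be generated mechanically. The sketch below is illustrative only: the `ContentNode` type and the function name are assumptions, and '|' stands for the empty character as in the examples.

```typescript
// Hypothetical flat model of a run of content.
type ContentNode =
  | { kind: "text"; text: string }
  | { kind: "integral"; text: string }  // text is "" for an empty word
  | { kind: "compound"; text: string };

// Wraps every rich element in its own pair of empty characters.
// Adjacent rich elements never share a '|'.
function toConditionalNotation(nodes: ContentNode[]): string {
  return nodes
    .map((n) => (n.kind === "text" ? n.text : `|${n.text}|`))
    .join("");
}

// text<button>btn</button><input value="value">text
const wordExample = toConditionalNotation([
  { kind: "text", text: "text" },
  { kind: "integral", text: "" },       // button: empty word
  { kind: "integral", text: "value" },  // input: complex word
  { kind: "text", text: "text" },
]);

// Click <a>here</a> to see news
const sentenceExample = toConditionalNotation([
  { kind: "text", text: "Click " },
  { kind: "compound", text: "here" },
  { kind: "text", text: " to see news" },
]);
```

Here `wordExample` is `"text|||value|text"` and `sentenceExample` is `"Click |here| to see news"`, matching the notation in the two examples.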
The caret position and selection terms
If the caret position is immediately before the start empty character or immediately after the end empty character of a rich element, the caret is immediately before or after the element, respectively.
If the caret position is between the empty characters of sibling rich elements (i.e. between the end empty character of one and the start empty character of the next), the caret is between the elements.
If the caret position is between the empty characters of an integral element represented by an empty word, the caret is on the element. If the rich element has navigable text and the caret is between its empty characters, the caret is inside the element.
If the selection contains both empty characters of a rich element, the element is selected entirely.
The caret visual position
Since a rich word or sentence is wrapped by special delimiters, the same caret position can be described in two ways: "the caret is immediately before the rich element" and "the caret is immediately after the element preceding the rich element".
If the rich element and the element preceding it are placed visually on different lines, the cursor could logically be rendered in two different locations. The following rules apply.
- if the rich element follows or precedes a text container, the caret is drawn after or before the text, respectively.
- if two rich elements are nested then the caret is drawn after the first rich element.
For example,
hello <div role="button" tabindex="0">button1</div> <div role="button" tabindex="0">button2</div>
Here the cursor is drawn after the 'o' if the caret is before the first ARIA button. If the caret is before the second ARIA button, the cursor is drawn after the first ARIA button.
The empty characters mapping to AT
The empty characters that designate the beginning and the end of a rich word or sentence should be exposed to AT as a specific character, as is done for nested text container accessibles. This character should be a non-pronounceable character so that AT doesn't need any additional special support.
Keyboard interaction with the rich element
A rich element's behavior on keyboard input is the same as usual unless the global keyboard navigation rules conflict with the element's behavior. In that case the element should explicitly prevent the global action, i.e. if the element performs a certain action on the pressed key (e.g. the HTML select element changes the selected option on the up/down keys), it cancels the default event action.
If a compound element is driven by the global navigation rules (e.g. HTML:a or ARIA controls), nothing special needs to be done. However, if an ARIA control handles and processes keyboard events, it should take care to prevent the default action explicitly.
For example, if a text field processes keyboard events to implement caret navigation, it should prevent the default action if and only if the caret can still be moved within the text field. When the caret reaches the end of the text field, the global keyboard navigation applies and the caret moves out of the text field to the next keyboard-navigable content.
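The text field case can be sketched as a small decision function: the field consumes an arrow key while the caret can still move inside it, and lets the default action run once the caret is at the boundary, so that global navigation takes the caret out of the field. The `FieldState` type and function name are hypothetical, and the sketch assumes left-to-right text and character-wise movement.

```typescript
type ArrowKey = "ArrowLeft" | "ArrowRight";

// Hypothetical snapshot of a custom text field.
interface FieldState {
  caretOffset: number;  // 0 .. textLength
  textLength: number;
}

// True when the field should handle the key itself and cancel the
// default action; false when the global navigation should take over.
function shouldPreventDefault(state: FieldState, key: ArrowKey): boolean {
  if (key === "ArrowRight") {
    return state.caretOffset < state.textLength;
  }
  return state.caretOffset > 0;
}
```

In a real keydown handler the result would decide whether to call `event.preventDefault()`. For a five-character field, `shouldPreventDefault({ caretOffset: 5, textLength: 5 }, "ArrowRight")` is `false`, so the caret is free to leave the field.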
The following conditions assume that the user navigates forward by words (ctrl or alt (option) + left/right arrow key, depending on platform and text direction). The rules defined below are inverted if the user moves backward.
Integral element
If the caret is somewhere in the middle of the rich text and an integral element is next on the way (i.e. its rich word is the next word), the caret should be set before the rich element. If the caret is before the integral element, the element should be skipped and the caret should be set before the next word.
For example,
Enter <input value="number"> in pixels.
If the initial caret position is before the 'Enter' word, the caret should move before the input element, then after the input element (before the space preceding the 'in' word) and then before the 'pixels' word.
If the caret is on an integral element and the rich word is empty (i.e. the caret is between its boundary characters), the caret should be set before the next word.
For example, if an HTML button is focused (which is treated as the caret being on the element), the caret should be set immediately after the button.
Compound element
If a compound element is next on the way (i.e. the first word of its sentence is the next word), the caret should be set before the element. If the caret is before the element, it should move to the beginning of the second word of the sentence. If the sentence consists of one word, the caret should move before the word following the sentence.
For example,
See <a>this report</a> for more details.
If the initial caret position is before the 'See' word, the caret should be set before the empty character of the 'this' word, then before the 'report' word and then before the 'for' word.
In this way the compound element is processed as normal text.
If the caret is in the middle of a rich sentence, the user should navigate the whole sentence by words.
For example, if the caret is inside the anchor, the caret should move through the anchor text word by word.
If the caret is inside (or on) a focusable rich element and the caret moves out of the element, the area in which the element is placed should be focused.
If the user navigates through an editable area by characters and a special content element is encountered on the way then
- if the element is focusable then it should be focused
and
- if the element is compound then the caret should be set before the first character of the element's content
- otherwise the caret should be set on the element.
For example, if the control element is disabled, or if caret navigation mode is off and the element is a non-editable area, the caret should be set on the element and the editor should stay focused. Visually this might look like a dashed border around the element.
If a control element is focused and the user navigates by characters then
- the editor should be focused and the caret should be set after the control element if no control action can be performed on the pressed key
- otherwise the control action should be performed.
For example, if two buttons (HTML button) are placed one after another
text<button>btn1</button><button>btn2</button>
then right arrow key presses should traverse "text" by characters, then focus the 1st button, then focus the editor and set the caret between the buttons, then focus the 2nd button, and finally focus the editor and set the caret after the second button.
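The sequence in this example can be written out as a list of steps, one per right arrow key press. This is a rough sketch under stated assumptions: `EditorItem` and `rightArrowSteps` are hypothetical names, every control is assumed to be focusable and to have no action of its own for the right arrow key (as is the case for a button), and the step descriptions are plain strings chosen for readability.

```typescript
// Hypothetical flat model of the content of an editable area.
type EditorItem =
  | { kind: "text"; text: string }
  | { kind: "control"; id: string };

// One entry per right arrow key press, starting with the caret
// before the first item.
function rightArrowSteps(items: EditorItem[]): string[] {
  const steps: string[] = [];
  for (const item of items) {
    if (item.kind === "text") {
      // Plain text is traversed character by character.
      for (let i = 1; i <= item.text.length; i++) {
        steps.push(`caret after "${item.text.slice(0, i)}"`);
      }
    } else {
      // First press focuses the control; the next press returns
      // focus to the editor with the caret right after the control.
      steps.push(`focus ${item.id}`);
      steps.push(`focus editor, caret after ${item.id}`);
    }
  }
  return steps;
}

const twoButtons: EditorItem[] = [
  { kind: "text", text: "text" },
  { kind: "control", id: "btn1" },
  { kind: "control", id: "btn2" },
];
```

For `twoButtons` the function returns eight steps: four for the characters of "text", then focus on btn1, the caret between the buttons, focus on btn2, and the caret after btn2.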
If a compound element is focused and the user navigates by characters, the caret should move character by character to the end of the element's content and then be set after the element, at which point the editor should take the focus.
For example, if an HTML anchor and an HTML button are placed one after another
<a href="">link</a><button>btn</button>
then the caret should move through the "link" text, then be set after the anchor element, and then the button should be focused.
The user should be able to navigate to the beginning/end of
- the current line (for example, Home/End keys)
- the editable area (for example, Home/End keys with the 'alt' modifier key pressed).
If a special content element is focused, pressing tab should navigate to the next focusable element in tab order. The next focusable element in tab order can be either inside or outside the editable area. This requirement applies to both control elements and compound elements.
For example, if an editable area contains two buttons (an HTML button and an ARIA button) and there is one button outside the editable area
<div contentEditable="true"> Text<button>btn1</button>text<div role="button" tabindex="0">btn2</div>text </div> <button>btn3</button>
and the 1st button is focused then pressing tab should move the focus to the 2nd button and then to the 3rd button.
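The tab behaviour in this example can be sketched as a search for the next focusable item in document order that ignores the boundary of the editable area. The `TabStop` type and `nextTabStop` function are hypothetical, and the sketch assumes no positive tabindex values, so tab order equals document order.

```typescript
// Hypothetical flat list of the items in the example, in document order.
interface TabStop {
  id: string;
  focusable: boolean;
  insideEditor: boolean;
}

// Returns the next focusable item after the current one, whether it
// lies inside or outside the editable area.
function nextTabStop(items: TabStop[], currentId: string): TabStop | undefined {
  const start = items.findIndex((i) => i.id === currentId);
  if (start < 0) return undefined;
  return items.slice(start + 1).find((i) => i.focusable);
}

const tabExample: TabStop[] = [
  { id: "text1", focusable: false, insideEditor: true },
  { id: "btn1", focusable: true, insideEditor: true },
  { id: "text2", focusable: false, insideEditor: true },
  { id: "btn2", focusable: true, insideEditor: true },
  { id: "text3", focusable: false, insideEditor: true },
  { id: "btn3", focusable: true, insideEditor: false },
];
```

Starting from `"btn1"`, repeated calls visit `btn2` and then `btn3`, the last of which lies outside the editable area.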
If the editor is focused, pressing tab should either insert a '\t' character (or the equivalent in use) or move the focus outside the editable area to the next tabbable element, depending on editor preferences or platform.
Mouse interaction with rich element
The behaviour of a special content element on mouse input is unchanged as long as the mouse isn't used to change the selection.
For example, if the user clicks on a combobox (HTML:select), the drop-down list appears.
Managing the selection
When an integral element participates in a selection, it should always be selected entirely. If the integral element has its own selection behaviour (like an HTML input) and the user starts the selection inside it while it is focused, there is no way to extend the selection beyond the boundaries of the element.
When a compound element participates in a selection, the editor should provide two opposite options to the user. It should be possible to select the element entirely so that its content is not selected. On the other hand, it should be possible to select the content of the compound element so that the element itself is not selected.
Visually, an entirely selected element might have a blue border around it.
Keyboard selection
When the user holds the selection modifier key (for example, the shift key) and moves through an editable area, the editor stays focused.
At the same time, if an integral element is encountered on the way, the element is added to the selection entirely.
If a compound element is encountered on the way while
- the user moves by words then the element is added to the selection entirely
- the user moves by characters then the element's content is added to the selection character by character until the user reaches the end of the element.
  - If the caret leaves the element then the element is added to the selection entirely.
  - If the user releases the selection modifier key then the compound element is focused while the caret stays inside the element's content.
If the selection is started inside a compound element while it is focused, the editor takes the focus once the caret leaves the compound element's content.