Accessibility/RichContentKeyboardBehaviour
Summary
What this is all about
This article proposes a model of keyboard navigation through web page content. Keyboard navigation matters especially to screen reader users, since the keyboard is the primary tool they use to move through a page. It is therefore very important to ensure that all content on the page is accessible by keyboard.
Technically, keyboard navigation can be enabled explicitly (caret navigation mode), or it is turned on automatically when focus moves into an editable area.
A typical web page or web application may contain rich text, structural elements (such as HTML tables) and UI elements, so the content the user deals with can be very complex, and the keyboard navigation rules become tricky. Unfortunately, there is no specification that defines this behavior. Browser implementations vary and none is perfect.
As a result, some portions of a web page are not accessible by keyboard alone. For example, it might not be possible to focus UI elements placed between text content when the user navigates through the page. No rules define how to navigate when floating or absolutely positioned content is encountered on the way. All of this reduces the appeal of keyboard navigation and leads screen readers to disable the browser-provided navigation and implement their own.
The purpose of this article is to provide a guideline for how keyboard navigation should work when the user navigates through mixed content on the page.
The rich element term
The keyboard navigation rules are standard when the navigable content is rich text: the user navigates rich text the same way as plain text. However, the behavior is not well defined when the web page contains other content types. In short, these elements can be classified into the following groups.
- integral elements, i.e. elements that don't contain navigable text
  - native form controls (e.g. HTML button or HTML select)
  - static elements (e.g. HTML image)
  - ARIA controls (e.g. HTML:div with @role="button")
- compound elements, i.e. elements that contain navigable text
  - native form controls (e.g. HTML input@type="text")
  - native focusable elements (e.g. HTML:a)
  - user-defined focusable elements (e.g. HTML:div with tabindex="0")
  - ARIA controls (e.g. @role="link" or @role="textbox")
These elements are referred to as rich elements. All other elements are text or text container elements; this includes plain text, styled text, paragraphs and structural elements (e.g. HTML tables or HTML lists).
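The grouping above can be sketched as a small lookup. This is only an illustration in Python, not part of any browser API: the `Element` record and the `classify` function are hypothetical names, the tables cover only the examples listed above, and giving an ARIA role priority over the tag name is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Tables limited to the examples named in the list above (assumption:
# a real implementation would need a much fuller mapping).
INTEGRAL_TAGS = {"button", "select", "img"}
COMPOUND_TAGS = {"a"}
INTEGRAL_ROLES = {"button"}
COMPOUND_ROLES = {"link", "textbox"}


@dataclass
class Element:
    """Hypothetical, simplified stand-in for a DOM element."""
    tag: str
    role: Optional[str] = None        # ARIA role, if any
    input_type: Optional[str] = None  # type attribute of <input>
    tabindex: Optional[int] = None


def classify(el: Element) -> str:
    """Return 'integral', 'compound' or 'text' for an element."""
    # Assumption of this sketch: an explicit ARIA role wins over the tag.
    if el.role in INTEGRAL_ROLES:
        return "integral"
    if el.role in COMPOUND_ROLES:
        return "compound"
    if el.tag in INTEGRAL_TAGS:
        return "integral"
    if el.tag in COMPOUND_TAGS:
        return "compound"
    if el.tag == "input" and el.input_type == "text":
        return "compound"
    # User-defined focusable element, e.g. <div tabindex="0">.
    if el.tabindex is not None and el.tabindex >= 0:
        return "compound"
    # Everything else is text or a text container.
    return "text"
```

For instance, `classify(Element("div", role="button"))` yields `"integral"`, while `classify(Element("p"))` yields `"text"`.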
Overview
Everything should be achievable
The primary idea is to put rich elements into the navigation order together with the rich text, so that the user can reach anything on the page using the keyboard only. This applies to form controls and to generated content (e.g. content for the :after and :before pseudo-elements). The user should be able to select any visible content and copy it to the clipboard.
Navigation order is defined within navigation blocks, which are logical unions of navigable content. If the caret is inside a navigation block, the user should navigate the whole block before the caret moves to the next navigation block in the layout. An editable area is an example of a navigation block: the user should traverse it before the caret moves to the next block. Navigation blocks can be nested.
Details
Navigation order should be defined by navigation blocks. The user should navigate a block entirely before the caret moves to the next block. For example, if two blocks are placed visually next to each other and the user reaches the end of a line in the first block, the caret should move to the next line of the first block.
Navigation blocks are defined by layout flow. All content in the normal flow belongs to one navigation block, which may nevertheless contain nested blocks. In contrast, floating content and absolutely positioned content are represented by their own blocks, i.e. each container where the flow changes is represented by its own navigation block.
Navigation blocks can be nested. For example, if the content contains an editable area, the area is represented by its own block, which is contained in the container's block.
The next block is determined by the page layout. If two blocks overlap, the "closer" block is visited first and the "farther" block after it. If two blocks occupy the same place, z-index is used and the bottom content is excluded from the navigation order.
Navigation blocks are organized in a tree structure.
Example
<body>
<p>normal multiline paragraph</p>
<p style="float: right; color: blue;">floating multiline paragraph</p>
</body>
which can be presented visually as
| normal multiline | floating multiline |
| paragraph | paragraph |
If the user navigates through the normal paragraph by characters, it should be navigated entirely; the caret should move to the floating paragraph only when the user reaches the end of the normal paragraph. Note that the first line of the floating paragraph ("floating multiline") is visually "next" to the first line of the normal paragraph ("normal multiline").
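The traversal in this example can be modelled as a walk over the tree of navigation blocks. The sketch below is a simplification under stated assumptions: `NavBlock` and `navigation_order` are hypothetical names, each block stores its own visual lines, the floating paragraph is modelled as a child of the normal-flow block, and nested blocks are visited after the parent's own lines.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NavBlock:
    """Hypothetical navigation block: its own lines plus nested blocks."""
    name: str
    lines: List[str] = field(default_factory=list)
    children: List["NavBlock"] = field(default_factory=list)


def navigation_order(block: NavBlock) -> List[str]:
    """Visit every line of a block before moving on to the next block."""
    order = list(block.lines)
    for child in block.children:
        order.extend(navigation_order(child))
    return order


# The two paragraphs from the example above.
floating = NavBlock("float", ["floating multiline", "paragraph"])
normal = NavBlock("normal flow", ["normal multiline", "paragraph"], [floating])

print(navigation_order(normal))
# ['normal multiline', 'paragraph', 'floating multiline', 'paragraph']
```

Although "floating multiline" sits visually beside "normal multiline", it is reached only after the whole normal paragraph has been traversed.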
Rich element as a lexical unit
Rich element is a word
A rich element should be treated as a word when the user navigates through a navigable area. The idea is to make it possible to put the caret immediately before or after the rich element. This makes it possible to select the rich element when it is not embedded in rich text. The requirement matters especially in editable areas, where the user can write text before or after the rich element.
The term rich word will be used to emphasize that the word is represented by a rich element.
The number of characters in a rich word depends on whether the element is compound or integral, i.e. whether or not it can contain navigable text. An integral element has no characters and will be referred to as an empty word. Otherwise the word contains all characters of the element's content and will be referred to as a compound word.
For example, an HTML button is an empty word since it doesn't contain navigable text. By contrast, an HTML input and an HTML anchor are compound words since both allow navigable content.
As another example, a non-editable container element within an editable area is treated as an empty word if caret navigation mode is turned off. Otherwise it is treated as a compound word consisting of as many words as the non-editable element contains.
Boundaries of the rich word
Rich word boundaries are designated by special autogenerated empty characters, used as word delimiters. These delimiters aren't part of the word.
If rich elements are placed one after another, each of them has its own pair of empty characters, i.e. adjacent elements don't share an empty character.
The empty characters are not rendered visually, but they participate in keyboard navigation as normal characters.
In the following example
text<button>btn</button><a>link</a>text
the conditional notation is "text|||link|text", where each empty character is marked by the '|' symbol. Both the empty word for the button and the compound word for the link are wrapped in empty characters: the first two '|' symbols delimit the button, and the third and fourth delimit the link.
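The notation can be generated mechanically from a sequence of content items. The following is a minimal sketch; the `notation` function and the `(kind, text)` item tuples are hypothetical, and '|' merely stands in for the invisible empty character.

```python
from typing import List, Tuple

DELIM = "|"  # visible stand-in for the invisible empty character


def notation(items: List[Tuple[str, str]]) -> str:
    """Render items as text with rich word delimiters.

    Each item is (kind, text), where kind is 'text', 'integral'
    or 'compound'.
    """
    parts = []
    for kind, text in items:
        if kind == "text":
            parts.append(text)
        elif kind == "integral":
            # Empty word: two delimiters with no characters between them.
            parts.append(DELIM + DELIM)
        elif kind == "compound":
            # Compound word: the element's content wrapped in delimiters.
            parts.append(DELIM + text + DELIM)
        else:
            raise ValueError(f"unknown item kind: {kind}")
    return "".join(parts)


example = [
    ("text", "text"),
    ("integral", "btn"),   # <button>btn</button>
    ("compound", "link"),  # <a>link</a>
    ("text", "text"),
]
print(notation(example))  # text|||link|text
```

Because each element contributes its own pair of delimiters, two adjacent integral elements produce four consecutive '|' symbols rather than three.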
The rich word boundaries mapping to AT
The empty characters that designate the beginning and the end of the rich word should be exposed to AT as a specific character, as is done for nested text container accessibles. This character should be non-pronounceable so that AT doesn't need any additional special support.
The caret position and selection terms
If the caret position is immediately before the left empty character, or immediately after the right empty character, of a rich element, the caret is said to be immediately before or after the element.
If the caret position is between the empty characters of sibling rich elements, the caret is said to be between the elements.
If the caret position is between the empty characters of an integral element, the caret is said to be on the element.
If the selection contains both empty characters of a rich element, the element is said to be selected entirely.
The caret visual position
Since a rich word is wrapped in special delimiters, the same caret position can be described in two ways: "the caret is immediately before the rich element" and "the caret is immediately after the element preceding the rich element".
If the rich element and the element preceding it are placed visually on different lines, the cursor could logically be rendered in two different locations. The following rules apply.
- if the rich element follows or precedes a text container, the caret is drawn after or before the text, respectively.
- if two rich elements are adjacent, the caret is drawn after the first rich element.
For example, given
hello <div role="button">button1</div> <div role="button">button2</div>
the cursor is drawn after the 'o' if the caret is before the first ARIA button. If the caret is before the second ARIA button, the cursor is drawn after the first ARIA button.
Keyboard interaction with the rich element
A rich element responds to keyboard input as usual, unless the global keyboard navigation rules conflict with the element's own behavior. In that case the element should explicitly prevent the global action, i.e. if the element performs its own action on the pressed key (e.g. an HTML select element changes the selected option on the up/down keys), it cancels the default event action.
If a compound element is driven by the global navigation rules (e.g. HTML:a or ARIA controls), nothing special needs to be done. However, if an ARIA control handles and processes keyboard events itself, it should take care to prevent the default action explicitly.
For example, if a text field processes keyboard events to implement caret navigation, it should prevent the default action only while the caret can still be moved within the text field. Then, when the caret reaches the end of the text field, the global keyboard navigation applies and the caret moves out of the text field to the next keyboard-navigable content.
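The decision a self-handling text field has to make can be sketched as a single predicate: the field consumes the key while the caret can still move inside it, and lets the global navigation take over at the boundary. `should_prevent_default` is a hypothetical helper written for this illustration; the key names follow the DOM `KeyboardEvent.key` values, and only the left and right arrows are modelled.

```python
def should_prevent_default(key: str, caret: int, length: int) -> bool:
    """Decide whether a text field should cancel the default key action.

    caret is the offset inside the field (0..length), length is the
    number of characters in the field.
    """
    if key == "ArrowRight":
        # The field moves the caret itself unless it is already at the end.
        return caret < length
    if key == "ArrowLeft":
        # The field moves the caret itself unless it is already at the start.
        return caret > 0
    # Other keys are outside the scope of this sketch.
    return False


# Caret in the middle of a 4-character field: the field handles the key.
print(should_prevent_default("ArrowRight", 2, 4))  # True
# Caret at the end: global navigation moves the caret out of the field.
print(should_prevent_default("ArrowRight", 4, 4))  # False
```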
If the caret is somewhere in the middle of the text, the user navigates by words (Ctrl or Alt (Option) + left/right arrow key, depending on the platform) and a rich element is encountered on the way, the rich element should be skipped, i.e. the caret should move from the position before the element to the position after it.
For example, if the caret is before an HTML button element, the caret should move to the position after the element. If the caret is before an HTML input, the input should be skipped as well. The same applies to the HTML anchor element. Thus any compound element is excluded from the navigation sequence when the user navigates by words.
If the caret is in the middle of a compound element, the user should navigate the compound element entirely.
For example, if the caret is inside table cell text (or anchor text), the caret should move by the words of the cell's text (or the anchor's text).
If the caret is on an integral element (i.e. the word is empty and the caret is between its boundary characters), the caret position should be set after the element.
For example, if an HTML button is focused (which is treated as the caret being on the element), the caret should be set immediately after the button.
If the caret is inside (or on) a focusable rich element and the caret moves out of the element, the area in which the element is placed should be focused.
If the user navigates through an editable area by characters and a special content element is encountered on the way then
- if the element is focusable then it should be focused
and
- if the element is compound then the caret position should be set before the first character of the element's content
- otherwise the caret should be set on the element.
For example, if the control element is disabled, or if caret navigation mode is off and the element is a non-editable area, the caret should be set on the element and the editor should stay focused. Visually this might look like a dashed border around the element.
If a control element is focused and the user navigates by characters then
- the editor should be focused and the caret position should be set after the control element if no control action can be performed on the pressed key
- otherwise the control action should be performed.
For example, if two buttons (HTML button) are placed one after another
text<button>btn1</button><button>btn2</button>
then right arrow key presses should traverse "text" by characters, then focus the 1st button, then focus the editor and set the caret position between the buttons, then focus the 2nd button, and finally focus the editor and set the caret position after the second button.
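The sequence of right arrow presses in this example can be written out as a trace. The sketch below is a simplified model, not an implementation: `right_arrow_trace` is a hypothetical function, the caret is assumed to start before the first character, and every button is assumed to be focusable with no action bound to the right arrow key.

```python
from typing import List, Tuple


def right_arrow_trace(items: List[Tuple[str, str]]) -> List[str]:
    """List the state after each right arrow press.

    Each item is ('text', characters) or ('button', name).
    """
    steps = []
    for kind, value in items:
        if kind == "text":
            # Text is traversed one character per key press.
            for i in range(1, len(value) + 1):
                steps.append(f"editor: caret after '{value[:i]}'")
        elif kind == "button":
            # First press focuses the button (the caret is on the element),
            # the next press returns focus to the editor with the caret
            # placed immediately after the button.
            steps.append(f"focus {value}")
            steps.append(f"editor: caret after {value}")
        else:
            raise ValueError(f"unknown item kind: {kind}")
    return steps


for step in right_arrow_trace([("text", "text"), ("button", "btn1"), ("button", "btn2")]):
    print(step)
```

The trace has eight steps: four for the characters of "text" and two for each button, where "caret after btn1" is the position between the two buttons.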
If a compound element is focused and the user navigates by characters, the caret should move consecutively to the end of the element's content, then be set after the element, and the editor should take the focus.
For example, if an HTML anchor and an HTML button are placed one after another
<a href="">link</a><button>btn</button>
then the caret should move through the "link" text, be set after the anchor element, and then the button should be focused.
The user should be able to navigate to the beginning/end of
- the current line (for example, Home/End keys)
- the editable area (for example, Home/End keys with the 'alt' modifier key pressed).
If a special content element is focused, pressing Tab should navigate to the next focusable element in the tab order. The next focusable element in the tab order can be either inside or outside the editable area. This requirement applies to both control elements and compound elements.
For example, if an editable area contains two buttons (an HTML button and an ARIA button) and there is one button outside the editable area
<div contentEditable="true"> Text<button>btn1</button>text<div role="button" tabindex="0">btn2</div>text </div> <button>btn3</button>
and the 1st button is focused, then pressing Tab should move the focus to the 2nd button and then to the 3rd button.
If the editor is focused, pressing Tab should either insert a '\t' character (or its equivalent) or move the focus out of the editable area to the next tabbable element, depending on editor preferences or the platform.
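The Tab behaviour in this example amounts to walking the focusable elements in document order while ignoring the editable area boundary. A minimal sketch, in which `next_tab_stop` and the item tuples are hypothetical and the tab order is assumed to equal document order:

```python
from typing import List, Optional, Tuple

# (name, focusable, inside_editable_area), listed in document order.
Item = Tuple[str, bool, bool]


def next_tab_stop(items: List[Item], current: str) -> Optional[str]:
    """Return the next focusable element after `current`, if any."""
    # The inside_editable_area flag is deliberately ignored: the editable
    # area boundary does not interrupt the tab order.
    stops = [name for name, focusable, _ in items if focusable]
    i = stops.index(current)
    return stops[i + 1] if i + 1 < len(stops) else None


page = [
    ("Text", False, True),
    ("btn1", True, True),   # <button> inside the editable area
    ("text", False, True),
    ("btn2", True, True),   # <div role="button" tabindex="0">
    ("text", False, True),
    ("btn3", True, False),  # <button> outside the editable area
]
print(next_tab_stop(page, "btn1"))  # btn2
print(next_tab_stop(page, "btn2"))  # btn3
```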
Mouse interaction with rich element
The behaviour of a special content element on mouse input is unchanged as long as the mouse isn't used to change the selection.
For example, if the user clicks on a combobox (HTML:select), the drop-down list appears.
Managing the selection
When an integral element participates in a selection, it should always be selected entirely. If the integral element has its own selection behaviour (like an HTML input) and the user starts a selection inside it while it is focused, there is no way to extend the selection beyond the boundaries of the element.
When a compound element participates in a selection, the editor should offer the user two opposite options. It should be possible to select the element entirely without selecting its content. On the other hand, it should be possible to select the content of the compound element without selecting the element itself.
Visually, an entirely selected element might have a blue border around it.
Keyboard selection
When the user holds a selection modifier key (for example, the Shift key) and moves through an editable area, the editor stays focused.
At the same time, if an integral element is encountered on the way, the element is appended to the selection entirely.
If a compound element is encountered on the way while
- the user moves by words then the element is appended to the selection entirely
- the user moves by characters then the element's content is appended consecutively to the selection until the user reaches the end of the element.
  - If the caret leaves the element then the element is appended to the selection entirely.
  - If the user releases the selection modifier key then the compound element is focused while the caret stays inside the element's content.
If a selection is started inside a compound element while it is focused, the editor takes the focus once the caret leaves the compound element's content.