Platform/HTML5 sanitizer: Difference between revisions
Revision as of 12:34, 11 January 2011
Gecko Requirements
- Allow a setting for enabling styles.
- Allow a setting for enabling comments. See bug 572642.
- Have three element white lists: HTML, SVG and MathML.
- Have three attribute white lists: HTML, SVG and MathML. The attributes don't depend on the element they are on beyond the element namespace.
- Have three lists of attributes that take URLs. Drop the attributes when they have prohibited URLs (after trimming whitespace from the value).
  - Resolve relative URLs into absolute ones using a per-fragment base URL. (Is this correct for the Gecko requirements?)
  - Why is whitespace trimmed before the security check?
  - However, allow any URL in the src attribute on the img element, because imgs are safe.
    - Why risk this?
- Have a list of SVG attributes that take different-document references.
- Have a list of SVG attributes that are allowed to have same-document references only.
- If styles are allowed, sanitize style attribute values. If styles aren't allowed, drop the style attribute.
- Always drop script and title elements and their contents.
- If styles are disabled, drop style elements and their contents.
- If styles are enabled, sanitize the content of style elements.
- Add the controls attribute to the video and audio elements (if it isn't there already).
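
The attribute-related requirements above could be sketched roughly as follows. This is an illustrative Python sketch, not the Gecko implementation: the list contents, the `ALLOWED_SCHEMES` set, and the function names `is_url_allowed`, `sanitize_attributes` and `is_element_allowed` are all assumptions made for the example. It also assumes the resolved absolute URL is used only for the check and that the original attribute value is kept, which the requirements leave open.

```python
from urllib.parse import urljoin, urlsplit

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical, deliberately tiny lists; real white lists would be much longer.
ALLOWED_ELEMENTS = {
    HTML_NS: {"a", "p", "img", "video", "audio"},
    SVG_NS: {"svg", "a", "circle"},
    MATHML_NS: {"math", "mi", "mo"},
}
# Attributes depend only on the element namespace, not on the element itself.
ALLOWED_ATTRIBUTES = {
    HTML_NS: {"href", "src", "alt", "controls"},
    SVG_NS: {"href", "r", "cx", "cy"},
    MATHML_NS: {"href", "mathvariant"},
}
URL_ATTRIBUTES = {
    HTML_NS: {"href", "src"},
    SVG_NS: {"href"},
    MATHML_NS: {"href"},
}
ALLOWED_SCHEMES = {"http", "https", "mailto"}  # assumption, not from the spec
ALWAYS_DROPPED = {"script", "title"}


def is_url_allowed(value, base_url):
    """Trim whitespace, resolve against the per-fragment base, check the scheme."""
    absolute = urljoin(base_url, value.strip())
    return urlsplit(absolute).scheme.lower() in ALLOWED_SCHEMES


def is_element_allowed(namespace, element):
    """script and title are always dropped; otherwise consult the white list."""
    if element in ALWAYS_DROPPED:
        return False
    return element in ALLOWED_ELEMENTS.get(namespace, set())


def sanitize_attributes(namespace, element, attrs, base_url):
    """Return the attributes that survive sanitization for one element."""
    kept = {}
    for name, value in attrs.items():
        if name not in ALLOWED_ATTRIBUTES.get(namespace, set()):
            continue  # not on the white list for this namespace
        if name in URL_ATTRIBUTES.get(namespace, set()):
            # The img src exemption from the requirements above.
            is_img_src = (
                namespace == HTML_NS and element == "img" and name == "src"
            )
            if not is_img_src and not is_url_allowed(value, base_url):
                continue  # prohibited URL: drop the attribute
        kept[name] = value
    # video and audio always get a controls attribute.
    if namespace == HTML_NS and element in {"video", "audio"}:
        kept.setdefault("controls", "")
    return kept
```

With a base URL of `http://example.com/dir/`, a relative `href` such as `page.html` resolves to an `http` URL and is kept, while a `javascript:` URL padded with spaces is trimmed first and then dropped.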
Open Questions
- Can stylistic SVG attributes have values that need to be sanitized?
- Can stylistic MathML attributes have values that need to be sanitized?
- Should element whitelisting take place after the tree builder algorithm so that the namespace of the element is known?
  - Likely yes.
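
To illustrate why the namespace matters for the last question: the same local name (for example `a`) can occur in both HTML and SVG, and only a built tree tells them apart. The sketch below uses Python's XML parser purely as a stand-in for the HTML5 tree builder, which assigns namespaces differently; the function names `split_tag` and `namespaced_names` are hypothetical.

```python
import xml.etree.ElementTree as ET

HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    namespace, _, local = tag[1:].partition("}")
    return namespace, local


def namespaced_names(markup):
    """Parse markup and list (namespace, local name) pairs in document order."""
    root = ET.fromstring(markup)
    return [split_tag(element.tag) for element in root.iter()]


markup = (
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    "<a/>"
    '<svg xmlns="http://www.w3.org/2000/svg"><a/><circle/></svg>'
    "</div>"
)
pairs = namespaced_names(markup)
```

Here the two `a` elements come out with different namespaces, so a white list keyed on (namespace, local name) can treat them differently, which a filter that sees only tag names cannot do.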
Non-Gecko Requirements
- Allow form-related elements to be toggled on and off in the white list.
- Allow using the sanitizer in non-fragment mode (in which case, the title element should be allowed).
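
One possible way to model these two toggles is an options object from which the effective HTML element white list is derived. This is only a sketch under assumptions: `SanitizerOptions`, `effective_html_elements`, and the contents of both element sets are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical, abbreviated lists for illustration only.
BASE_HTML_ELEMENTS = frozenset({"a", "p", "img"})
FORM_ELEMENTS = frozenset(
    {"form", "input", "button", "select", "option", "textarea", "label"}
)


@dataclass(frozen=True)
class SanitizerOptions:
    allow_forms: bool = False    # toggle form-related elements on or off
    fragment_mode: bool = True   # False means a whole document is sanitized


def effective_html_elements(options):
    """Build the HTML element white list for the given options."""
    allowed = set(BASE_HTML_ELEMENTS)
    if options.allow_forms:
        allowed |= FORM_ELEMENTS
    if not options.fragment_mode:
        # In non-fragment mode the title element should be allowed.
        allowed.add("title")
    return allowed
```

Keeping the toggles in one immutable options value means the white list can be computed once per sanitizer run instead of being checked against flags for every element.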