MDN/Archives/Development/CompatibilityTables/Importer

Note: If you came to this page from a link on https://browsercompat.herokuapp.com/importer, then it may be that additional documentation has not been written for the importer issue. Feel free to add a Level 3 heading for your issue.

General Information

It is not enough to create a data store for compatibility data. It also needs to be populated with structured data. We've decided to start with MDN data, rather than starting from scratch or from another existing data source. An MDN data importer is now part of the browsercompat project, and is live at https://browsercompat.herokuapp.com/importer/.

This tool is used to generate and update the data store. However, a number of pages do not scrape properly because of formatting issues or similar problems, and work is ongoing to fix them. We could use your help! Please see the "When fixing an error" section below for an introduction to how to help.

Expected MDN content

The importer works with the raw versions of pages, which contain HTML with KumaScript tags. For example, the MDN page about the HTML <p> element is:

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p

and the raw version of the page is:

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p?raw

You can also see the raw version by editing a page and selecting the "Source" button in the upper left corner.
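As a small illustration of the URL pattern above, here is a sketch that derives the raw URL from a page URL. The function name raw_url is made up for this example, and it assumes the raw view is always selected by a bare "raw" query string, as in the <p> example.

```python
from urllib.parse import urlsplit, urlunsplit


def raw_url(page_url: str) -> str:
    """Return the URL of the raw version of an MDN page.

    Assumption: the raw view is selected by a bare "raw" query string.
    Any existing query string or fragment on the input is dropped.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "raw", ""))


print(raw_url("https://developer.mozilla.org/en-US/docs/Web/HTML/Element/p"))
```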

The importer is expecting a page with:

  • Page content with section headers
  • A Specifications section header
  • A table of specifications with {{SpecName}} and {{Spec2}} macros
  • A Browser Compatibility section header
  • A table of desktop browsers, features, and the browser's support for those features
  • A similar table of mobile browsers
  • (Optional) Footnotes referenced from the tables
  • (Optional) Another section header, with additional content

Here's an example (most pages are more complex):

<h2 id="Summary">Summary</h2>
<!-- ... Other content .... -->
<h2 id="Specifications" name="Specifications">Specifications</h2>
<table class="standard-table">
 <thead>
  <tr>
   <th scope="col">Specification</th>
   <th scope="col">Status</th>
   <th scope="col">Comment</th>
  </tr>
 </thead>
 <tbody>
  <tr>
   <td>{{SpecName('HTML WHATWG', 'grouping-content.html#the-p-element', '<p>')}}</td>
   <td>{{Spec2('HTML WHATWG')}}</td>
   <td> </td>
  </tr>
 </tbody>
</table>
<h2 id="Browser_compatibility" name="Browser_compatibility">Browser compatibility</h2>
<div>
 {{CompatibilityTable}}</div>
<div id="compat-desktop">
 <table class="compat-table">
  <tbody>
   <tr>
    <th>Feature</th>
    <th>Chrome</th>
    <th>Firefox (Gecko)</th>
   </tr>
   <tr>
    <td>Basic support</td>
    <td>1.0</td>
    <td>{{CompatGeckoDesktop("1.0")}} [1]</td>
   </tr>
  </tbody>
 </table>
</div>
<div id="compat-mobile">
 <table class="compat-table">
  <tbody>
   <tr>
    <th>Feature</th>
    <th>Android</th>
    <th>Firefox Mobile (Gecko)</th>
   </tr>
   <tr>
    <td>Basic support</td>
    <td>{{CompatVersionUnknown}}</td>
    <td>{{CompatGeckoMobile("1.0")}}</td>
   </tr>
  </tbody>
 </table>
</div>
<p>[1] This is a footnote</p>
<h2 id="See_also">See also</h2>
<!-- ... Rest of content ... -->

The importer is flexible about whitespace and some common MDN alternate patterns, but this flexibility has to be built in. If the page uses valid but unexpected HTML, the importer will fail, usually with a "no data" critical error.
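Before resetting a page, it can help to check that the landmarks from the example are present. The sketch below is not the importer's parser; it only looks for the two <h2> ids and the two compat <div> ids shown in the example, and the names SectionScanner and missing_parts are invented for this illustration. A page can pass this check and still fail to import.

```python
from html.parser import HTMLParser


class SectionScanner(HTMLParser):
    """Collect the ids of <h2> headers and <div> elements in a raw page."""

    def __init__(self):
        super().__init__()
        self.h2_ids = []
        self.div_ids = []

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        if element_id is None:
            return
        if tag == "h2":
            self.h2_ids.append(element_id)
        elif tag == "div":
            self.div_ids.append(element_id)


def missing_parts(raw_html: str) -> list:
    """List the expected landmarks that the raw page does not contain."""
    scanner = SectionScanner()
    scanner.feed(raw_html)
    missing = []
    for h2_id in ("Specifications", "Browser_compatibility"):
        if h2_id not in scanner.h2_ids:
            missing.append("h2#" + h2_id)
    for div_id in ("compat-desktop", "compat-mobile"):
        if div_id not in scanner.div_ids:
            missing.append("div#" + div_id)
    return missing
```

An empty list means all four landmarks were found; anything else names what to look for first on the MDN page.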

When fixing an error

If you want to help fix these errors, the best place to find an error to fix is the importer issues page. Click on the links to go to the Parse results page for each page that raised an issue.

On these pages you will see a variety of data about the import status of each MDN page's compat data. The most useful items are "MDN URL", "Issues", and the "Reset" button under "Actions".

You need to read the "Issues" information to find out what the problem is. Look up the issue code in "The Issues" section of this page, below, to find out how to tackle fixing the problem.

Click on the link in the "MDN URL" section to go directly to the MDN page that has a problem, and edit the page to try to fix the problem.

Next, click the "Reset" button to get the system to re-download and re-parse the MDN page. If you successfully fixed the problem, the "Issues" section should list a message of "None detected." If this isn't the case, repeat the process and try again.

If you had to work out the fix on your own, please update the section of this page for that issue with a bullet point explaining your solution, so that it can help others in the future.

If you can't fix an error, contact the MDN writers about it in the #mdn channel on IRC (also available via webchat), or on the dev-mdc mailing list.

Testing New Tables on MDN Pages

Starting October 2015, we are converting some pages to include new compatibility tables generated from API-backed data. These will only be visible to select MDN staff and collaborators. The purpose is to:

  • Discover bugs in the importer, API, compatibility tables, and the conversion process,
  • Enable MDN staff to make informed decisions on priorities and deliverables, and
  • Define and deploy the features needed to open the new tables to beta users.

We are explicitly deferring some features (such as non-English variants, scaling for production, and client-side refreshes) while the basic functionality is tested and improved. We are only converting a handful of pages, since editing will be buggy and difficult for months, and converted pages may need to be manually processed due to design changes.

Prerequisites

The new tables are only available to select MDN staff and collaborators. To see if you are in this group:

  1. Sign into MDN (Anonymous users get the old compatibility tables)
  2. Go to the Browser Compatibility section of Web/CSS/background. If you see the new table (green and red boxes, icon headers), then you are in the testing group on MDN.
  3. Sign into the BrowserCompat website
  4. Go to the importer page for /Web/CSS/background. If there is a "Commit" button under actions, you have the import-mdn permission needed to convert an MDN page.

If you want to be part of the testing group, the #browsercompat IRC channel is a good place to start the discussion.

Identifying a Converted Page

Here are the components of a converted MDN page, using Web/CSS/Background as an example:

  1. The page appears in search results for the EmbedCompatTable macro
  2. The BrowserCompat importer page shows no errors (some warnings are permitted).
  3. The imported data has been committed to the API, which can be confirmed by following the "Feature ID" link at the top of the page (655 for Web/CSS/Background). A committed feature will show Specifications and Browser compatibility data, similar to the importer page. An un-committed feature will show no data.
  4. The source of the MDN page has a {{CompatibilityTable}} macro, HTML tables, and an {{EmbedCompatTable("slug")}} macro.
  5. When logged in as a test user, the new API-backed table is displayed, including color-coded support information, icons for browsers, and a legend.
  6. When visiting as an anonymous user (such as browsing in private mode or an alternate browser), the traditional MDN compatibility table is displayed.

Convert an MDN Page

  1. Find a suitable MDN page. It should import with no errors, have good compatibility data and footnotes, and describe an older technology that isn't expected to change much in the next 6 months.
  2. If you haven't already, sign into the BrowserCompat website.
  3. Find the importer page for the MDN page. You can paste the URL in the search box to find it directly, or use the Topic filters and browse the page list. You may want to bookmark the page or write down the importer ID for future reference.
  4. On the importer page in the Actions section, click the "Reset" button to download and parse the latest page.
  5. Look for import errors or problems with the compatibility data. If needed, fix the MDN page, and click "Reset" again, repeating until issues are resolved.
  6. In the Actions section, click the "Commit" button to add the parsed data to the API.
  7. In the Raw Data section, take note of the API id and the slug.
  8. On the MDN page, edit the page and scroll to the Browser Compatibility section. Add an {{EmbedCompatTable("slug")}} macro (replacing "slug" with the quoted slug from the importer page) just before the next section (usually <h2>See also</h2>), after the compatibility tables and any footnotes. Do not remove the {{CompatibilityTable}} macro at the top of the section, which is needed for the traditional display.
  9. Save the page, and refresh until the new page is rendered.
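The placement rule in step 8 can be sketched as a text transformation on the raw page source. This is only an illustration of where the macro goes, not a tool that exists in the project; the function name add_embed_macro is invented, the slug in the usage line is a placeholder, and the sketch assumes the section header carries id="Browser_compatibility" as in the earlier example.

```python
import re


def add_embed_macro(raw_html: str, slug: str) -> str:
    """Insert an EmbedCompatTable macro just before the <h2> that follows
    the Browser compatibility section, leaving everything else untouched."""
    if "EmbedCompatTable" in raw_html:
        return raw_html  # already converted; do not add a second macro
    compat = re.search(r'<h2[^>]*id="Browser_compatibility"', raw_html)
    if compat is None:
        raise ValueError("no Browser compatibility section found")
    macro = '{{EmbedCompatTable("%s")}}\n' % slug
    next_h2 = re.search(r"<h2\b", raw_html[compat.end():])
    if next_h2 is None:
        # The compatibility section is the last one on the page.
        return raw_html.rstrip("\n") + "\n" + macro
    index = compat.end() + next_h2.start()
    return raw_html[:index] + macro + raw_html[index:]


page = (
    '<h2 id="Browser_compatibility">Browser compatibility</h2>\n'
    "<p>[1] This is a footnote</p>\n"
    '<h2 id="See_also">See also</h2>\n'
)
print(add_embed_macro(page, "placeholder-slug"))
```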

Update Compatibility Data on a Converted Page

  1. Edit the compatibility data in the traditional tables on the MDN page, and save your changes. This will update the traditional tables, but the API-backed tables will show the old data.
  2. Sign into the BrowserCompat website.
  3. Find the importer page for the MDN page.
  4. On the importer page in the Actions section, click the "Reset" button to download and parse the updated page.
  5. Look for import errors or problems with the compatibility data. If needed, fix the MDN page, and click "Reset" again, repeating until issues are resolved.
  6. In the Actions section, click the "Commit" button to update the API with the new data.
  7. Return to the MDN page and force-refresh it to re-run the macros.
  8. Wait until the new page, with updated API data, is rendered.

The Issues

The importer identifies classes of issues with a slug, a short span of text. The sections below use those slugs as the title, so that we can link directly from the importer to advice for handling that class of issue.

compatgeckodesktop_unknown

Error template:
Unknown Gecko version "{version}"
The importer does not recognize this version for CompatGeckoDesktop. Change the MDN page or update the importer.

Possible solutions:

  • You need to make sure you use the correct macro call: CompatGeckoDesktop(" ... ")
  • The "..." above needs to be replaced with the version of Gecko you want listed.
  • The version of Gecko needs to exist! To check that it exists if you are not sure, check for it on the Firefox Developer Release Notes page.
  • If you want to state Firefox 3.5, the version number you need to enter is actually 1.9.1. See Element.querySelector.
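The Firefox-to-Gecko translation mentioned in the last bullet can be sketched as a lookup. Only the 3.5 to 1.9.1 pair comes from this page; the other rows and the rule that Gecko and Firefox version numbers match from Firefox 5 onward are filled in from general knowledge of Firefox releases, so check them against the Firefox Developer Release Notes. The names LEGACY_GECKO and gecko_version are made up for this example.

```python
# Firefox releases before 5 shipped Gecko versions with different numbers.
# Illustrative table, not the importer's own data.
LEGACY_GECKO = {
    "1.0": "1.7",
    "1.5": "1.8",
    "2.0": "1.8.1",
    "3.0": "1.9",
    "3.5": "1.9.1",
    "3.6": "1.9.2",
    "4.0": "2.0",
}


def gecko_version(firefox_version: str) -> str:
    """Return the Gecko version to pass to CompatGeckoDesktop."""
    if firefox_version in LEGACY_GECKO:
        return LEGACY_GECKO[firefox_version]
    major = int(firefox_version.split(".")[0])
    if major < 5:
        raise ValueError("no Gecko version known for Firefox " + firefox_version)
    # Assumption: from Firefox 5 on, the Gecko version equals the Firefox version.
    return firefox_version


print('{{CompatGeckoDesktop("%s")}}' % gecko_version("3.5"))
```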

exception

Error template:
Unhandled exception
{traceback}

  • Retry the scrape at a later time. If it continues, file a bug in Bugzilla.

failed_download

Error template:
Failed to download {url}.
Status {status}, Content: {text}

  • Retry the scrape at a later time. If it continues, file a bug in Bugzilla.

footnote_feature

Error template:
Footnotes are not allowed on features
The Feature model does not include a notes field. Remove the footnote from the feature.

  • On the MDN page, remove the footnote reference from the feature and remove the footnote.


footnote_missing

Error template:
Footnote [{footnote_id}] not found.
The compatibility table has a reference to footnote "{footnote_id}", but no matching footnote was found. This may be due to parse issues in the footnotes section, a typo in the MDN page, or a footnote that was removed without removing the footnote reference from the table.

  • On the MDN page, fix the footnote or remove the footnote reference.


footnote_multiple

Error template:
Only one footnote allowed per compatibility cell.
The API supports only one footnote per support assertion. Combine footnotes [{prev_footnote_id}] and [{footnote_id}], or remove one of them.

Hints:

On Web/API/Text/replaceWholeText, Basic Support for Chrome had:

<td>
  {{CompatVersionUnknown}} [1] [2]
</td>
...
<p>[1] Chrome 41 has removed this method.</p>
<p>[2] Before Chrome 30 and Opera 17, the argument wasn't mandatory, like required by the specs.</p>

If both footnotes were relevant, then they could be combined. However, like most cases, these footnotes are used to document changing support across versions. In the Compatibility API, each footnote should refer to the single version where support changed.

The changed version looks like this:

<td>
  {{CompatVersionUnknown}} [1] <br>
  30.0 <br>
  {{CompatNo}} 41.0
</td>
...
<p>[1] Before Chrome 30, the argument wasn't mandatory, like required by the specs.</p>
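To find cells that will trigger this issue before running the importer, a rough check is to count footnote references per cell. This sketch assumes references are written as a number in square brackets, like [1]; the importer's own matching may be broader. The helper names are invented for the example.

```python
import re

# A footnote reference is assumed to be digits in square brackets, e.g. [1].
FOOTNOTE_REF = re.compile(r"\[(\d+)\]")


def footnote_refs(cell_text: str) -> list:
    """Return the footnote ids referenced in one compatibility cell."""
    return FOOTNOTE_REF.findall(cell_text)


def has_multiple_footnotes(cell_text: str) -> bool:
    """True if the cell would raise footnote_multiple under this assumption."""
    return len(footnote_refs(cell_text)) > 1


before = "{{CompatVersionUnknown}} [1] [2]"
after = "{{CompatVersionUnknown}} [1] <br>\n  30.0 <br>\n  {{CompatNo}} 41.0"
print(has_multiple_footnotes(before), has_multiple_footnotes(after))
```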

footnote_no_id

Error template:
Footnote has no ID.
Footnote references, such as [1], are used to link the footnote to the support assertion in the compatibility table. Reformat the MDN page to use footnote references.

  • If the text is a footnote, add a footnote reference
  • If the text is not a footnote, move it or remove it
  • If the text is not Compatibility Data, you may need to add a section header, like <h3>See Also</h3>


footnote_unused

Error template:
Footnote [{footnote_id}] is unused.
No cells in the compatibility table included the footnote reference [{footnote_id}]. This could be due to an issue importing the compatibility cell, a typo on the MDN page, or an extra footnote that should be removed from the MDN page.

Hints:

  • Take care of any footnote_multiple issues, which usually result in a footnote_unused warning as well.
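The footnote_missing and footnote_unused issues are two sides of the same comparison: references in the tables versus footnotes defined after them. A minimal sketch of that comparison follows, assuming references look like [1] and each footnote is a <p> that starts with its reference, as in the example page. The function name check_footnotes is hypothetical.

```python
import re


def check_footnotes(tables_html: str, footnotes_html: str):
    """Return (missing, unused) footnote ids as sorted lists of strings.

    missing: referenced in the tables but not defined as a footnote
    unused:  defined as a footnote but never referenced in the tables
    """
    referenced = set(re.findall(r"\[(\d+)\]", tables_html))
    defined = set(re.findall(r"<p>\s*\[(\d+)\]", footnotes_html))
    missing = sorted(referenced - defined, key=int)
    unused = sorted(defined - referenced, key=int)
    return missing, unused


tables = "<td>1.0 [1]</td><td>{{CompatVersionUnknown}} [3]</td>"
footnotes = "<p>[1] First note</p><p>[2] Second note</p>"
print(check_footnotes(tables, footnotes))
```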

inline_text

Error template:
Unknown inline support text "{text}".
The API schema does not include inline notes. This text needs to be converted to a footnote, converted to a support attribute (which may require an importer update), or removed.

Possible solutions:

  • A common case that causes this error is where you insert a browser version number or compatibility macro into a Browser compatibility table, but then want to include some supporting data or information about an edge case or quirk next to or just below it. The correct way to deal with this is to insert the extra information as a footnote — see the Element.querySelector Browser compat table for an example of correct usage.
  • Prose descriptions of changing support should be modified to use the compatibility KumaScript macros. For example, Web/API/Coordinates included the saga of changing support in Opera:
<td>
    10.60 <br>
    Removed in 15.0 <br>
    Reintroduced in 16.0
</td>

This changes to:

<td>
    10.60 <br>
    {{CompatNo}} 15.0 <br>
    16.0
</td>
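The rewrite shown above follows a simple pattern that can be sketched line by line. This sketch only knows the two phrasings from the Opera example ("Removed in" and "Reintroduced in"); any other prose is returned unchanged and still needs manual editing. The function name prose_to_macros is made up for illustration.

```python
import re

VERSION = r"(\d+(?:\.\d+)*)"


def prose_to_macros(line: str) -> str:
    """Rewrite one line of support prose into the form the importer expects."""
    line = line.strip()
    removed = re.fullmatch("Removed in " + VERSION, line)
    if removed:
        # Support ended in this version.
        return "{{CompatNo}} " + removed.group(1)
    reintroduced = re.fullmatch("Reintroduced in " + VERSION, line)
    if reintroduced:
        # Support resumed; a bare version number is enough.
        return reintroduced.group(1)
    return line


cell = ["10.60", "Removed in 15.0", "Reintroduced in 16.0"]
print(" <br>\n".join(prose_to_macros(line) for line in cell))
```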

kumascript_wrong_args

Error template:
Bad argument count in KumaScript {kumascript} in compatibility feature.
The importer expected {name} to have {expected arguments}, but it had {actual arguments}

  • Change the MDN page so that the KumaScript has the expected number of arguments
  • If the argument count is correct, file a bug to fix the importer.
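To see how many arguments a macro call actually has, it helps to split the argument list on commas that sit outside quotes. The sketch below does that with a small hand-written scanner. The EXPECTED_ARGS limits are assumptions chosen for illustration, not the importer's rules, and all names here are hypothetical.

```python
import re

# Expected (minimum, maximum) argument counts. Assumed values for illustration.
EXPECTED_ARGS = {
    "SpecName": (1, 3),
    "Spec2": (1, 1),
    "CompatGeckoDesktop": (1, 1),
}

# Matches {{Name}} and {{Name(args)}}; args may contain parentheses.
MACRO = re.compile(r"\{\{\s*(\w+)\s*(?:\((.*?)\))?\s*\}\}")


def split_args(arg_text: str) -> list:
    """Split a macro argument string on commas that are outside quotes."""
    if not arg_text.strip():
        return []
    args, current, quote = [], [], None
    for char in arg_text:
        if quote:
            if char == quote:
                quote = None
            else:
                current.append(char)
        elif char in "'\"":
            quote = char
        elif char == ",":
            args.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    args.append("".join(current).strip())
    return args


def macro_calls(text: str) -> list:
    """Return (name, args) for every KumaScript call found in text."""
    return [(m.group(1), split_args(m.group(2) or "")) for m in MACRO.finditer(text)]


def wrong_arg_counts(text: str) -> list:
    """Return (name, actual_count) for calls outside the expected range."""
    problems = []
    for name, args in macro_calls(text):
        low, high = EXPECTED_ARGS.get(name, (0, len(args)))
        if not low <= len(args) <= high:
            problems.append((name, len(args)))
    return problems
```

Macros that are not listed in EXPECTED_ARGS are never reported, so the check stays quiet about calls it knows nothing about.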

missing_attribute

Error template:
The tag {tag} is missing the expected attribute {attribute}
Add the missing attribute or convert the tag to plain text.

  • The importer does not handle <a> tags used as link targets, such as <a name="compat_hint1">. Remove the <a> tag from the footnote and the compatibility table, leaving plain footnote references like "[1]".

no_data

Error template:
No data was extracted from the page.
The page appears to have data, but nothing was extracted. Check for header sections wrapped in a <div> or other element. (Context will probably not highlight the issue)

  • On the MDN page, look for a <div> or other element that wraps sections and remove it.
  • If the page doesn't actually have specification or compatibility data, file a bug.

section_missed

Error template:
Section <h2>{title}</h2> was not imported.
The import of section {title} failed, but no parse error was detected. This is usually because of a previous critical error, which must be cleared before any parsing can be attempted.

  • Fix other errors. If this remains and you can not determine the cause, file a bug.

section_skipped

Error template:
Section <h2>{title}</h2> has unexpected content.
The parser was trying to match rule "{rule_name}", but was unable to understand some unexpected content. This may be markup or text, or a side-effect of previous issues. Look closely at the context (as well as any previous issues) to find the problem content.

Possible solutions:

  • Often, this is caused by the use of plain text or HTML instead of an expected KumaScript macro. For instance, specification tables should be using the SpecName and Spec2 macros instead of specifying text directly. If there is no specification for the described feature, you need to wrap the text into WhyNoSpecStart and WhyNoSpecEnd macros.
  • When the proper specification tables and macros are not used, and a simple link to the spec is provided instead, follow these steps to resolve the issue (the example fixed while writing these steps was Timeranges.start()):
    • Copy a proper spec table from a reliable source, for example the Fetch API spec table
    • Paste this into the "Specifications" section of the problem page.
    • Replace the specification identifying name in the SpecName(' ... ') and Spec2(' ... ') templates with the name of the spec where the feature is specified. You can look up what name to use for that particular spec in the SpecName template page. For example, Timeranges.start() is specified in the WHATWG HTML Living Standard. In the SpecName template its name is listed as 'HTML WHATWG', so that's what you'll need to use.
    • If the page you are fixing is for a specific API landing page, the above steps should be enough. If the page is for a specific feature like a property or method, keep reading!
    • The SpecName(' ... ') template can take two more arguments. The second argument is the URL fragment that, when appended to the spec's base URL, points to the exact feature in the spec. For example, the HTML WHATWG spec's base URL is https://html.spec.whatwg.org/multipage/, and the URL of the Timeranges.start() method is https://html.spec.whatwg.org/multipage/embedded-content.html#dom-timeranges-start, so the second argument needs to contain 'embedded-content.html#dom-timeranges-start'.
    • The third argument needs to contain a human-readable name for the feature, in this case 'start()'.
    • The full template call is SpecName('HTML WHATWG','embedded-content.html#dom-timeranges-start','start()')
  • The "Browser compatibility table" should be structured just like the one on the Fetch API landing page. If a different kind of table is being used, replace it with a table of this structure.
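The SpecName steps above can be sketched as two small helpers: one that assembles the macro call and one that shows the URL the second argument should resolve to. The SPEC_BASE_URLS table holds only the HTML WHATWG entry quoted in the steps; the helper names are invented, and the real base URLs live in the SpecName template on MDN.

```python
# Base URL per spec key. Only this entry is taken from the steps above.
SPEC_BASE_URLS = {"HTML WHATWG": "https://html.spec.whatwg.org/multipage/"}


def spec_name_macro(key: str, subpath: str = "", name: str = "") -> str:
    """Build a SpecName call; subpath and name are the optional arguments."""
    args = [key]
    if subpath or name:
        args += [subpath, name]
    return "{{SpecName(" + ", ".join("'" + arg + "'" for arg in args) + ")}}"


def spec2_macro(key: str) -> str:
    """Build the matching Spec2 call for the status column."""
    return "{{Spec2('" + key + "')}}"


def spec_url(key: str, subpath: str) -> str:
    """Show the URL that the key and subpath are expected to point to."""
    return SPEC_BASE_URLS[key] + subpath


subpath = "embedded-content.html#dom-timeranges-start"
print(spec_name_macro("HTML WHATWG", subpath, "start()"))
print(spec2_macro("HTML WHATWG"))
print(spec_url("HTML WHATWG", subpath))
```

Opening the printed URL in a browser is a quick way to confirm that the second argument lands on the right part of the spec.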

skipped_content

Error template:
Content will not be imported.
This content will not be imported into the API.

  • If the content in its current format is important to the presentation, ignore this warning.
  • If the content could be expressed as one or more support footnotes, convert it to footnotes
  • If the content could exist outside of a specification or compatibility section, move it
  • If the content is explaining why there is no specification information, it can be wrapped in {{WhyNoSpecStart}}/{{WhyNoSpecEnd}} blocks.

skipped_h3

Error template:
<h3>{name}</h3> was not imported.
<h3> subsections are usually prose compatibility information, and anything after an <h3> is not parsed or imported. Convert to footnotes or move to a different <h2> section.

  • If the data can be expressed as footnotes, change to footnotes to import it.
  • If the data can not be expressed as footnotes, move it to a different section.


spec2_converted

Error template:
Specification status should be converted to KumaScript
Expected KumaScript {{Spec2("")}}, but got text "{name}".

  • Convert the MDN page to use {{Spec2()}}

spec_h2_id

Error template:
Expected <h2 id="Specifications">, actual id={{h2_id}}
Fix the id so that the table of contents, other feature work.

  • Fix the ID on the MDN page

spec_h2_name

Error template:
Expected <h2 name="Specifications">, actual name={{h2_name}}
Fix or remove the name attribute.

  • Fix the name on the MDN page


spec_mismatch

Error template:
SpecName({specname_key}, ...) does not match Spec2({spec2_key}).
SpecName and Spec2 must refer to the same mdn_key. Update the MDN page.

  • Update the MDN page to make the two macros agree.
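A quick way to spot the mismatch is to pull the first argument out of both macros in a table row and compare them. This sketch assumes the key is a quoted first argument, as in the examples on this page; the pattern and function names are invented for illustration.

```python
import re

# Capture the quoted first argument of each macro.
SPECNAME_KEY = re.compile(r"""\{\{\s*SpecName\(\s*['"]([^'"]+)['"]""")
SPEC2_KEY = re.compile(r"""\{\{\s*Spec2\(\s*['"]([^'"]+)['"]""")


def spec_keys_match(row_html: str) -> bool:
    """True if the row has both macros and they name the same spec key."""
    specname = SPECNAME_KEY.search(row_html)
    spec2 = SPEC2_KEY.search(row_html)
    if specname is None or spec2 is None:
        return False
    return specname.group(1) == spec2.group(1)


good_row = (
    "<td>{{SpecName('HTML WHATWG', 'grouping-content.html#the-p-element', '<p>')}}</td>"
    "<td>{{Spec2('HTML WHATWG')}}</td>"
)
bad_row = "<td>{{SpecName('HTML WHATWG')}}</td><td>{{Spec2('HTML5 W3C')}}</td>"
print(spec_keys_match(good_row), spec_keys_match(bad_row))
```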

specname_not_kumascript

Error template:
Specification name unknown, and should be converted to KumaScript
Expected KumaScript {{SpecName(key, subpath, name)}}, but got text "{name}".

  • Update the MDN page to use {{SpecName(key, subpath, name)}}

tag_dropped

This tends to occur when something unexpected appears in one or more of the browser compat table cells, such as a <code> element, or link. If you need to include anything unusual like a link to more information, put it in a footnote, as seen in Web Workers API.

unexpected_attribute

Error template:
Unexpected attribute {tag}
For <p>, the importer expects no attributes. This unexpected attribute will be discarded.

Possible solutions:

  • This is caused by an attribute on an HTML tag that is not expected to have it, e.g. when a <p> tag has an id attribute or there is an unnecessary style attribute. In those cases you can simply remove the attribute from the tag. If you feel the attribute is valid there, file a new bug against the importer and mark it as a blocker for bug 1132269.

unexpected_kumascript

Error template:
KumaScript {kumascript} was not expected in {section}.
The KumaScript {kumascript} appears in a {section}, but is only expected in {expected_sections}. File a bug, or convert the MDN page to not use this KumaScript macro here.

  • If the error occurs in a compatibility support cell, the usual solution is to convert the text into a footnote.
  • If a {{Compat*}} macro is used in a footnote, this is often because multiple versions are being described in a footnote. Instead, convert the support to cover multiple versions (see the notes for multiple footnotes)

unknown_browser

Error template:
Unknown Browser "{name}".
The API does not have a browser with the name "{name}". This could be a typo on the MDN page, or the browser needs to be added to the API.

  • At this stage, it is mostly due to typos on the MDN page. On the MDN page, change the browser name to one used on other pages on MDN.

unknown_kumascript

Error template:
Unknown KumaScript {display} in {scope}.
The importer has to run custom code to import KumaScript, and it hasn't been taught how to import {name} when it appears in a {scope}. File a bug, or convert the MDN page to not use this KumaScript macro.

Possible solutions:

  • This is caused by a macro unknown to the importer. This can happen within the compatibility hints or the cells of the compatibility table. Current known issues:
    • unimplemented_inline or an unimplemented_inline_webkit macro is put next to the version number within a compatibility cell. This macro should be moved to a footnote and replaced by a bug macro or webkitbug macro respectively.
    • SVGRef is put at the bottom of the page and is considered part of the footnotes. This macro should be moved to the top of the page.
  • If you feel your issue is not covered by the known issues above and the importer should be able to handle the macro, file a new bug against it and mark it as a blocker for bug 1132269; otherwise, remove the macro.

unknown_spec

Error template:
Unknown Specification "{key}".
The API does not have a specification with mdn_key "{key}". This could be a typo on the MDN page, or the specification needs to be added to the API.

  • Look at the SpecName template for a close misspelling. Change the MDN page to match SpecName
  • If the entry appears in SpecName, then try a Reset to re-import the page. If that doesn't work, then SpecName changed recently and the API needs to be refreshed.

unknown_version

Error template:
Unknown version "{version}" for browser "{browser_name}"
The API does not have a version "{version}" for browser "{browser_name} (id {browser_id}, slug "{browser_slug}"). This could be a typo on the MDN page, or the version needs to be added to the API.

  • Try a Reset - the version might have been added since the page was last imported.
  • If not, ask in the #mdn IRC channel whether it is a valid version.