CloudServices/Sync/FxSync/Syncorro

From MozillaWiki

Socorro + Sync = Syncorro \o/

People

  • Client engineering: Marina Samuel, Philipp von Weitershausen
  • Server engineering: XXX
  • Metrics: Daniel Einspanjer, Xavier Stevens
  • Product: Jennifer Arguello

Goals

  • Gather statistics on errors (to help with prioritization)
  • Be able to correlate errors with maintenance windows, user profiles, etc.
  • Simplify error reporting for users who file bugs or SUMO articles
  • Detect the "long tail" of problems that are never filed

Features

  • Each submitted report should be represented by a URL or at least an opaque token (e.g. UUID)
  • Ability to query according to application, Sync, and error specific metadata
  • Fulltext search over submitted log data
  • Ability to return instructions to client upon report submission (e.g. throttling, recovery, support messages for the user, etc.)

Roadmap

  • Discuss goals and features with metrics (DONE)
  • Discuss UI mockups with UX
  • Add ability to upload Syncorro data to ElasticSearch (see bug 673318)
  • Build add-on for the Services Beta Channel

Client UX

Submitting an error report

  • When there's a Sync error, the usual error bar is shown (except we're not showing the dreadful Unknown Error message):
 -------------------------------------------------------------------------
 | We're sorry, Sync encountered a problem [Details ...]              (X)|
 -------------------------------------------------------------------------
  • Clicking the Details button dismisses the bar and opens a tab with a high-level explanation of the error:
 ------------------------------------------------------------------------
 | Details for Sync problem on Tuesday, May 1, 2011 5:59 pm             |
 | ========================================================             |
 |                                                                      |
 | There was a problem saving the "BBC News - World" bookmark to your   |
 | computer. Other data is not affected.                                |
 |                                                                      |
 | To help Mozilla improve Sync and prevent errors like this in the     |
 | future, please submit this report. Your personal data will not be    |
 | submitted.                                                           |
 |                                                                      |
 | [X] Automatically submit reports in the future.                      |
 |                                                                      |
 |                                                     [Submit report]  |
 |                                                                      |
 | > Full report                                                        |
 |                                                                      |
 ------------------------------------------------------------------------
  • Pressing the Submit report button will submit the report. Once the report is submitted, a link to the report on the server is displayed:
 ------------------------------------------------------------------------
 | Details for Sync problem on Tuesday, May 1, 2011 5:59 pm             |
 | ========================================================             |
 |                                                                      |
 | There was a problem saving the "BBC News - World" bookmark to your   |
 | computer. Other data is not affected.                                |
 |                                                                      |
 | Firefox submitted a report of the problem to Mozilla.                |
 |                                                                      |
 | [X] Automatically submit reports in the future.                      |
 |                                                                      |
 | > Full report                                                        |
 |                                                                      |
 ------------------------------------------------------------------------
  • If the Syncorro server finds a suitable support page, the details page links to it:
 ------------------------------------------------------------------------
 | Details for Sync problem on Tuesday, May 1, 2011 5:59 pm             |
 | ========================================================             |
 |                                                                      |
 | There was a problem saving the "BBC News - World" bookmark to your   |
 | computer. Other data is not affected.                                |
 |                                                                      |
 | Good news! Firefox submitted a report of the problem to Mozilla and  |
 | a possible solution was found. _View_support_page_                   |
 |                                                                      |
 | [X] Automatically submit reports in the future.                      |
 |                                                                      |
 | > Full report                                                        |
 |                                                                      |
 ------------------------------------------------------------------------
  • Clicking the arrow next to Full report shows all the information that will be or was submitted. Since there's typically a lot of it, it is divided into separate collapsible sections:
 ------------------------------------------------------------------------
 | Details for Sync problem on Tuesday, May 1, 2011 5:59 pm             |
 | ========================================================             |
 |                                                                      |
 | There was a problem saving the "BBC News - World" bookmark to your   |
 | computer. Other data is not affected.                                |
 |                                                                      |
 | Firefox submitted a report of the problem to Mozilla.                |
 |                                                                      |
 | [X] Automatically submit reports in the future.                      |
 |                                                                      |
 | \/ Full report                                                       |
 |                                                                      |
 |    Report ID: {UUID}  [Copy to clipboard]                            |
 |                                                                      |
 |    > Application details                                             |
 |    > Sync account info                                               |
 |    > Error fingerprint                                               |
 |    > Log                                                             |
 |                                                                      |
 ------------------------------------------------------------------------

Looking up error reports

  • Make about:sync-log look like about:crashes, with each entry linking to the details page described in the previous section.

Client Implementation

Note: This is only a draft that is being fleshed out.

  • Use the Metrics team's ElasticSearch system (also used for AMO stats and Socorro) at data.mozilla.org
  • On error, Sync POSTs a payload to data.mozilla.org:
 POST /XXX HTTP/1.1
 Content-Type: application/json
 
 {
   "id": "{UUID}",
   "app": {
     "product": "{UUID}",
     "version": "8.0a1",
     "buildID": "...",
     "locale": "en-US",
     "addons": ["{UUID}", "{UUID}", "{UUID}", ...]
   },
   "sync": {
     "version": "1.10",
     "account": "eisklclxuauemrjghidis",
     "cluster": "https://phx-sync091.services.mozilla.com/",
     "engines": ["bookmarks", "history", ...],
     "numClients": 2,
     "mobileClients": true
   },
   "error": {
     "localTimestamp": 13294938593,
     "engine": "bookmarks",
     "result": 489294595  // the error constant, if applicable
   },
   "log": "..."
 }
  • Under normal conditions, the server returns HTTP 200 OK with optional hints for the client concerning throttling and help for the user:
 HTTP/1.1 200 OK
 Content-Type: application/json
 
 {
   "reportURL": "http://data.mozilla.org/syncorro/{UUID}",
   "throttle": 10,  // only submit every 10th error
   "infoURL": "http://support.mozilla.com/..."  // optional support page
 }
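
The hints in the 200 OK response could be consumed on the client roughly as follows. This is only a sketch: the helper names parseHints and shouldSubmit are hypothetical, and reading "only submit every 10th error" as "submit when the running error count is a multiple of the throttle value" is an assumption, since the draft does not define the counting rule.

```javascript
// Hypothetical helper: extracts the optional hints from the JSON body
// of a 200 OK response. Missing or malformed hints fall back to defaults.
function parseHints(body) {
  const data = JSON.parse(body);
  return {
    reportURL: typeof data.reportURL === "string" ? data.reportURL : null,
    // Assumption: no throttle hint means "submit every error".
    throttle: Number.isInteger(data.throttle) && data.throttle > 0 ? data.throttle : 1,
    infoURL: typeof data.infoURL === "string" ? data.infoURL : null,
  };
}

// Hypothetical helper: decides whether the error with the given running
// count (1, 2, 3, ...) should be submitted under the current throttle hint.
function shouldSubmit(errorCount, throttle) {
  if (!Number.isInteger(throttle) || throttle <= 1) {
    return true;
  }
  // Assumed interpretation: submit the 10th, 20th, 30th, ... error.
  return errorCount % throttle === 0;
}
```

A client would keep the last received throttle value around and call shouldSubmit before each upload. Whether the counter resets after a successful submission is left open here, as it is in the draft.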
  • The server can also return other status codes to indicate that the report was not accepted:
    • 500 Server Error
    • XXX throttled, try again later
    • XXX invalid data
  • If the client fails to upload the report (e.g. because of network connectivity problems or similar), it retries periodically using a backoff strategy. After some number of failures, the upload fails permanently and no further retries are attempted.
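
The retry behaviour could look something like the sketch below. The draft does not specify the backoff schedule or the failure limit, so the exponential doubling, the constants, and the names nextRetryDelay and uploadWithRetry are all assumptions made for illustration. The submit and sleep functions are passed in so that the logic does not depend on any particular network or timer API.

```javascript
// All constants are assumptions; the draft does not specify them.
const BASE_DELAY_MS = 60 * 1000;      // first retry after one minute
const MAX_DELAY_MS = 60 * 60 * 1000;  // never wait longer than one hour
const MAX_ATTEMPTS = 5;               // give up after this many failures

// Returns the delay before the next retry, given how many attempts have
// failed so far, or null once the upload should be abandoned for good.
function nextRetryDelay(failedAttempts) {
  if (failedAttempts >= MAX_ATTEMPTS) {
    return null;
  }
  // Exponential backoff: 1 min, 2 min, 4 min, ... capped at MAX_DELAY_MS.
  const delay = BASE_DELAY_MS * Math.pow(2, failedAttempts - 1);
  return Math.min(delay, MAX_DELAY_MS);
}

// submit: async function resolving to true on success, false on rejection.
// sleep:  async function taking a delay in milliseconds.
// Resolves to true if the report was uploaded, false if it failed permanently.
async function uploadWithRetry(submit, sleep) {
  let failures = 0;
  for (;;) {
    try {
      if (await submit()) {
        return true;
      }
    } catch (e) {
      // Network errors count as a failed attempt.
    }
    failures++;
    const delay = nextRetryDelay(failures);
    if (delay === null) {
      return false;
    }
    await sleep(delay);
  }
}
```

With these assumed constants the client waits 1, 2, 4, and 8 minutes after the first four failures and gives up on the fifth.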

Dashboard implementation

  • Graph of number of reports over time (potentially being able to split by certain metadata, e.g. product version, Sync node, etc.)
  • Query by metadata
  • Fulltext search over logs
  • Define SUMO pages for percolator matches
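
As a rough illustration of the first bullet, report counts could be bucketed over time and split by a metadata field as sketched below. The function name countReports is hypothetical, the reports are assumed to have the shape of the client payload above, and error.localTimestamp is assumed to be in milliseconds (the draft does not state the unit). A real dashboard would presumably push this aggregation to the search backend rather than do it in memory.

```javascript
// Hypothetical helper: counts reports per time bucket, split by a key.
// reports:  array of objects shaped like the client payload
// bucketMs: bucket width in milliseconds (e.g. one day)
// keyFn:    function returning the metadata value to split by
// Returns an object of the form { bucketStart: { key: count } }.
function countReports(reports, bucketMs, keyFn) {
  const counts = {};
  for (const report of reports) {
    // Round the timestamp down to the start of its bucket.
    const bucketStart =
      Math.floor(report.error.localTimestamp / bucketMs) * bucketMs;
    const key = keyFn(report);
    if (!counts[bucketStart]) {
      counts[bucketStart] = {};
    }
    counts[bucketStart][key] = (counts[bucketStart][key] || 0) + 1;
  }
  return counts;
}
```

For example, countReports(reports, 24 * 60 * 60 * 1000, r => r.sync.cluster) would give daily counts per Sync node.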

TODO details (talk to ddash, jbalogh)

Questions

  • Reports will probably have to be non-public for now, though it would be nice if users could view their own submitted reports... can we do some sort of token-based auth there?
  • Will this service require ToS changes?
  • What do we do with custom server users?
  • What do we do when the user has Trace logging enabled?

Discussion

Tentatively identified as not in scope for v1

  • Ops paging/integration for events. A large spike in failures could be either a new client error or a server or operational issue, and that's info that we might want to leverage. Best to leave this until we know what we're doing.