PostCrash: Difference between revisions

4,358 bytes added ,  5 July 2011
{{DRAFT}}
The goal of this project is to engage with users who provided an e-mail address in the Crash Reporter by
* Providing relevant info about their crash as well as general info about how to avoid crashes in Firefox.
*Letting them know when their crash has been fixed in Firefox.


The goal of this project is to improve the user post-crash experience by:
The project will require changes to SUMO and Socorro in order to supply the required functionality. For the secondary goal, Session Restore and/or about:crashes need to be changed, too.
* Developing about:crashes into a more useful resource
* Contacting the user by email after a crash is resolved


The project will require changes to about:crashes, SUMO, and Socorro in order to supply the required functionality.
= Engage with users after Firefox crashes =


= about:crashes =
We should engage with users via e-mail in the following situations:
We want to make about:crashes more useful.  Ideally, for each crash we would display:
* After submitting the crash report to provide info about the crash, including relevant support documentation links for the crash signature and generic support documentation for how to avoid crashes in general.
* the crash id
* When new support documentation is available for the particular crash signature (if it didn't exist when the report was submitted).
* the crash signature (source: Socorro)
* After a bug has been fixed and shipped in a Firefox release, to inform the user that the bug is fixed in Firefox version x.y.z. This will also work as a powerful way of retaining users, including getting users back who may have switched to another browser between the crash report and the fix.
* associated bugzilla ids (source: Socorro) and status of these bugs (source: Bugzilla or Socorro).
* If a bug is resolved fixed, show the version it was fixed in.
* if a crash is associated with a plugin, link to plugincheck (source: Socorro)
* If a canonical SUMO article exists, a link to that article, otherwise a link to SUMO search for that signature (as is currently done in Socorro)


The client should obtain the extra information from Socorro and SUMO via XHR.  This data should be stored locally once final.  Final data includes:
== When submitting the crash report (Q1 2011) ==
* crash signature
* bugzilla ids for RESOLVED bugs
* plugin association
* canonical SUMO links


Data that should be polled each time about:crashes is loaded includes:
This e-mail will be automated:
* Status of bugzilla bugs that were unresolved last time about:crashes was loaded
# A user submits a crash report and provides an e-mail address.
* Search for a canonical SUMO article where one previously didn't exist
# Socorro generates the signature for the crash and queries the SUMO search to see if there's an associated support article for this particular crash.
#* If an article exists, the title and URL of that article are fetched and used in the e-mail.
#* If an article does not exist, the report is flagged accordingly so that Socorro can revisit the report when new info is available.
# An e-mail is sent out pointing the user to the support article (if one exists). It also includes general information about how to avoid crashes (linking to a generic article on SUMO listing ways to keep Firefox, plug-ins and add-ons up to date) and where to get more info about the crash itself (linking to the crash report on Socorro).


In order to minimize load on Socorro and SUMO, we will by default only show an expanded view for the most recent crash. Users may individually expand other crashes (something like "check crash status" on each previous crash).
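The split between final data (stored locally) and data that is polled on each load could be sketched roughly as follows. This is an illustration only, written in Python rather than as client code; the <code>CrashEntry</code> record, its field names and <code>fields_to_poll</code> are hypothetical and not part of any existing about:crashes implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CrashEntry:
    """Hypothetical local record for one crash listed in about:crashes."""
    crash_id: str
    signature: Optional[str] = None   # final once known
    plugin: Optional[str] = None      # final once known
    sumo_url: Optional[str] = None    # final once a canonical article exists
    bug_status: Dict[int, str] = field(default_factory=dict)  # bug id -> status


def fields_to_poll(entry: CrashEntry) -> List[str]:
    """Return the pieces of data that still need a server request."""
    if entry.signature is None:
        # Nothing has been fetched for this crash yet.
        return ["signature", "plugin", "bug_status", "sumo_url"]
    needed = []
    # Only bugs that were unresolved last time need a status refresh.
    if any(status != "RESOLVED" for status in entry.bug_status.values()):
        needed.append("bug_status")
    # Keep searching SUMO until a canonical article turns up.
    if entry.sumo_url is None:
        needed.append("sumo_url")
    return needed
```

An entry whose bugs are all RESOLVED and which already has a canonical SUMO link yields an empty list, so no request is made for it.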
Generally, the e-mail should:
* Thank the user for submitting the report and possibly describe what's going to happen to set the right expectations.
* Inform the user that we will contact them again if there's more info about the bug.
* Make it clear that it's not possible to reply to the e-mail and that they should go to support.mozilla.com for help.


Question:
Issues with regard to authenticity of email / phishing can be addressed by encouraging users to "visit mozilla.com and click on blah for more information" (taking a lead from Paypal/eBay here).    We should make sure that emails come from a mozilla.com address and use whatever we have in place for email authentication.
* Should we show both hangs and crashes? Users may load the page and see nothing but hangs. Does it make more sense to (perhaps by default) just show crashes?
 
== When new support documentation is available (Q4 2010/Q1 2011) ==
 
This e-mail will be automated:
# A cron job would check each signature that lacks an associated SUMO article to see if new info is available.
# If a new article is detected, all crash reports with that signature and an e-mail address will be queued up for another e-mail message including the new information.
 
Generally, this e-mail should make the new information prominent, but otherwise retain the language and key messages of the initial e-mail.
 
== When a bug has been fixed in a shipped Firefox release (Q4/October) ==
 
This would in part be a manual workflow:
# When a release is planned generate a list of fixed bugs (from Bugzilla)
# Provide facility in Socorro to search by Bugzilla bug for a list of associated crashes
# For each associated crash, generate a list of user submitted emails (minus any users who have said they don't want to be emailed again) - this function should be available only from the admin interface
# Mark those emails as contacted for that given crash, so we don't contact them again.
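Steps 3 and 4 of the workflow above could look roughly like this. It is a minimal sketch in Python, assuming crashes arrive as (crash id, e-mail) pairs and that the opt-out list and the contact log are plain sets; <code>select_recipients</code> and these data shapes are hypothetical and do not reflect the actual Socorro schema.

```python
from typing import Iterable, List, Set, Tuple


def select_recipients(
    crashes: Iterable[Tuple[str, str]],
    unsubscribed: Set[str],
    already_contacted: Set[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Pick the (crash_id, email) pairs that should receive a mailing.

    `already_contacted` is updated in place so that a second run for the
    same crash does not select the same address again.
    """
    selected = []
    for crash_id, email in crashes:
        address = email.strip().lower()
        if not address or address in unsubscribed:
            continue  # no address given, or the user opted out
        key = (crash_id, address)
        if key in already_contacted:
            continue  # already mailed for this crash
        selected.append(key)
        already_contacted.add(key)
    return selected
```

Running the function twice with the same contact log returns an empty list the second time, which is the behaviour step 4 asks for.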
 
=Technical requirements=
== Changes to Socorro==
UI changes consist of three basic web pages:
* Web page to allow email setup. User would input:
** Crash signature
** Start and end dates
** Email subject
** Email body (text only for this iteration)
** Hitting submit would take you to the preview page
* Preview page will show the user how many emails will be sent that match their criteria and have a confirm button that will submit the email for sending
* Previous mailings that have been sent, just a list in reverse date order.  Users should be able to click through to see the text that was sent.
 
Back end changes:
* New database tables:
** Mailings
** Users x mailings
** Users unsub (user email) - users not to email
* Email injection code
 
Full details can be found at [http://code.google.com/p/socorro/wiki/CrashFixedEmailDesign http://code.google.com/p/socorro/wiki/CrashFixedEmailDesign]


== Changes to SUMO ==
* Add a webservice call that, given a signature and locale, searches for relevant content on SUMO and returns a URI. The web service should work as follows:
*# If a canonical article exists for crash_signature (top search result), link directly to it, and display the title of the article
*# If a canonical article exists for crash_signature (top search result), return the title and URL
*# Else if search results exist for crash_signature, show search results
*# Else if other search results exist for crash_signature (e.g. forum threads), return URL to search results
*# Else link directly to a single generic crash article
* Should be available sometime the week of March 14th.
** See [https://bugzilla.mozilla.org/show_bug.cgi?id=631341 bug 631341].
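The fallback order described for the SUMO web service can be sketched as below. This is a Python illustration under stated assumptions, not the SUMO implementation: <code>resolve_support_link</code>, its parameters and the returned dictionary shape are all hypothetical, and the search and generic article URLs are passed in by the caller rather than assumed.

```python
from typing import Dict, List, Optional, Tuple


def resolve_support_link(
    canonical: Optional[Tuple[str, str]],
    other_results: List[str],
    search_url: str,
    generic_url: str,
) -> Dict[str, Optional[str]]:
    """Choose what to return for a crash signature.

    canonical     -- (title, url) of the canonical article, or None
    other_results -- any other search hits (e.g. forum threads)
    search_url    -- URL of the search results page for the signature
    generic_url   -- URL of the single generic crash article
    """
    if canonical is not None:
        title, url = canonical
        return {"title": title, "url": url}
    if other_results:
        return {"title": None, "url": search_url}
    return {"title": None, "url": generic_url}
```

Only the canonical case carries a title, matching the requirement that the title and URL are returned together when a canonical article exists.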


== Changes to Socorro ==
=== API ===
* Add a webservice call.  Given crash id, return signature, associated plugin, associated bugzilla id and bug status.


= User email contact =
Will be available as of SUMO's 2.6.2 release. Scheduled for 15 March 2011.


After discussion (see chofmann's comments below for background), we think the best way to solve this is with a manual process rather than an automated one, something like the following:
GET /postcrash?s=<signature>


* During the release process, a Support team member should collect a set of bugs fixed in the release and matching crash signatures.
Return values:
* Socorro should provide the ability for an admin user to email the set of users who have submitted crashes matching a particular signature.
* Support team email would be something along the lines of:
"On date X/Y/Z you submitted a crash report to Mozilla.  Crashes with this signature were addressed in bug xyz which has been resolved in version A.B.C."


Issues with regard to authenticity of email / phishing can be addressed by encouraging users to "visit mozilla.com and click on blah for more information" (taking a lead from Paypal/eBay here).    We should in general make sure that emails come from a mozilla.com address and use whatever we have in place for email authentication.
HTTP/1.0 200 OK
Content-type: text/plain;charset=utf8


==Questions==
http://support.mozilla.com/...
 
or
 
HTTP/1.0 404 Not Found
 
A request without an <code>s</code> parameter will result in
 
HTTP/1.0 400 Bad Request
 
Responses may set caching headers. Conditional GET is encouraged. Consumers should wait before requesting again in the event of a 5xx error.
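A consumer of this call might handle the documented status codes roughly as follows. This is a minimal sketch in Python; <code>postcrash_url</code>, <code>interpret_response</code> and <code>RetryLater</code> are hypothetical names, and the base URL is supplied by the caller. The sketch leaves out the actual HTTP request, caching headers and conditional GET.

```python
from typing import Optional
from urllib.parse import urlencode


class RetryLater(Exception):
    """Raised on a 5xx response; the caller should wait before retrying."""


def postcrash_url(base: str, signature: str) -> str:
    """Build the request URL, percent-encoding the signature."""
    return base.rstrip("/") + "/postcrash?" + urlencode({"s": signature})


def interpret_response(status: int, body: str) -> Optional[str]:
    """Map a response to a support URL, or None if nothing was found."""
    if status == 200:
        return body.strip()  # text/plain body containing the URL
    if status == 404:
        return None
    if status == 400:
        raise ValueError("request was sent without the s parameter")
    if 500 <= status < 600:
        raise RetryLater(f"server error {status}")
    raise RuntimeError(f"unexpected status {status}")
```

Signatures often contain characters such as <code>:</code> or <code>@</code>, which is why the sketch encodes the parameter instead of concatenating it into the URL directly.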
 
=Timeline=
* <strike>Monday Sept 13: Meeting with QA to establish QA/Support workflow</strike>
* <strike>October: Socorro 1.7.4 freeze (Socorro)</strike>
* <strike>October: Socorro 1.7.4 push (Socorro)</strike>
* <strike>October 22: Finalize e-mail template to send out for fixed bugs (Michael/David)</strike>
* <strike>November 22: Finalize e-mail for [Russian crash] (David)</strike>
* <strike>November 22: Finalize support e-mail FAQ and support article for Russian crash (David)</strike>
* <strike>November 24: Have e-mail + support article + FAQ localized into Russian (Alexander)</strike>
* <strike>November 30: Test e-mail campaign to a select number of people to ensure that the e-mail encoding is correct. This can be done on stage. (Laura/IT)</strike>
* <strike>December 14: Push out Socorro 1.7.5.4 which adds support for UTF-8 encoded e-mails (Laura/Socorro/IT)</strike>
* <strike>December 15: Send manual e-mail to users experiencing the Russian crash and use this as a basis for establishing additional requirements for Phase 2 in Q1 2011. (Michael/David/Austin/Christian)</strike>
* <strike>December 17: Follow up with Daniel to determine hit rate (David)</strike>
* <strike>December 17: Send out e-mail to the remaining Russian users</strike>
* <strike>December 23: Finalize PRD to include requirements for Phase 2 in Q1 2011 (David)</strike>
* Ongoing - Next Firefox 3.5.x/3.6.x release: Compile a list of actual crash bugs fixed (Christian)
* Ongoing - Next Firefox 3.5.x/3.6.x release + 7 days: Send out first "crash fixed" e-mails (Michael/David/Christian)
 
=Future requirements=
* Ability to define a campaign with multiple crash signatures, each with its own set of filtering restrictions (product, versions, etc)
* Ability to wildcard Firefox versions (e.g. "3.5.*,3.6.*")
* Ability to specify whether a campaign is for a fixed or ongoing crash issue (which determines whether this is the final e-mail campaign for those users, or if we are allowed to send additional information in the future)
* <strike>Asynchronous sending of the e-mail -- currently you have to select a really small date range because otherwise the system times out. If this could be changed to e.g. a backend query/job, sending e-mail in a large campaign would be much more straightforward.</strike>
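
The version wildcard requirement above could be met with simple shell-style pattern matching. The following Python sketch uses the standard <code>fnmatch</code> module; the function name <code>version_matches</code> and the comma-separated pattern format are assumptions based on the example in the list, not an existing Socorro feature.

```python
from fnmatch import fnmatchcase


def version_matches(version: str, patterns: str) -> bool:
    """Return True if `version` matches any comma-separated wildcard pattern."""
    return any(
        fnmatchcase(version, pattern.strip())
        for pattern in patterns.split(",")
        if pattern.strip()
    )
```

With the patterns <code>"3.5.*,3.6.*"</code>, a version such as 3.6.13 matches while 4.0 does not. A bare 3.5 does not match <code>3.5.*</code> either, so a campaign that should include the x.y release itself would need an extra pattern.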
 
=Questions/Discussion=
* Some crashes have more than one associated bug.  Should we email the user each time a bug is fixed or when all the bugs are fixed (or some other model)?
** the period of when bugs get fixed to when the crash happens might be very long. bugs being fixed on the trunk now might not be available in a final release until end of q3 or q4.  It doesn't make much sense to e-mail users now about fixes that won't appear in a final firefox 4.0 until then, or e-mail them at the end of the year about that crash they had 6 months ago. A single signature with many bugs makes it nearly impossible to correlate if any one of the bugs might have fixed the specific bug that the user just crashed on.  we would need to start storing the full stack trace(s) of the bugs that we have fixed and map them to the full stack trace that the user experienced.  -chofmann
** Do repeated crashes mean repeated e-mails to the users with the same message, or do we track so that we don't keep telling users there is no fix for their crash yet?  if we don't track it the e-mails start to look like spam. --chofmann


* we could be over-engineering things here.  the system that we had with talkback served a good purpose.  when we found a specific crash that we had something specific to tell users about,  we constructed the message, then we queued up the system to watch for that signature and send the message.  This was for a very small number of the total overall crashes, but had useful information as opposed to just auto-responding. e.g. "from your recent crash we detected that you need to upgrade to Skype X.X to fix the crash."
** For the record, this is pretty much exactly what this project is about: reaching out to users when we actually have something meaningful to say (like "your crash has been fixed").  
At any particular point in time we might have something this specific to say in an extremely low pct. of cases.  Numbers below indicate this might be lower than 1% of the time a user reports a crash to us.


* sending e-mails with technical instructions on how to fix or mitigate crashes, and especially instructions that involve downloading and installing software, is open to impersonation and phishing. we need to build in ways for users to authenticate that the message came from mozilla and that the same instructions are available on an "official" mozilla hosted website.
* <bsmedberg> We need to be very careful with the about:crashes changes in order to preserve user privacy. The webservice calls have the ability to correlate a user with multiple crash reports: we probably should not perform them automatically, and should probably avoid sending any cookies over the wire when we do make the requests.


=== some numbers ===
</pre>
</pre>


=Mockups=
* The 3.6% might be heavily affected by the fact that the checkbox is unchecked by default. I.e. an insensible default. Can we change that?
 
==Early idea on a revamped about:crashes page==
[[File:About-crash.png|600px]]
* The latest crash automatically queries SUMO and Socorro for status info.
* Older crashes might not query the servers automatically to reduce server load.
* To reduce information overload, the crash ID isn't shown, but clicking on the signature takes you to the full report on Socorro.
 
Some additional thoughts/ideas:
* Would be even better if the page could query SUMO for the full article title and show that instead of "View solution" (the SUMO service should also support l10n to make sure that a localized article title appears if it exists).
* The latest crash could be more prominently highlighted -- perhaps even have a separate section above the "Recent Crash Reports".
* The generic support link at the bottom could be expanded into its own section. The target link could take you to a start page specializing on crash/hang problems, with prominent links for how to upgrade plugins, etc.