Scribed by Dennis
Prototype of the State of WebCompat Report (Honza)
- First milestone, [prototype of the report](https://docs.google.com/document/d/1COtINL7ZGqKJI74y6MIwjaLghnd1JrFeDYcYKeF59Nc/edit#)
- Based on State of CSS 2021, KB and QA trends
- What have we learned so far?
- Honza: We picked individual top 5 and compiled them into a single top 5 list. Let's go over it.
- Dennis: Regarding missing support for window.print: the score from the automatic algorithm is low, but we feel this is a critical issue. One problem is that we track those as non-critical single-function breakage, but in some cases the missing feature is more than a minor annoyance; printing a boarding pass, for example, could be critical. We may need to distinguish minor feature breakage from critical feature breakage.
- Honza: We had similar discussions about printing on desktop - which may also be more important than we think.
- James: With the algorithm that we have, we see the breakage as "a broken feature", but we reduce the score because we assume that only a small subset of users actually want to print, so the score is low. On the other hand, we get a lot of reports about this, so it may be more important than the score suggests. I agree that some cases, like printing a boarding pass, can be critical breakage for users. It's hard to know whether the number of duplicate reports is a strong signal. Currently, each duplicate report adds only a single point to the score, but if we're getting a lot of duplicates, we may be underestimating the issue. We currently don't rank issues by their number of duplicates.
- James: One thing we could do is to have a distinction between "critical feature breakage" and "minor feature breakage". For window.print specifically, we should look into how many people actually want to print. Maybe we have telemetry data for that on Desktop, or maybe Chrome has data?
- James: In cases where we feel like the score doesn't reflect our feelings on how important an issue is, we should take this as a signal that we need to dig into that and search for some evidence for our feelings and to validate that the issue is indeed more important than the score says.
- James: At the end of the report, we might document how we score the issue. If our actual ranking disagrees with the score, we need some evidence to back up our decision to rank issues higher than the score says.
- James: (about the interventions note in the document) An interesting case is the SVG size issues. Some instances are minor, but others completely break a site. If we notice a site completely breaking, we immediately ship an intervention. If we ship interventions for all critical breakage, the score drops a lot because we have "fixed" the issue. But we don't know the long tail of issues, and we don't know how many broken sites are still out there. This is another instance where the number of duplicates, or the number of interventions shipped, might be a stronger signal. Maybe we can find a pattern that we can detect and look at telemetry data to estimate how frequently users run into that scenario. Maybe, if we ship a lot of interventions for an issue, we should bump the priority.
- Joe: It would be interesting to have a Knowledge Base entry about different User Agent strings. Those are a cause of webcompat breakage, even though it works "as designed".
- Dennis: Does this just mean "Firefox has a different UA string to other browsers"?
- Joe: Yes.
- James: At least in the scoring logic, we have a score for "Firefox is blocked by the site". We don't have a dedicated KB entry for that. Prioritizing that is hard. If we had a metabug for all UA-based blocks, that bug would probably be one of the highest-ranked webcompat issues, just based on the number of breakage reports related to it. But the individual reasons for each breakage might be very different. It might be worth tracking all UA-related issues together, but that groups several probably unrelated issues into one. We could have an entry for "the site is blocking us, but we don't know why".
- Honza: Are there other examples of issues like that?
- Dennis: We've talked about printing yesterday.
- James: Yeah, that's similar. We can't specifically diagnose every single printing issue at the moment; we only see a bunch of printing issues.
- Honza: Couldn't this be a Risk or a Trend?
- James: It probably shouldn't be one of the Top n issues, but it could be a trend. We might be able to use telemetry data to estimate how many sessions might be affected.
- Joe: I think that's fair. UA strings are not a webcompat issue, but we could track it as such and find evidence for how many people are affected.
- James: We might have enough data to work out how many sessions are affected by UA blocks etc.
- Joe: Somewhat related: I read through the KB entries and noticed there is a field relating an entry to a spec issue or discussion. It might be worth differentiating between "we don't follow the spec", "others don't follow the spec", and "there is no spec". It would be useful to be able to say that x% of webcompat issues are caused by things not covered by a spec, y% are caused by Chrome not following the spec, ...
- James: Yes. This might be the sort of thing that we can use for Interop 2023 etc. Sometimes, the answer is "we don't know", but we sometimes can infer that just based on the fact that a Chrome bug exists etc. Some cases are very obvious, like CSS zoom.
- Joe: What's the state of that report? Are we happy with it? Do we want to make more changes? How much more time do you want to spend on it?
- Honza: This is what we have at this point. Whatever improvement we do to it will be the next version. The valuable outcome was the process of creating this document, figuring out how to identify the top issues, ...
- James: (about the Risk section) We've discussed this in a previous meeting, and it hasn't significantly changed. This is primarily based on the work that bgrins did with the caniuse data, looking at things that are about to ship in other browsers and for which we don't have an implementation. We had some advance warning about those things from other vendors wanting to include them in Interop 2022. In the future, there might be other sources for this as well.
- Honza: I added a note about other information sources we can use.
- James: There seems to be a lot of overlap with the work bgrins is doing, and we should make sure we don't duplicate information.
- Honza: I had a meeting with him about that yesterday. He's happy to collaborate.
- Joe: It may make sense for that work to be owned by the WebCompat team. It's worth noting that we're writing this document for the end of the year, and by then, Firefox might already be on its way to implementing those features. We might want to publish the Risks more frequently.
- James: Yes. Those specific examples have already been identified by other teams, so I'm hopeful they will be shipping by the end of the year, though it's hard to tell based on the information we have. However, assuming Interop 2023 starts, we might have that as a source for future risks. We should coordinate the report frequency with the timespan over which other teams plan their future work.
- Joe: While that makes sense, there might be some urgency to getting the risks out earlier. It would be a shame if we wait 5 months to tell others about risks we identified.
- James: At least for those two examples, there was a discussion on Slack about this as soon as we noticed other vendors are implementing it. But yes, we don't want to be in a situation where we're holding back that information. For things we identified as a high risk, we should raise them immediately.
- Joe: An addition to the report might be a list of APIs that we're tracking but aren't ready to pull the trigger on yet, because we don't know what other vendors think.
- Honza: (about the Trends section) This selection is coming from Oana and Raul's triage work.
- Oana: yes, those are things we noticed a lot of reports for, and we think it's affecting a lot of users. We can't always reproduce all issues, but they look important.
- James: A lot of these look like they could be top webcompat issues. We don't have KB entries for some of them yet. If we create KB entries, we probably shouldn't add the duplicate reports as individual breakage reports. I wonder if it's a good idea to add the duplicates.
- Dennis: You can't easily do this automatically, but GitHub shows a list of all linked/mentioned bugs; you could use that as a base.
- Tom: For some issues that are tracked on Bugzilla (like the Storage APIs), you could use the see-also field on Bugzilla and look at all the linked web-bugs.
- Oana: Not all duplicate reports get linked on Bugzilla. In the Imgur case, for example, some issues were closed by the bot, others were just closed as duplicates without being linked, ...
- Joe: The introduction of the section indicates that the trends are only for things we notice on large sites, but we could also see similar breakage across a large number of smaller sites.
- Oana: we do sometimes see that, like issues related to printing. Those are tracked in the trends OKR, but we only added the high-priority sites for now. You can find all trends at https://github.com/mozilla/webcompat-team-okrs/issues/262
- Honza: For identifying trends, there might be other teams we want to collaborate with. Product Management, for example. I'd also be interested in finding other data sources besides triage.
- James: A thing that's been mentioned is the kind of meta-knowledgebase entry, like "Firefox is blocked" or "generic issue with printing". The Trends section shouldn't be about individual platform bugs, but more generic groups of issues like "we're getting a lot of reports about broken sites in Korea, but we don't know why".
- Joe: The key part is "we noticed that", to highlight things that might not otherwise be captured in the report.
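The scoring ideas discussed above (minor vs. critical feature breakage, duplicates adding only one point each, shipped interventions masking an issue) can be sketched as a toy heuristic. Everything here is hypothetical illustration: the severity categories, base scores, log damping, and intervention bonus are not the team's actual algorithm.

```python
# Toy sketch of a duplicate-weighted, severity-tiered issue score.
# All categories and weights are made up for illustration.
import math

SEVERITY_BASE = {
    "minor_feature_breakage": 1.0,     # e.g. a cosmetic glitch
    "critical_feature_breakage": 5.0,  # e.g. cannot print a boarding pass
    "site_broken": 10.0,               # site unusable in Firefox
}

def issue_score(severity: str, duplicates: int,
                interventions_shipped: int = 0) -> float:
    """Score one issue: duplicates add diminishing weight instead of one
    flat point each, and shipped interventions bump priority instead of
    hiding the issue as "fixed"."""
    base = SEVERITY_BASE[severity]
    # Log-damped duplicate signal: many reports matter, but sub-linearly.
    dup_weight = 1.0 + math.log1p(duplicates)
    # Each shipped intervention hints the underlying pattern is widespread.
    intervention_bonus = 0.5 * interventions_shipped
    return base * dup_weight + intervention_bonus
```

With this shape, a critical issue with many duplicates outranks a minor one with the same duplicate count, and interventions raise rather than lower the ranking.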
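Tom's suggestion of walking the Bugzilla see-also field could look roughly like this. The endpoint shape follows the public bugzilla.mozilla.org REST API; the helper names are assumptions, and the fetch itself needs network access.

```python
# Hedged sketch: collect the web-bugs linked in a tracked Bugzilla bug's
# see_also field. Endpoint shape per the public Bugzilla REST API.
import json
from urllib.request import urlopen

BUGZILLA_REST = "https://bugzilla.mozilla.org/rest/bug/{id}?include_fields=see_also"

def extract_see_also(payload: dict) -> list:
    """Flatten the see_also URLs out of a Bugzilla REST response."""
    return [url
            for bug in payload.get("bugs", [])
            for url in bug.get("see_also", [])]

def fetch_see_also(bug_id: int) -> list:
    """Fetch one bug and return its see_also links (requires network)."""
    with urlopen(BUGZILLA_REST.format(id=bug_id)) as resp:
        return extract_see_also(json.load(resp))
```

As Oana notes above, this only finds duplicates that were actually linked, so it is a base for the count rather than a complete one.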
KB schema improvements (jgraham)
- James: While filling in the detailed breakage, it became clear there were things we didn't capture well. I collected a couple of ideas. We don't have time to discuss this now, but please go through the document and add comments. If you find other cases you don't know how to capture, add them to the list or file GitHub issues.
- James: Two more things came up in this meeting: considering the amount of duplicates, and keeping track of how critical feature breakage is.