Compatibility/Meetings/2022-10-25

Minutes

Scribed by James

Next week is: Fewer meetings week (Honza)

* Let’s cancel some meetings!

  • Honza: Happy to cancel 1:1, please ping me.
  • Dennis: Could cancel core bugs triage, but want to keep the web-bugs triage meeting.
  • Ksenia: What about this meeting? General agreement to cancel.

Re-test old issues with needsdiagnosis (Honza)

There are about [170 issues](https://github.com/webcompat/web-bugs/issues?q=milestone%3Aneedsdiagnosis+sort%3Aupdated-desc+-label%3Adiagnosis-priority-p1+-label%3Adiagnosis-priority-p2+-label%3Adiagnosis-priority-p3+-label%3Astatus-needsinfo-denschub+-label%3Astatus-needsinfo-wisniewskit+-label%3Astatus-needsinfo-ksy36+label%3Aengine-gecko+) in our backlog (mostly reported in 2021). Many of those might already be fixed or no longer reproducible. About 100 of them are anonymous, and perhaps those could be closed easily. Help from the QA team is needed.

  • Honza: Came up in the fast response triage. We are down to old issues in the backlog. Many issues might already be fixed. It would be good to check. But how much time would that take? We could start with the issues reported by a GH user. Do we know how long it's likely to take?
  • Raul: It depends on how many issues are incoming on a daily basis. We can make an OKR task to go through the backlog, but it might take some time since we also get around 30 new issues per day; if we don't make it, the work could slip to Q1 2023.
  • Dennis: No real urgency in looking at them since they're already a year old. It would just be good to know what proportion still reproduce.
  • Raul: I'll get the document ready for next week's meeting.

Autoassignments of web-bugs issues (Ksenia)

Wondering if we still want to auto-assign web-bugs now that we have priority meetings. Could it be more useful to assign issues only when we actually start working on them?

  • Ksenia: These assignments don't mean anything except when they're reassigned as part of triage. Should we move back to the old system?
  • Dennis: Yes. We don't need this anymore. We can disable it; I can do that.

SoWC Report Next Steps (Honza)

  • [Beta](https://docs.google.com/document/d/145jS-dMuTHHsJMCgS0lhvhfa64p2j3nnzSRs2cT3adk/edit#)

Great feedback: informative, valuable source of data for prioritization

  • Joe: "This looks fantastic"
  • Andrew: Can we get some sort of standard "this is our fault" vs. "this is another vendor's fault" prefix or something?
  • James: Yes, but the answer might be disappointing.
  • Joe: I was talking to the DOM team about some issues, and for some issues, resolving the webcompat bug might make the web worse.
  • James: For interop, if we think it would be best for something to be changed in another browser, that would be good to know. It might not always be possible, since other vendors might not be willing to break the web.
  • Joe: For some issues, there's a summary of the "stuck" situation in the report. I think that's enough in the case of disabled input elements.
  • Tom: And if someone doesn't see enough information in the report, we can investigate further.
  • Tom: Do we want to document cases where Google changes cause compat problems?
  • Joe: There have been times where we've wanted that, but there's no demand at the moment.
  • James: An alternative would be to suggest a "Proposed next action" for each item.
  • Joe: That seems like a good idea, but we should write it in conjunction with the people who are going to work on the fix.
  • mconca: It would be useful to characterize if each issue applies to desktop, mobile, or both.
  • James: We have that data in the KB, it's just not exposed into the report.
  • Joe: The appendix that looks at how we score the issues opens us up to methodology criticism and might not be suitable for all audiences. We don't want to encourage bikeshedding.

Improve quality of input data for the scoring logic

* **Ranking factor** from CrUX [example](https://docs.google.com/document/d/1-pAmOC-WFbEVjb2rE0VtMrlOHgcl4Bn7lqmgW3u1rtk/edit#heading=h.pubs4ib8usva) (Ksenia)
* **Platform data** from [telemetry](https://sql.telemetry.mozilla.org/queries/4838#9869)
* **Reduction factor** (take into account intervened issues)?
* **Qualitative likelihood** of a site user experiencing the issue
* Calculate API popularity?
* Could we use the sum of all issue scores (in the KB) as an indicator of the overall web compat state?

  • Honza: The data we use for scoring is not totally solid e.g. we might be better off not using Tranco; CrUX would also give us per-country data. Scoring logic is important for picking the top 10. Summing the overall scores could give a sense of overall WebCompat progress. We could use Telemetry directly for platform data.
  • James: I think we agree on using CrUX. For the platform share data, I based those numbers on Telemetry. There might be a case for not binding this to telemetry data: using it means weighing issues according to the current userbase, whereas for product reasons we might want to weigh things differently, for example scoring mobile issues higher if we want to improve mobile.
  • James: For the reduction factor: at the moment, we calculate the score based on what it would have been without intervention, and then reduce that score by a fixed factor, but we still score those issues (see the sketch after this list). There might not be a way to accurately set that factor, because we can't predict how likely an issue is to pop up again on other sites if we fixed it via an intervention.
  • Honza: Missing API popularity feels like the biggest unknown. With HTTP Archive and Custom metrics, we could work out how popular the affected API is.
  • Dennis: There is a problem that some APIs are just queried for fingerprinting e.g. WebMIDI: large usage won't reflect actual popularity. Just looking at HTTP Archive data doesn't tell the full story. Basing scoring on just that data could be misleading.
  • Honza: We could also run our own tests with Custom Metrics
  • Tom: Front page data might not be enough to analyse usage. Requires more research.
  • James: I think we can use usage data for some things, like `overflow: overlay`. We know how to parse the CSS, and if it's in there, it's most likely being used somehow somewhere. For other things, like disabled input elements, using existing data sources can be tricky: sites might have input elements that are just disabled, and they might get re-enabled by a JS later on. You might be able to ship telemetry to count how often a user clicks on a disabled input element - but users might do that even in legitimate cases, as there's a reason why inputs are disabled in the first place.
  • Tom: In some cases it might be easier to fix the bug than to collect data. Could use developer surveys, but it's not clear that will get better data.
  • James: To be more positive: in some cases we already have useful data, and the Chrome use counters can already be useful, so there we could get useful data very quickly. In other cases, figuring out what the webcompat issue actually looks like can be complicated. Some things will be easier, some will be hard.
  • Tom: API usage can be easy to measure or difficult to measure, but layout issues might be easier to measure in general. For important issues we should do this; it's like writing a site intervention. For complex APIs it might not be worth the time investment.
  • Joe: For textinput, the KB entry makes it look really bad. The reality isn't that bad because of interventions. Should we tag those cases?
  • Tom: The interventions are specific but the problem is also on other sites.
  • Joe: draftjs is used in a lot of places?
  • Tom: Yes, and other libs can also run into the same problem.
  • James: We already take existing interventions into account. The issues score would be way higher if that wasn't the case. We could highlight that we are shipping interventions for the major known breakage, so that people are aware.
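
As a purely illustrative sketch of how these inputs could combine (referenced above): the factor names, weights, and the fixed intervention reduction below are assumptions for discussion, not the actual scoring implementation.

```python
# Illustrative sketch only: factor names, weights, and the 0.5 reduction
# are placeholders, not the real scoring logic.

def site_rank_factor(crux_rank: int) -> float:
    """Weigh an issue higher when the affected site ranks higher in CrUX.
    Assumes crux_rank is a popularity bucket (smaller = more popular)."""
    return 1.0 / (1.0 + (crux_rank / 100_000))

def score_issue(
    num_reports: int,
    crux_rank: int,
    platform_share: float,          # share of users on the affected platform (from telemetry)
    has_intervention: bool,
    reduction_factor: float = 0.5,  # fixed factor discussed above; hard to set accurately
) -> float:
    base = num_reports * site_rank_factor(crux_rank) * platform_share
    # Score as if there were no intervention, then reduce by a fixed factor
    # when an intervention ships, since the underlying issue still exists.
    return base * reduction_factor if has_intervention else base

# Example: a frequently reported issue on a popular site, mitigated by an intervention.
print(score_issue(num_reports=40, crux_rank=1_000, platform_share=0.6, has_intervention=True))
```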

Building the report more effectively

* How to make it easier to build/maintain the KB?
* Could we avoid duplicating data in the KB?
* Could building the KB be part of our triage process?

  • James: Almost all of the data is currently based on a number of factors that we score, like the number of reports, if we're shipping an intervention or not, ...
  • James: If we have all the data in, for example, a BigQuery table, the scoring could be a SQL query for people to poke around with, instead of a fixed Rust program (see the sketch after this list). So instead of coming up with our scoring system and defending it against nitpicking, we can give other people access to the data and invite them to find their own ways to analyse it and build their own score.
  • Dennis: We had the idea that we could optimise away creating the knowledge base entries. If we can query across known breakages, Bugzilla bugs, and entries, the knowledge base could be more like a set of queries. GitHub is a bad knowledge database, but we could, e.g., use a bot to put the data somewhere else. That would help us keep the data up to date because there would be a single source of truth.
  • James: I think we should think about this. A week before our deadline, we had to update all the KB entries etc. I don't know whether it's possible to get rid of the KB entirely; it might not be. The KB entries are, right now, a collection of links and some notes. So the KB could still exist to track "distinct webcompat issues", where each entry has a name attached and maybe a description. But if the same database also had a list of all web-bugs in it, you could link those directly to a knowledge base entry, without having to maintain a list of links in a YAML file.
  • James: One proposal was: what if we had a "knowledge base" BigQuery table, where each entry contains the information, but we'd be able to link web-bugs to it by putting that data into the web-bug itself, without having to actively maintain the datastore. The advantage of having the data in BigQuery would be that it should make it easier to integrate the knowledge base data into other data sources. Something like this would resolve the biggest pain points we have: creating and updating knowledge base entries.
  • Dennis: I think this would be a good topic to talk about in Berlin. Make a dedicated timeslot for these discussions.
  • Joe: The current design was supposed to bootstrap the idea, it would be surprising if it was the best design.
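
As a rough sketch of the "scoring as a SQL query" idea referenced above: the project, dataset, table names, columns, and weighting below are hypothetical placeholders, not an agreed schema or design.

```python
# Illustrative only: project, table names, columns, and weights are
# hypothetical placeholders for the idea discussed above.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")

QUERY = """
SELECT
  kb.entry_id,
  kb.title,
  COUNT(bug.bug_id) AS report_count,
  -- Anyone could tweak these weights in their own copy of the query
  COUNT(bug.bug_id) * ANY_VALUE(kb.severity_weight) AS score
FROM `example-project.webcompat.knowledge_base` AS kb
LEFT JOIN `example-project.webcompat.web_bugs` AS bug
  ON bug.kb_entry_id = kb.entry_id
GROUP BY kb.entry_id, kb.title
ORDER BY score DESC
LIMIT 10
"""

for row in client.query(QUERY).result():
    print(row.entry_id, row.title, row.score)
```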

Next report

  • Honza: We now have the top 10 issues defined; it doesn't make a lot of sense to build a new report with the same issues.
  • Joe: It depends on how quickly we can fix those issues. The report has caused a number of conversations. It may be that some issues get fixed relatively quickly, and fixing the initial problem might have fallout that could itself cause webcompat issues. We want to make sure the platform team always has something to work on and always knows what to work on.
  • Tom: It could be more useful to have an automatically generated document per team, with the report itself being a better-curated document for a broader audience.
  • Tom: It might be good to wait and see how things evolve, and make a new report when we think there are big changes. We could notify people about specific issues as they come up.