Unified Telemetry/Status reports/August 7 2015


Previous week's report

Unified Telemetry status report August 7, 2015

Overall Project Health

Last week: Yellow

This week: Yellow - The primary risk is now the executive report. We're still debugging the discrepancies between the v4 and the v2 versions. We will not turn off v2 data if we don't have confidence in the executive report.

Exec Summary

  • Data quality, validation
    • Team is focusing on executive dashboard roll-ups in the validation effort
    • Added client probes to the beta population to narrow down the missing-pings question and a few other issues
  • Added a Windows drill down for executive report data
  • Starting work on executive dashboard with combined v2 + v4 data
  • Healthreport content is up and work to use the new API is underway (no longer a big risk)
  • Proposal: if we don't feel confident turning off v2 data for Fx41, collect v4 data for 5% of the population
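One way to read the 5% proposal above is as a stable, hash-based sample of clients. The following is only a minimal sketch under that assumption: the function name `in_v4_sample`, the use of SHA-256, and the 100-bucket scheme are all hypothetical and are not taken from the Firefox client or the pipeline.

```python
import hashlib

SAMPLE_PERCENT = 5  # from the proposal: 5% of the population


def in_v4_sample(client_id: str, percent: int = SAMPLE_PERCENT) -> bool:
    """Return True if this (hypothetical) client ID falls in the sampled cohort."""
    # Hash the ID so the bucket assignment is stable across sessions and
    # roughly uniform regardless of how the IDs were generated.
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in [0, 100)
    return bucket < percent


def sampled_share(client_ids, percent: int = SAMPLE_PERCENT) -> float:
    """Fraction of the given IDs that land in the sample (illustration only)."""
    ids = list(client_ids)
    return sum(in_v4_sample(c, percent) for c in ids) / len(ids)
```

Because the decision depends only on the client ID, the same client stays in or out of the sample from one session to the next, which would keep longitudinal v4 data comparable for the sampled cohort.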

Risks/Issues

Description of Risks/Issues | State | Owner | Plan to Resolve/Mitigation | Target Date
Investigate gaps in pings | Open | Stuart/Alessio | https://bugzilla.mozilla.org/show_bug.cgi?id=1185123, working doc | 8/10
Data integrity between V2/V4 and V4 internal data consistency | Open | Brendan/Sam | Investigation in progress. Added resources (Sam). https://etherpad.mozilla.org/fhr-v4-validation | 8/10
Data continuity across V2/V4 | Open | Katie/Mark/Trink | Plan, Metabug | 8/10
Legal review | Open | BDS/Legal | Meeting between groups | 8/10
QA sign off (functional, load) | Open | Stuart | Telemetry/Testing | 8/10
Operations - data retention requirements | Open | Travis/Katie | Eng team owes ops a doc defining ping types and data retention requirements | 8/10
Operations - analysis tools & microservices | Open | Travis/Mark/Roberto | Architecture/Data flow diagram | 8/10
Data loss incident | Fixed | mreid/whd/trink | Tee server needs to return error status from old or new. Added Ops resources (Daniel Thornton). | 7/15
Remote about:healthreport content | Open | Katie/BDS | Working on pr for fhr-jelly, will deploy next week | 8/10
Budget, size of UT pings | Open | Mark/BDS | https://bugzilla.mozilla.org/show_bug.cgi?id=1182693 | 8/10
Analysis difficulty | Open | Katie/tbd | Spark training; need comprehensive plan | 8/10

Accomplished for Last Period

Engineering & Ops

QA

  • Load testing
  • Work with Softvision

Project management

  • meetings, emails, hand waving

Planned for Upcoming Period

Engineering

  • Client
    • uplifts for probes
    • data quality investigations
    • datachoices infobar bug
  • Pipeline
    • In talks with Databricks regarding Spark hosting
    • Mechanism for Heka state preservation when it gets wedged
    • UT specific monitoring and alerting
    • data retention spec
  • Data validation
    • update data sets (executive dashboard)
    • acceptance criteria
    • missing subsession pings investigation
    • "many submissions from few clients" issue
  • Data continuity
    • Document strategy for executive dashboards with v2 + v4 data

Ops

  • building automated Jenkins deployments
  • nginx load balancing

QA

  • Look into prod T issue with Ops
  • continue test suite creation
  • finalize long-term QA engagement (Softvision engagement, tooling asks for CI-loop-based testing)

Project Management

  • Finish triage of bugs
  • Schedule the remainder of release tasks

Outstanding requests not yet road mapped into a release

Description | State | Owner | Plan to Resolve/Mitigation | Target Date
Firefox OS - app pings | Open | Katie | Need to schedule and understand impact on project | TBD
Histograms for Loop/Hello | Open | Katie | Need to schedule and understand impact on project | TBD

Important Links/References