Unified Telemetry/Status reports/August 7 2015
Important Links/References
- https://wiki.mozilla.org/Unified_Telemetry
- https://wiki.mozilla.org/CloudServices/DataPipeline
- bug list: http://mzl.la/1FPWObo
Revision as of 22:32, 7 August 2015
Unified Telemetry status report August 7, 2015
Overall Project Health
Last week: Yellow
This week: Yellow
Exec Summary
- Team is focusing the validation effort on executive dashboard roll-ups
- Adding client probes to the beta population to facilitate the validation effort
Risks/Issues
| Description of Risks/Issues | State | Owner | Plan to Resolve/Mitigation | Target Date |
|---|---|---|---|---|
| Investigate gaps in pings | Open | Stuart/Alessio | https://bugzilla.mozilla.org/show_bug.cgi?id=1185123, working doc | 8/10 |
| Data integrity between V2/V4 and V4 internal data consistency | Open | Brendan/Sam | Investigation in progress. Added resources (Sam). https://etherpad.mozilla.org/fhr-v4-validation | 8/10 |
| Data continuity across V2/V4 | Open | Katie/Mark/Trink | Plan, Metabug | 8/10 |
| Legal review | Open | BDS/Legal | Meeting between groups | 8/10 |
| QA sign off (functional, load) | Open | Stuart | Telemetry/Testing | 8/10 |
| Operations - data retention requirements | Open | Travis/Katie | Eng team owes ops a doc defining ping types and data retention requirements | 8/10 |
| Operations - analysis tools & microservices | Open | Travis/Mark/Roberto | Architecture/Data flow diagram | 8/10 |
| Data loss incident | Fixed | mreid/whd/trink | Tee server needs to return error status from old or new. Added Ops resources (Daniel Thornton). | 7/15 |
| Remote about:healthreport content | Open | Katie/BDS | Working on a PR for fhr-jelly; will deploy next week | 8/10 |
| Budget, size of UT pings | Open | Mark/BDS | https://bugzilla.mozilla.org/show_bug.cgi?id=1182693 | 8/10 |
| Analysis difficulty | Open | Katie/tbd | Spark training; need comprehensive plan | 8/10 |
Accomplished for Last Period
Engineering & Ops
- FHR Jelly PR
- Databricks meeting
- Client work: Spreadsheet
- Data validation
- probes for telemetry
- Missing pings doc
- Generated v4 data set with complete set of pings from all clients seen on nightly: https://bugzilla.mozilla.org/show_bug.cgi?id=1171265#c24
- Work on missing subsessions analysis (hints at a client bug): https://bugzilla.mozilla.org/show_bug.cgi?id=1171268
- Pipeline scaling work
- Backfill of executive summary pings (hindsight)
- Snappy support added to Spark and Heka infrastructure
QA
- Load testing
- Work with Softvision
Project management
- meetings, emails, hand waving
Planned for Upcoming Period
Engineering
- Client
  - Uplifts for probes
  - Data quality investigations
  - datachoices infobar bug
- Pipeline
  - In talks with Databricks regarding Spark hosting
  - Mechanism for Heka state preservation when it gets wedged
  - UT-specific monitoring and alerting
  - Data retention spec
- Data validation
  - Update data sets (exec dashboard)
  - Acceptance criteria
  - Missing subsessions ping investigation
  - "Many submissions from few clients" issue
- Data continuity
  - Document strategy for executive dashboards with v2 + v4 data
Ops
- Building automated Jenkins deployments
- nginx load balancing
QA
- Look into prod T issue with Ops
- continue test suite creation
- Finalizing long-term QA engagement (Softvision engagement, tooling asks for CI-loop-based testing)
Project Management
- Finish triage of bugs
- remainder of release tasks scheduled
Outstanding requests not yet road mapped into a release
| Description | State | Owner | Plan to Resolve/Mitigation | Target Date |
|---|---|---|---|---|
| Firefox OS - app pings | Open | Katie | Need to schedule and understand impact on project | TBD |
| Histograms for Loop/Hello | Open | Katie | Need to schedule and understand impact on project | TBD |