Latest revision as of 17:04, 24 July 2015
Unified Telemetry status report July 24, 2015
Overall Project Health
Last week: Green
This week: Yellow - r41 is the go-live release for Unified Telemetry. All issues are triaged and assigned milestones. The dev team continues to focus on data validation; client-side pings (https://bugzilla.mozilla.org/show_bug.cgi?id=1185123) are the current blocker under investigation.
Exec Summary
- Validation work has hit ping bugs, which are the current blocking issue
- July 30 milestone for first complete pass of data validation, deployment of pipeline scaling work
- Testing plan is up on the wiki: Telemetry/Testing
- Ongoing planning for the FHR V2/V3 historic pipeline migration; status is tracked on a linked page.
Risks/Issues
| Description of Risks/Issues | State | Owner | Plan to Resolve/Mitigation | Target Date |
|---|---|---|---|---|
| Investigate gaps in pings | Open | Stuart/Alessio | https://bugzilla.mozilla.org/show_bug.cgi?id=1185123, working doc: https://etherpad.mozilla.org/u230yVoP9S | 8/04 |
| Data integrity between V2/V4 and V4 internal data consistency | Open | Brendan/Sam | Investigation in progress. Added resources (Sam). https://etherpad.mozilla.org/fhr-v4-validation | 7/30 |
| Data continuity across V2/V4 | Open | Katie/Mark/Trink | Plan, Metabug | 7/30 |
| Legal review | Open | BDS/Legal | Meeting between groups | 8/04 |
| QA sign off (functional, load) | Open | Stuart | Telemetry/Testing | 8/04 |
| Operations - data retention requirements | Open | Travis/Katie | Eng team owes ops a doc defining ping types and data retention requirements | 8/04 |
| Operations - analysis tools & microservices | Open | Travis/Mark/Roberto | Architecture/Data flow diagram | 8/04 |
| Data loss incident | Fixed | mreid/whd/trink | Tee server needs to return error status from old or new. Added Ops resources (Daniel Thornton). | 7/15 |
| Remote about:healthreport content | Open | Katie/Georg | Made a request to Laura Thomson for help | 8/04 |
| Budget, size of UT pings | Open | Mark/BDS | https://bugzilla.mozilla.org/show_bug.cgi?id=1182693 | 8/04 |
| Analysis difficulty | Open | Katie/tbd | No plan yet, aside from ongoing work on tools | 8/04 |
Accomplished for Last Period
Engineering & Ops
- Initial Databricks investigation: not useful to the Perf Team; metrics team/Katie to decide next week whether it suits our purpose.
- Aggregation work is up in stage, needs testing
- Client work: Spreadsheet (https://docs.google.com/spreadsheets/d/1yAJmgCGYyk1d7A41DZa653Z3u2AbH-kDWsO1vPSgbfE/edit?usp=sharing)
- Data validation
  - Missing pings doc: https://etherpad.mozilla.org/u230yVoP9S
  - Generated v4 data set with complete set of pings from all clients seen on nightly: https://bugzilla.mozilla.org/show_bug.cgi?id=1171265#c24
  - Work on missing subsessions analysis (hints at a client bug): https://bugzilla.mozilla.org/show_bug.cgi?id=1171268
- Pipeline scaling work
  - Finished distributed aggregation work started at workweek: https://github.com/mozilla-services/data-pipeline/pull/93
  - Deployed next round of changes
- Telemetry tools and microservices
  - Work on memory footprint of the Spark jobs: https://bugzilla.mozilla.org/show_bug.cgi?id=1182499
  - Kickoff meeting for deployment plan for telemetry tools and microservices: Architecture flow diagram (https://docs.google.com/a/mozilla.com/document/d/1KoLtIFV-aZtxruSVNmcc26F22MfqWjDynKgZ6adYk54/edit?usp=sharing)
QA
- Investigate client QA automated test scripts
- Update test wiki
- Work with Softvision to prepare for RC pass
Project management
- meetings, emails, hand waving
Planned for Upcoming Period
Engineering
- Client
  - Do code reviews for deletion pings and choices info bar
  - Continue pending ping cleanup
  - Continue investigating count discrepancies between "main" pings and "saved session" pings
- Pipeline
  - Continue with scaling work
  - Monitoring work for Telemetry data
  - Investigate executive stream discrepancies
  - Bug fixes
- Data validation
  - Work on 100k-client paired v2/v4 pings from early June to early July
  - Validation efforts (main vs saved-session pings, ending subsession pings, broken chaining)
  - Deep dive on missing subsessions as it may indicate a client bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1171268
- Data continuity
  - Document strategy for executive dashboards with v2 + v4 data
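
The missing-subsession check listed under data validation could be sketched roughly as follows. This is only an illustration, not the team's actual analysis: it assumes each ping has been reduced to a record with a hypothetical `client_id` and a hypothetical per-client `subsession_counter` that increases by one with every subsession, and it reports the counter values that never arrived.

```python
from collections import defaultdict


def find_subsession_gaps(pings):
    """Return {client_id: [missing counter values]} for clients with gaps.

    `pings` is an iterable of dicts. The keys `client_id` and
    `subsession_counter` are hypothetical names chosen for this sketch,
    not the real ping schema.
    """
    seen = defaultdict(set)
    for ping in pings:
        seen[ping["client_id"]].add(ping["subsession_counter"])

    gaps = {}
    for client_id, counters in seen.items():
        # Every value between the lowest and highest counter seen should exist.
        expected = set(range(min(counters), max(counters) + 1))
        missing = sorted(expected - counters)
        if missing:
            gaps[client_id] = missing
    return gaps


# Made-up sample data: client "a" never reported subsession 3.
sample = [
    {"client_id": "a", "subsession_counter": 1},
    {"client_id": "a", "subsession_counter": 2},
    {"client_id": "a", "subsession_counter": 4},
    {"client_id": "b", "subsession_counter": 7},
    {"client_id": "b", "subsession_counter": 8},
]
print(find_subsession_gaps(sample))  # {'a': [3]}
```

A check like this only finds gaps between the first and last subsession seen for a client; it cannot tell whether pings are missing before the first or after the last one.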
Ops
- Aggregate pipeline is available in staging, needs testing
QA
- Closing bugs
- Continue test suite creation
- Finalizing long-term QA engagement (Softvision engagement, tooling asks for CI-loop-based testing)
Project Management
- Finish triage of bugs
- Schedule the remainder of release tasks
Outstanding requests not yet roadmapped into a release
| Description | State | Owner | Plan to Resolve/Mitigation | Target Date |
|---|---|---|---|---|
| Firefox OS - app pings | Open | Katie | Need to schedule and understand impact on project | TBD |
| Histograms for Loop/Hello | Open | Katie | Need to schedule and understand impact on project | TBD |