Unified Telemetry/Status reports/July 17 2015: Difference between revisions
Revision as of 16:57, 17 July 2015
Unified Telemetry status report July 17, 2015
Overall Project Health
Green: r41 is the go-live release for Unified Telemetry. All issues have been triaged and assigned milestones. The dev team continues to focus on data validation.
Exec Summary
- Client work delayed this week by sick time.
- Working toward data validation milestone on July 30: http://mzl.la/1J2OdZA
- Pipeline scaling work to be completed by July 30(?)
- <tools>
- Ongoing planning for the FHR V2/V3 historic pipeline migration; link to status here.
Risks/Issues
| Description of Risks/Issues | State | Owner | Plan to Resolve/Mitigation | Target Date |
|---|---|---|---|---|
| Data integrity between V2/V4 and V4 internal data consistency | Open | Brendan/Sam | Investigation in progress. Added resources (Sam). https://etherpad.mozilla.org/fhr-v4-validation | 7/30 |
| Data continuity across V2/V4 | Open | Katie/Mark/Trink | Mark writing up plan from Whistler; metrics team specifying data sets and reviewing "executive" data set. https://bugzilla.mozilla.org/show_bug.cgi?id=1182684 | 7/23 |
| Legal review | Open | BDS/Legal | Meeting between groups | 8/04 |
| QA sign off (functional, load) | Open | Stuart | Working with QA on creating test cases/test plans | 8/04 |
| Operations - data retention requirements | Open | Travis/Katie | Eng team owes ops a doc defining ping types and data retention requirements | 8/04 |
| Operations - analysis tools & microservices | Open | Travis/Mark/Roberto | Architecture/Data flow diagram; meeting next Monday (7/13) | 8/04 |
| Data loss incident | Fixed | mreid/whd/trink | Tee server needs to return error status from old or new. Added Ops resources (Daniel Thornton). | 7/15 |
| Remote about:healthreport content | Open | Katie/Georg | Made a request to Laura Thomson for help | 8/04 |
| Budget, size of UT pings | Open | Mark/BDS | https://bugzilla.mozilla.org/show_bug.cgi?id=1182693 | 8/04 |
| Analysis difficulty | Open | Katie/tbd | No plan yet, aside from ongoing work on tools | 8/04 |
Accomplished for Last Period
Engineering & Ops
- Heka 0.10.0 beta released
- Client work: Spreadsheet
  - Not uplifting recent send logic changes to Beta (needs more bake time for confidence)
  - Uplifting a few patches around the send-logic ([uplift2], http://bit.ly/1Je45UA) to Aurora as soon as the send-logic impact is verified
  - Remaining client work ([uplift3], http://bit.ly/1TCl4r8) for 41 is manageable and either blocked by info requests or review
- Data validation
  - Generated v4 data set with complete set of pings from all clients seen on nightly: https://bugzilla.mozilla.org/show_bug.cgi?id=1171265#c24
  - Work on missing subsessions analysis (hints at a client bug): https://bugzilla.mozilla.org/show_bug.cgi?id=1171268
- Pipeline scaling work
  - Finished distributed aggregation work started at workweek: https://github.com/mozilla-services/data-pipeline/pull/93
  - Deployed next round of changes
- Telemetry tools and microservices
  - Work on memory footprint of the Spark jobs: https://bugzilla.mozilla.org/show_bug.cgi?id=1182499
  - Kickoff meeting for deployment plan for telemetry tools and microservices: Architecture flow diagram
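The missing subsessions analysis mentioned above can be pictured as a gap check over per-client counters. The following is only a minimal sketch under stated assumptions: the input shape, the `client_id` and `subsession_counter` names, and the function name are hypothetical and are not taken from the actual validation code.

```python
from collections import defaultdict


def find_missing_subsessions(pings):
    """Return {client_id: [missing counter values]} for clients with gaps.

    `pings` is an iterable of (client_id, subsession_counter) pairs.
    Both field names are hypothetical stand-ins for whatever the real
    validation data set exposes.
    """
    seen = defaultdict(set)
    for client_id, counter in pings:
        # A set ignores duplicate submissions of the same subsession.
        seen[client_id].add(counter)

    gaps = {}
    for client_id, counters in seen.items():
        # Every value between the lowest and highest counter should exist.
        expected = set(range(min(counters), max(counters) + 1))
        missing = sorted(expected - counters)
        if missing:
            gaps[client_id] = missing
    return gaps


sample = [("a", 1), ("a", 2), ("a", 4), ("b", 7), ("b", 8)]
print(find_missing_subsessions(sample))
```

A gap found this way only says that a counter value was never received; whether the cause is a client bug or loss in the pipeline would still need the deeper investigation tracked in the bug.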
QA
- test cases, bug closing
Project management
- meeting, emails, hand waving
Planned for Upcoming Period
Engineering
- Client
  - Do code reviews for deletion pings and choices info bar
  - Pending ping cleanup
  - Investigate count discrepancies between "main" pings and "saved session" pings
- Pipeline
  - Continue with scaling work
  - Monitoring work for Telemetry data
  - Investigate executive stream discrepancies
  - Bug fixes
- Data validation
  - Join corresponding v2 data to v4 nightly clients data set
  - Continue writing callbacks that look at other measures
  - Breadth first, do a first pass at most validations and flag big issues
  - Deep dive on missing subsessions as it may indicate a client bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1171268
- Data continuity
  - Document strategy for executive dashboards with v2 + v4 data
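As an illustration of the planned look at count discrepancies between "main" and "saved session" pings, one possible first step is to tabulate daily counts per ping type and compare them. This is a rough sketch only: the `(day, ping_type)` input shape, the type strings, and the function name are assumptions made for the example, and it does not claim what the expected relationship between the two counts should be.

```python
from collections import Counter


def daily_count_difference(pings, type_a="main", type_b="saved-session"):
    """Return {day: count(type_a) - count(type_b)} for each day seen.

    `pings` is an iterable of (day, ping_type) pairs. The pair layout and
    the default type strings are hypothetical, chosen for illustration.
    """
    counts = Counter((day, ping_type) for day, ping_type in pings)
    days = sorted({day for day, _ in counts})
    # Counter returns 0 for a (day, type) combination that never occurred.
    return {day: counts[(day, type_a)] - counts[(day, type_b)] for day in days}


sample = [
    ("2015-07-16", "main"),
    ("2015-07-16", "main"),
    ("2015-07-16", "saved-session"),
    ("2015-07-17", "saved-session"),
]
print(daily_count_difference(sample))
```

Days with a large or sign-changing difference would be candidates for a closer per-client comparison.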
Ops
- Databricks investigation (big jobs on big clusters) - cost, resourcing etc
QA
- closing bugs
- test suite creation
- finalizing long-term QA engagement (Softvision engagement, tooling asks for CI-loop-based testing)
Project Management
- Finish triage of bugs
- Schedule remainder of release tasks
Outstanding requests not yet road mapped into a release
| Description | State | Owner | Plan to Resolve/Mitigation | Target Date |
|---|---|---|---|---|
| Firefox OS - app pings | Open | Katie | Need to schedule and understand impact on project | TBD |
| Histograms for Loop/Hello | Open | Katie | Need to schedule and understand impact on project | TBD |