Unified Telemetry/Status reports/July 17 2015
Unified Telemetry status report July 17, 2015
Overall Project Health
Green - r41 is go live for unified Telemetry. All issues triaged and assigned milestones. Dev Team continues to focus on data validation.
Exec Summary
- Client work delayed this week by sick time.
- Working toward data validation milestone on July 30: http://mzl.la/1J2OdZA
- Pipeline scaling work to be completed by July 30(?)
- <tools>
- Ongoing planning on FHR V2/V3 historic pipeline migration link to status here.
Risks/Issues
| Description of Risks/Issues | State | Owner | Plan to Resolve/Mitigation | Target Date |
|---|---|---|---|---|
| Data integrity between V2/V4 and V4 internal data consistency | Open | Brendan/Sam | Investigation in progress. Added resources (Sam). https://etherpad.mozilla.org/fhr-v4-validation | 7/30 |
| Data continuity across V2/V4 | Open | Katie/Mark/Trink | Mark writing up plan from Whistler; metrics team specifying data sets and reviewing "executive" data set. https://bugzilla.mozilla.org/show_bug.cgi?id=1182684 | 7/23 |
| Legal review | Open | BDS/Legal | Meeting between groups | 8/04 |
| QA sign off (functional, load) | Open | Stuart | Working with QA on creating test cases/test plans | 8/04 |
| Operations - data retention requirements | Open | Travis/Katie | Eng team owes ops a doc defining ping types and data retention requirements | 8/04 |
| Operations - analysis tools & microservices | Open | Travis/Mark/Roberto | Architecture/Data flow diagram; meeting next Monday (7/13) | 8/04 |
| Data loss incident | Fixed | mreid/whd/trink | Tee server needs to return error status from old or new. Added Ops resources (Daniel Thornton). | 7/15 |
| Remote about:healthreport content | Open | Katie/Georg | Made a request to Laura Thomson for help | 8/04 |
| Budget, size of UT pings | Open | Mark/BDS | https://bugzilla.mozilla.org/show_bug.cgi?id=1182693 | 8/04 |
| Analysis difficulty | Open | Katie/tbd | No plan yet, aside from ongoing work on tools | 8/04 |
Accomplished for Last Period
Engineering
- Heka 0.10.0 beta pushed
- Client work: Spreadsheet
  - Not uplifting recent send logic changes to Beta (needs more bake time for confidence)
  - Uplifting a few patches around the send-logic ([uplift2], http://bit.ly/1Je45UA) to Aurora as soon as the send-logic impact is verified
  - Remaining client work ([uplift3], http://bit.ly/1TCl4r8) for 41 is manageable and either blocked by info requests or review
- Updates to the unified telemetry decoder and executive report
- Architecture flow diagram in preparation for meeting with ops
- Progress on data validation
  - Compare FHR v2 and FHR v4 search, crash, and other fields: https://bugzilla.mozilla.org/show_bug.cgi?id=1179376 -- close agreement for search counts
  - Saved-session vs main pings: https://bugzilla.mozilla.org/show_bug.cgi?id=1147395 -- mismatch in about 7% of sessions for one of the metrics investigated
- Created new milestones for the remainder of the r40 cycle and triaged bugs into them
Ops
- alerts and status code trapping
Performance
- Spark out-of-memory jobs
QA
- test cases, bug closing
Project management
- meeting, emails, hand waving
Planned for Upcoming Period
Engineering
- Pipeline monitoring (tracking errors by channel and build id)
- Uplift final client changes for r40: spreadsheet
- Data validation: https://etherpad.mozilla.org/fhr-v4-validation
- Targeting 40r9 as next major milestone to have bulk of validation work completed.
Ops
- Meeting to go over Telemetry tools/microservices production deployment
- Continued work on scaling for release loads
- Databricks investigation (big jobs on big clusters): cost, resourcing, etc.
Performance
- Automate Spark
QA
- closing bugs
- test suite creation
- Finalizing long-term QA engagement (Softvision engagement, tooling asks for CI-loop-based testing)
Project Management
- Finish triage of bugs
- Schedule remainder of release tasks
Outstanding requests not yet roadmapped into a release
| Description | State | Owner | Plan to Resolve/Mitigation | Target Date |
|---|---|---|---|---|
| Firefox OS - app pings | Open | Katie | Need to schedule and understand impact on project | TBD |
| Histograms for Loop/Hello | Open | Katie | Need to schedule and understand impact on project | TBD |