= Work Queue =

== Risks/Questions ==
* Send something to dev-planning? [kparlante, telliot]
* Old-FHR data through pipeline? Yes/No: [telliot]
* Deletes & legal policy [telliot, mreid to provide cost estimate]
== Needs more discussion/definition ==
* Maintaining a sample data set for faster queries
* Implement a specific flag to determine if data gets warehoused or not
* Integrate Roberto’s Spark data flow into new DWH
** Implies a similar db-backed table of DWH filenames for filtering (don’t want to list S3 every time - too slow; see the sketch after this list)
* Elasticsearch (Kibana) output filter
* Complete list of outputs (and filters and any other support)
* Build a shim for debugging CEPs with local data
* Store the “raw raw” data for some period to ensure we’re safe if our code and/or CEP code is badly broken. Can’t just lose data.
** Tee off to short-lived S3 before it goes through the main pipeline?
* BI query example that cross-references data sources
** example: does fxa/sync increase browser usage?
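A minimal sketch of the db-backed filename table mentioned above, assuming SQLite for the local index and boto3 for S3; the bucket, file, and table names are illustrative placeholders, not decisions.

<syntaxhighlight lang="python">
import sqlite3

import boto3

DB_PATH = "dwh_files.db"       # hypothetical local index, name is illustrative
BUCKET = "example-dwh-bucket"  # placeholder, not the real bucket


def refresh_file_index(conn):
    """Re-sync the local table from S3; run on a schedule, not per query."""
    s3 = boto3.client("s3")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dwh_files ("
        "key TEXT PRIMARY KEY, size INTEGER, last_modified TEXT)"
    )
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET):
        for obj in page.get("Contents", []):
            conn.execute(
                "INSERT OR REPLACE INTO dwh_files VALUES (?, ?, ?)",
                (obj["Key"], obj["Size"], obj["LastModified"].isoformat()),
            )
    conn.commit()


def matching_files(conn, prefix):
    """Filter against the local table instead of listing S3 every time."""
    rows = conn.execute(
        "SELECT key FROM dwh_files WHERE key LIKE ?", (prefix + "%",)
    )
    return [row[0] for row in rows]
</syntaxhighlight>

With the index refreshed periodically, query-time filtering becomes a local table lookup rather than a slow S3 listing.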
== To Do ==
* Q4 telemetry: (re)implement telemetry monitoring dashboards [?]
* Q1 BI: define schema for data warehouse (talk to jjensen) [kparlante]
** should use multiple data sources
* Q1 BI: write filter for data warehouse [trink]
* Q1 BI: signal & schedule loading of data warehouse [mreid]
* Q1 BI: Redshift output [trink]
* Q1 BI: set up Domo and/or Tableau to look at MySQL or CSV or whatever is easy [?]
* Q1: Data format spec [kparlante, trink]
** JSON schema, specifically for FHR+telemetry; also anticipate other sources (see the first sketch after this list)
* Q1: implement best guess at per-user sampling [trink] (see the second sketch after this list)
** follow up with saptarshi for a more complex algorithm
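For the data format spec, a rough illustration of validating a unified FHR+telemetry envelope against a JSON Schema; every field name here is an assumption for illustration, not the actual spec.

<syntaxhighlight lang="python">
import jsonschema

# All field names below are hypothetical, pending the real spec.
SUBMISSION_SCHEMA = {
    "type": "object",
    "properties": {
        "docType": {"enum": ["fhr", "telemetry"]},  # room for other sources later
        "clientId": {"type": "string"},
        "creationDate": {"type": "string"},
        "payload": {"type": "object"},
    },
    "required": ["docType", "clientId", "payload"],
}

# Raises jsonschema.ValidationError if the document doesn't conform.
jsonschema.validate(
    {"docType": "telemetry", "clientId": "abc-123", "payload": {}},
    SUBMISSION_SCHEMA,
)
</syntaxhighlight>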
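One common way to take a “best guess” at per-user sampling is to hash the client id deterministically, so a given user is consistently in or out of the sample; this is a sketch of that approach, not necessarily the algorithm that will land (saptarshi may suggest something more sophisticated).

<syntaxhighlight lang="python">
import hashlib


def in_sample(client_id, sample_pct):
    """Deterministic per-user sampling: a given user always gets the same answer."""
    digest = hashlib.sha256(client_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # map the hash into [0, 1]
    return bucket < sample_pct / 100.0


# e.g. a 1% sample of users:
print(in_sample("some-client-uuid", 1.0))
</syntaxhighlight>

Because the decision is a pure function of the client id, the sampled population stays stable across submissions and across machines.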
== Doing ==
* Opsify stack [whd]
* Q4 telemetry: Send telemetry data through the pipeline [mreid]
* Q4 telemetry: Larger payloads (32 MB) for telemetry [trink]
* risk mitigation: Estimate cost of “full scan” DWH query [mreid]
* risk mitigation: Estimate cost of single DWH delete [mreid]
== Done ==
* Parallelize sandbox filters (e.g. FHRSearch) [trink]
* Enable LuaJIT [trink]