* Deletes & legal policy [telliot, mreid to provide cost estimate]
=== Needs more discussion ===
* Maintaining a sample data set for faster queries
* Implement a specific flag to determine whether data gets warehoused
* Integrate Roberto’s Spark data flow into the new DWH
** Implies a similar db-backed table of DWH filenames for filtering (don’t want to list S3 every time - too slow); see the filename-index sketch below this list
* Elasticsearch (Kibana) output filter
* Complete list of outputs (and filters and any other support)
* Build a shim for debugging CEPs with local data
* Store the “raw raw” data for some period to ensure we’re safe if our code and/or CEP code is badly broken. Can’t just lose data.
** Tee off to a short-lived S3 bucket before it goes through the main pipeline? (see the lifecycle sketch below this list)
* BI query example that cross-references data sources
** Example: does FxA/Sync increase browser usage? (see the query sketch below this list)
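A minimal sketch of the db-backed filename table, assuming a SQLite index that the output side updates as each object lands in S3, so Spark jobs can resolve their input keys with one indexed query instead of listing the bucket. The table name, columns, and helper functions are hypothetical, not an agreed schema.

<pre>
import sqlite3

# Hypothetical schema: one row per DWH object written to S3.
SCHEMA = """
CREATE TABLE IF NOT EXISTS dwh_files (
    s3_key          TEXT PRIMARY KEY,   -- e.g. 'telemetry/20151101/part-00042.lzma'
    doc_type        TEXT NOT NULL,      -- e.g. 'saved_session'
    submission_date TEXT NOT NULL,      -- YYYYMMDD partition
    size_bytes      INTEGER
);
CREATE INDEX IF NOT EXISTS dwh_files_partition
    ON dwh_files (doc_type, submission_date);
"""

def register_file(conn, s3_key, doc_type, submission_date, size_bytes):
    """Called by the output side whenever it finishes writing an object."""
    with conn:
        conn.execute("INSERT OR REPLACE INTO dwh_files VALUES (?, ?, ?, ?)",
                     (s3_key, doc_type, submission_date, size_bytes))

def files_for_job(conn, doc_type, start_date, end_date):
    """Keys a Spark job should read, resolved without any S3 LIST calls."""
    rows = conn.execute(
        "SELECT s3_key FROM dwh_files"
        " WHERE doc_type = ? AND submission_date BETWEEN ? AND ?",
        (doc_type, start_date, end_date))
    return [key for (key,) in rows]

conn = sqlite3.connect("dwh_files.db")
conn.executescript(SCHEMA)
</pre>

Any equivalent store (Postgres, a per-partition manifest file) would do; the point is that job planning never depends on listing S3.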
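A minimal sketch of the tee-off idea, assuming boto3 and a dedicated bucket whose objects expire automatically via an S3 lifecycle rule, so keeping the untouched payloads is cheap and needs no cleanup job. The bucket name, prefix, and retention window are placeholders, not agreed values.

<pre>
import boto3

s3 = boto3.client("s3")

# Placeholder names and retention; adjust once cost/retention is agreed.
RAW_BUCKET = "pipeline-raw-raw"
RAW_PREFIX = "raw/"
RETENTION_DAYS = 14

def ensure_expiration_rule():
    """Make objects under RAW_PREFIX expire after RETENTION_DAYS."""
    s3.put_bucket_lifecycle_configuration(
        Bucket=RAW_BUCKET,
        LifecycleConfiguration={
            "Rules": [{
                "ID": "expire-raw-raw",
                "Filter": {"Prefix": RAW_PREFIX},
                "Status": "Enabled",
                "Expiration": {"Days": RETENTION_DAYS},
            }]
        })

def tee_raw(message_id, payload):
    """Write the untouched payload aside before it enters the main pipeline."""
    s3.put_object(Bucket=RAW_BUCKET, Key=RAW_PREFIX + message_id, Body=payload)
</pre>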
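A minimal sketch of the cross-referencing BI query, written as PySpark against hypothetical datasets (telemetry_daily, sync_events) since the DWH layout is still open. It compares average active hours for clients with and without Sync, i.e. correlation only, not causation.

<pre>
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("fxa_sync_vs_usage").getOrCreate()

# Hypothetical dataset paths and column names; the real DWH layout is TBD.
usage = spark.read.parquet("s3://dwh/telemetry_daily/").select(
    "client_id", "active_hours")
sync = (spark.read.parquet("s3://dwh/sync_events/")
             .select("client_id").distinct()
             .withColumn("has_sync", F.lit(True)))

# Cross-reference the two sources on client_id, then compare usage for
# clients with and without FxA/Sync.
joined = (usage.join(sync, "client_id", "left_outer")
               .withColumn("has_sync",
                           F.coalesce(F.col("has_sync"), F.lit(False))))

(joined.groupBy("has_sync")
       .agg(F.avg("active_hours").alias("avg_active_hours"),
            F.countDistinct("client_id").alias("clients"))
       .show())
</pre>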
=== To Do ===
* Q4 telemetry: (re)implement telemetry monitoring dashboards [?]