* Implement a specific flag to determine if data gets warehoused or not
* Integrate Roberto’s spark data flow into new DWH
** Implies a similar db-backed table of DWH filenames for filtering (don’t want to list S3 every time - too slow)
* Elasticsearch (Kibana) output filter
* Complete list of outputs (and filters and any other support)
* Build a shim for debugging CEPs with local data
* Store the “raw raw” data for some period to ensure we’re safe if our code and/or CEP code is badly broken. Can’t just lose data.
** Tee off to short-lived S3 before it goes through the main pipeline?
* BI query example that cross references data sources
** example: does fxa/sync increase browser usage?
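
A rough sketch of the db-backed DWH filename table mentioned above, so that filtering does not need an S3 listing on every run. Everything here is an assumption for illustration: the table name <code>dwh_files</code>, the helper names, and the use of SQLite as a stand-in for whatever database the pipeline ends up using.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) the hypothetical filename index."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dwh_files (filename TEXT PRIMARY KEY)"
    )
    return conn


def record_files(conn, filenames):
    """Record filenames that have been written to the DWH (idempotent)."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO dwh_files (filename) VALUES (?)",
            [(name,) for name in filenames],
        )


def filter_new(conn, candidates):
    """Return the candidates not yet in the index, preserving order."""
    new = []
    for name in candidates:
        row = conn.execute(
            "SELECT 1 FROM dwh_files WHERE filename = ?", (name,)
        ).fetchone()
        if row is None:
            new.append(name)
    return new


if __name__ == "__main__":
    conn = open_index()
    record_files(conn, ["a.parquet", "b.parquet"])
    print(filter_new(conn, ["a.parquet", "c.parquet"]))
```

The index is only as good as its updates: whatever writes to the DWH would need to call something like <code>record_files</code> in the same step, otherwise the table drifts from what is actually in S3.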
== To Do ==