* To execute an entire pipeline and observe the full results as a Python object, use ".collect()".
* You can write files as part of the analysis. They will appear in the "analyses" folder on your Spark instance.
* Because analysis steps are pipelined until results are observed, it is dangerous to re-use variable names across blocks. You may have to restart the kernel or re-run previous steps to make sure pipelines are constructed correctly.
* You can automate Spark jobs via telemetry-dash; output files will appear in S3 and can be used from people.mozilla.org to build dashboards.
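The variable re-use hazard above comes from lazy evaluation: the pipeline is only a description of work until an action like ".collect()" runs it. A minimal pure-Python analogy (not actual Spark code) shows the same failure mode with a generator expression, which likewise does nothing until its results are observed:

```python
# Analogy for the lazy-pipeline hazard: like Spark transformations, a
# generator expression builds a pipeline that only runs when consumed,
# so rebinding a variable it references between construction and
# execution silently changes the output.
data = [1, 2, 3]

factor = 10
pipeline = (x * factor for x in data)  # pipeline built, nothing computed yet

factor = 100                           # variable re-used before execution
results = list(pipeline)               # pipeline runs now and sees factor == 100

print(results)  # [100, 200, 300] -- not the [10, 20, 30] you might expect
```

The same surprise occurs in a notebook when a later block rebinds a name an earlier, not-yet-collected pipeline depends on; re-running the earlier block (or restarting the kernel) rebuilds the pipeline against the intended values.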