Telemetry/LongitudinalExamples: Difference between revisions


499 bytes removed, 3 August 2016

→‎Sampling: removing reference to bernoulli sampling


@@ Line 24: / Line 24: @@
   SELECT * FROM longitudinal LIMIT 1000 ...
-For a statistically sound sample, use TABLESAMPLE:
+There's no need to use other sampling methods, such as TABLESAMPLE, on the longitudinal set. Rows are randomly ordered, so a LIMIT sample is expected to be random.
- SELECT * FROM longitudinal TABLESAMPLE BERNOULLI(xx)
-Where xx is an integer representing what percentage of data you want to include in your sample (e.g. 10% sample -> xx=10).
-A couple of caveats:
-* This sampling method will only decrease your query run time if you're manipulating the data a lot. Bernoulli sampling still requires reading the whole DB before proceeding.
-* This sample will not be deterministic. I.e. you will not get the same sample for every run. This can cause problems when using Presto Views or logical tables.
-* Unlike LIMIT, this method does not guarantee a fixed number of results.
 === Arrays ===
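
Note on the Sampling hunk above: the trade-off between LIMIT and TABLESAMPLE BERNOULLI can be illustrated with a small simulation. This is a minimal Python sketch, not Presto itself. The names `make_table`, `limit_sample`, and `bernoulli_sample` are hypothetical, and the table is assumed to be a list of row ids stored in random order, as the revised text says of the longitudinal set.

```python
import random


def make_table(n_rows, seed):
    """Hypothetical stand-in for the longitudinal table.

    Returns row ids 0..n_rows-1 in a random stored order (assumption:
    this mirrors the "rows are randomly ordered" claim in the revision).
    """
    rng = random.Random(seed)
    rows = list(range(n_rows))
    rng.shuffle(rows)
    return rows


def limit_sample(rows, n):
    """Mimic LIMIT n: take the first n rows in stored order."""
    return rows[:n]


def bernoulli_sample(rows, percent, rng):
    """Mimic TABLESAMPLE BERNOULLI(percent).

    Every row is visited and kept independently with probability
    percent/100, so the whole table is read and the result size varies.
    """
    p = percent / 100.0
    return [r for r in rows if rng.random() < p]


if __name__ == "__main__":
    table = make_table(10_000, seed=0)

    # LIMIT: fixed size, and repeatable for a fixed stored order.
    print("LIMIT rows:", len(limit_sample(table, 1000)))

    # BERNOULLI: roughly 10% of rows, but the count changes between runs.
    for _ in range(3):
        print("BERNOULLI rows:", len(bernoulli_sample(table, 10, random.Random())))
```

In this sketch the LIMIT-style sample always has exactly the requested number of rows, and because the stored order is already random, those rows form a random subset. The Bernoulli-style sample touches every row and returns a different, variable-sized subset on each run, which matches the caveats listed in the removed text.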