Auto-tools/Projects/ActiveData/Redshift: Difference between revisions

3 bytes removed , 18 March 2015

m

fix link


@@ Line 113: / Line 113: @@
 * Query planner might help with optimization, but I do not believe it will help in this situation: Redshift already indexes the columns for fast filtering and aggregation, but in the case of joins you can control which node your data resides on to minimize communication overhead between nodes.
 * SSD drives might improve query performance.
-* Other hidden “shallow optimizations” – I have the sense the number of unknowns in Redshift  is still quite large to me.  One simple oversight, and all my numbers are irrelevant.  “with enough eyeballs, all [optimizations] are shallow” [http://en.wikipedia.org/wiki/Linus%27s_Law 1].
+* Other hidden “shallow optimizations” – I have the sense the number of unknowns in Redshift  is still quite large to me.  One simple oversight, and all my numbers are irrelevant.  [http://en.wikipedia.org/wiki/Linus%27s_Law “With enough eyeballs, all [optimizations] are shallow”].
 * More nodes – I have no doubt more nodes can make the whole thing faster, but this must be balanced with cost.
 * More efficient data shape – There is an endless set of transformations you can apply to your data to get better query performance.  The ActiveData philosophy is against putting effort into this time sink: Software is good enough that it should be performing this in the background given the data volume, data shape, and given the queries performed on it.
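
The point above about controlling which node data resides on for joins can be illustrated with a small simulation. This is only a sketch, not Redshift's actual placement logic: the hash function, the cluster size, and the <code>builds</code>/<code>tests</code> tables with their <code>build_id</code> and <code>test_id</code> fields are all hypothetical names chosen for illustration. Redshift exposes the idea through a table's distribution key; here it is mimicked by hashing a chosen column to pick a node, then counting how many rows would have to cross nodes to meet their join partner.

```python
from collections import defaultdict
from zlib import crc32

NUM_NODES = 4  # hypothetical cluster size


def node_for(key, num_nodes=NUM_NODES):
    """Pick a node with a stable hash (illustrative, not Redshift's real function)."""
    return crc32(str(key).encode("utf-8")) % num_nodes


def distribute(rows, key_field, num_nodes=NUM_NODES):
    """Place each row on the node chosen by hashing its key_field."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key_field], num_nodes)].append(row)
    return nodes


def rows_needing_transfer(left_nodes, right_nodes, key_field):
    """Count right-side rows whose join partner lives on a different node.

    Assumes key_field is unique on the left side (a dimension-style table).
    """
    left_location = {}
    for node, rows in left_nodes.items():
        for row in rows:
            left_location[row[key_field]] = node
    moved = 0
    for node, rows in right_nodes.items():
        for row in rows:
            if left_location.get(row[key_field]) not in (None, node):
                moved += 1
    return moved


# Hypothetical data: 10 builds, 100 tests that each reference a build.
builds = [{"build_id": i} for i in range(10)]
tests = [{"test_id": j, "build_id": j % 10} for j in range(100)]

builds_by_node = distribute(builds, "build_id")
colocated = distribute(tests, "build_id")  # same key as the join column
scattered = distribute(tests, "test_id")   # key unrelated to the join column

# Distributing both tables on the join column keeps every pair on one node.
print(rows_needing_transfer(builds_by_node, colocated, "build_id"))  # 0 by construction
# Distributing on an unrelated column generally forces rows to move.
print(rows_needing_transfer(builds_by_node, scattered, "build_id"))
```

When both tables are distributed on the join column, matching rows hash to the same node, so the join needs no inter-node traffic in this model; any other choice of key leaves that to chance.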