Auto-tools/Projects/ActiveData/Redshift: Difference between revisions

fix wording
(initial commit)
 
(fix wording)
Line 113:
* The query planner might help with optimization, but I do not believe it will help in this situation; Redshift already organizes data by column for fast filtering and aggregation. In the case of joins, however, you can control which node your data resides on (through the distribution key) to minimize communication overhead between nodes.
* SSD drives might improve query performance.
* Other hidden “shallow optimizations” – I have the sense that the number of things I do not know about Redshift is still quite large. One simple oversight, and all my numbers are irrelevant. “With enough eyeballs, all [optimizations] are shallow” [http://en.wikipedia.org/wiki/Linus%27s_Law 1].
* More nodes – I have no doubt more nodes can make the whole thing faster, but this must be balanced against cost.
* More efficient data shape – There is an endless set of transformations you can apply to your data to get better query performance. The ActiveData philosophy is against putting effort into this time sink: software is good enough that it should perform these transformations in the background, given the data volume, the data shape, and the queries performed on it.
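
The point in the first bullet about controlling which node your data resides on can be illustrated with a small simulation. This is only a sketch in Python, not Redshift SQL: the cluster size <code>NUM_NODES</code>, the tables <code>builds</code> and <code>tests</code>, and the <code>build_id</code> join column are all hypothetical, and the CRC32 hash is a stand-in for whatever Redshift actually uses to place rows. It counts how many joined row pairs would have to cross nodes when both tables are placed by the join key versus spread round-robin.

```python
import zlib
from collections import defaultdict

NUM_NODES = 4  # hypothetical cluster size


def distribute_by_key(rows, key_field, num_nodes=NUM_NODES):
    """Place each row on a node chosen by hashing its join key (stand-in hash)."""
    nodes = defaultdict(list)
    for row in rows:
        node = zlib.crc32(str(row[key_field]).encode("utf-8")) % num_nodes
        nodes[node].append(row)
    return nodes


def distribute_round_robin(rows, num_nodes=NUM_NODES):
    """Spread rows evenly across nodes, ignoring their content."""
    nodes = defaultdict(list)
    for i, row in enumerate(rows):
        nodes[i % num_nodes].append(row)
    return nodes


def cross_node_matches(left_nodes, right_nodes, key_field):
    """Count joined row pairs whose two sides live on different nodes."""
    # Index the right-hand table: join key -> nodes holding a row with that key.
    right_index = defaultdict(list)
    for node, rows in right_nodes.items():
        for row in rows:
            right_index[row[key_field]].append(node)
    count = 0
    for node, rows in left_nodes.items():
        for row in rows:
            matches = right_index.get(row[key_field], [])
            count += sum(1 for other in matches if other != node)
    return count


# Hypothetical data: 100 builds, 10 test results per build.
builds = [{"build_id": i} for i in range(100)]
tests = [{"build_id": j // 10, "test": j} for j in range(1000)]

# Both tables placed by the join key: matching rows share a node.
colocated = cross_node_matches(
    distribute_by_key(builds, "build_id"),
    distribute_by_key(tests, "build_id"),
    "build_id",
)

# Both tables placed round-robin: matching rows are scattered.
scattered = cross_node_matches(
    distribute_round_robin(builds),
    distribute_round_robin(tests),
    "build_id",
)

print("cross-node pairs, key distribution:", colocated)
print("cross-node pairs, round-robin:", scattered)
```

With key-based placement every matching pair lands on the same node, so the join needs no data movement in this model; with round-robin placement most pairs straddle two nodes. Real clusters have other costs (skewed keys, for one) that this sketch ignores.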


==Summary==