Auto-tools/Projects/ActiveData/Redshift

* The query planner might help with optimization, but I do not believe it will help in this situation: Redshift's columnar storage already makes filtering and aggregation fast (it relies on sort keys and zone maps rather than conventional indexes). For joins, however, you can control which node your data resides on to minimize communication overhead between nodes.
* SSD drives might improve query performance.
* Other hidden “shallow optimizations” – I sense that the number of Redshift unknowns is still quite large for me. One simple oversight, and all my numbers are irrelevant. [http://en.wikipedia.org/wiki/Linus%27s_Law “With enough eyeballs, all [optimizations] are shallow”].
* More nodes – I have no doubt more nodes can make the whole thing faster, but this must be balanced with cost.
* More efficient data shape – There is an endless set of transformations you can apply to your data to get better query performance. The ActiveData philosophy is against putting effort into this time sink: software is good enough that it should perform these transformations in the background, given the data volume, the data shape, and the queries performed on it.
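
The point above about controlling which node your data resides on can be illustrated with a small simulation. This is only a sketch under stated assumptions: the table and column names (<code>results</code>, <code>build_id</code>, <code>result_id</code>) are hypothetical, and the CRC32 hash is a stand-in for whatever hashing Redshift uses internally. In Redshift itself, this placement is chosen with a distribution key (<code>DISTSTYLE KEY</code> with a <code>DISTKEY</code> column) when the table is created.

```python
import zlib


def node_for(value, num_nodes):
    """Map a value to a node with a stable hash (a stand-in, not Redshift's real hash)."""
    return zlib.crc32(str(value).encode("utf-8")) % num_nodes


def distribute(rows, key, num_nodes):
    """Place each row on the node chosen by hashing row[key]."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[key], num_nodes)].append(row)
    return nodes


def rows_to_move(nodes, join_key, num_nodes):
    """Count rows that are not yet on the node where a join on join_key would run."""
    return sum(
        1
        for node_id, rows in enumerate(nodes)
        for row in rows
        if node_for(row[join_key], num_nodes) != node_id
    )


NUM_NODES = 4

# Hypothetical fact table: test results that join to builds on build_id.
results = [{"result_id": i, "build_id": i % 50} for i in range(1000)]

# Distribute on the join column, then on an unrelated column.
by_join_key = distribute(results, "build_id", NUM_NODES)
by_other_key = distribute(results, "result_id", NUM_NODES)

moved_colocated = rows_to_move(by_join_key, "build_id", NUM_NODES)
moved_scattered = rows_to_move(by_other_key, "build_id", NUM_NODES)

print("rows to move when distributed on the join key:", moved_colocated)
print("rows to move when distributed on another key: ", moved_scattered)
```

When the table is distributed on the join column, no row has to move before the join (the first count is zero by construction). When it is distributed on an unrelated column, some rows must cross the network first, which is the inter-node overhead the bullet refers to.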