Auto-tools/Projects/ActiveData/Redshift: Difference between revisions

* The query planner might help with optimization, but I do not believe it will help in this situation: Redshift's columnar storage already makes filtering and aggregation fast. For joins, however, you can control which node your data resides on (through distribution keys) to minimize communication overhead between nodes.
* SSD drives might improve query performance.
* Other hidden “shallow optimizations” – I have the sense that the number of things I do not know about Redshift is still quite large. One simple oversight, and all my numbers are irrelevant. [http://en.wikipedia.org/wiki/Linus%27s_Law “With enough eyeballs, all [optimizations] are shallow”].
* More nodes – I have no doubt more nodes can make the whole thing faster, but this must be balanced with cost.
* More efficient data shape – There is an endless set of transformations you can apply to your data to get better query performance. The ActiveData philosophy is against putting effort into this time sink: software is good enough that it should perform these transformations in the background, based on the data volume, the data shape, and the queries run against it.
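
The point above about controlling which node your data resides on can be illustrated with a toy model. This is only a sketch under stated assumptions, not Redshift's actual mechanism: the table names (<code>builds</code>, <code>tests</code>), the column names, the four-node cluster, and the modulo "hash" are all hypothetical stand-ins. It counts how many rows would have to cross the network for a join, depending on which column each table is distributed on.

```python
from collections import defaultdict

NUM_NODES = 4  # hypothetical cluster size


def node_for(key: int, num_nodes: int = NUM_NODES) -> int:
    """Map a distribution key to a node (modulo stands in for a real hash)."""
    return key % num_nodes


def distribute(rows, key_column, num_nodes=NUM_NODES):
    """Place each row on the node chosen by its distribution key column."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key_column], num_nodes)].append(row)
    return nodes


def rows_needing_transfer(left_nodes, right_nodes, join_key):
    """Count left rows whose join partner is stored on a different node."""
    location = {}
    for node, rows in right_nodes.items():
        for row in rows:
            location[row[join_key]] = node
    moved = 0
    for node, rows in left_nodes.items():
        for row in rows:
            if location.get(row[join_key]) != node:
                moved += 1
    return moved


# Hypothetical data: 8 builds, 24 test rows, three tests per build.
builds = [{"build_id": b} for b in range(8)]
tests = [{"test_id": t, "build_id": t // 3} for t in range(24)]

builds_by_id = distribute(builds, "build_id")

# Both tables distributed on the join key: matching rows share a node.
colocated = rows_needing_transfer(
    distribute(tests, "build_id"), builds_by_id, "build_id"
)

# Tests distributed on an unrelated column: many partners live elsewhere.
scattered = rows_needing_transfer(
    distribute(tests, "test_id"), builds_by_id, "build_id"
)

print("rows moved when distributed on join key:", colocated)
print("rows moved when distributed on test_id:", scattered)
```

When both tables are distributed on the join column, no row has to leave its node to find its partner. Distributing one table on an unrelated column forces part of the join across the network, which is the communication overhead the bullet refers to.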