Socorro:Hadoop: Difference between revisions



=== Hadoop Flowchart  ===
[[Image:Hadoop-Hbase.png]]


Each Hadoop job reads a list of OOIDs, splits it, and passes the OOIDs to the mappers. Each mapper opens a socket connection and sends a request to pyproc to process the raw dumps. The mapper then collects the raw dumps and sends them to the reducer. The reducer inserts the raw dump and certain processed columns into HBase.
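
The flow described above can be sketched in plain Python to make the hand-offs between stages easier to follow. This is only an illustrative, in-process model, not the actual Socorro job: the function names (<code>split_ooids</code>, <code>fake_pyproc</code>, <code>mapper</code>, <code>reducer</code>, <code>run_job</code>), the column names, and the result fields are all hypothetical. The socket request to pyproc is replaced by a local function call, and a dictionary stands in for the HBase table.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Result = Dict[str, str]


def split_ooids(ooids: List[str], num_splits: int) -> List[List[str]]:
    """Divide the OOID list into round-robin input splits, one per mapper."""
    return [ooids[i::num_splits] for i in range(num_splits)]


def fake_pyproc(ooid: str) -> Result:
    """Hypothetical stand-in for pyproc, which the real mapper reaches over a socket."""
    return {"raw_dump": f"<raw dump for {ooid}>", "signature": f"sig-{ooid[:4]}"}


def mapper(split: Iterable[str],
           process: Callable[[str], Result]) -> Iterator[Tuple[str, Result]]:
    """Ask the processor to handle each OOID and emit (ooid, result) pairs."""
    for ooid in split:
        yield ooid, process(ooid)


def reducer(pairs: Iterable[Tuple[str, Result]],
            table: Dict[str, Dict[str, str]]) -> None:
    """Store the raw dump and selected processed columns, keyed by OOID.

    The dict is an assumed stand-in for an HBase table; column names are invented.
    """
    for ooid, result in pairs:
        table[ooid] = {
            "raw:dump": result["raw_dump"],
            "processed:signature": result["signature"],
        }


def run_job(ooids: List[str], num_splits: int = 2) -> Dict[str, Dict[str, str]]:
    """Run the whole pipeline sequentially: split, map, then reduce into the table."""
    table: Dict[str, Dict[str, str]] = {}
    for split in split_ooids(ooids, num_splits):
        reducer(mapper(split, fake_pyproc), table)
    return table


if __name__ == "__main__":
    result_table = run_job(["aaaa1111", "bbbb2222", "cccc3333"])
    for key in sorted(result_table):
        print(key, result_table[key])
```

In a real deployment the splits would be processed in parallel by Hadoop, and the call to <code>fake_pyproc</code> would be a request sent over the socket connection to pyproc; the sketch only shows the order in which data moves between the stages.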