Auto-tools/Projects/OrangeFactor/2010-09-22
Latest revision as of 13:33, 23 October 2014
War on Orange meeting, September 22, 2010
Parsing:
- logparser lives here: http://hg.mozilla.org/automation/logparser
- THIS DOESN'T WORK! It's a straight port from topfails (which is to say it runs, but it is inadequate and there are also bugs filed against it)
- should be written to Jython standards for Hadoop
Storage:
- files in the filesystem mirror the layout of the FTP site
- (raw) log -> parser -> Flume -> HDFS
- block size: 128M
- does this make looking through files slow?
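The mirroring idea in the Storage notes can be sketched as a path mapping. This is only an illustration: the function name `mirrored_path`, the host `ftp.example.org`, and the storage root `/data/logs` are all hypothetical and do not come from the meeting.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def mirrored_path(ftp_url, storage_root):
    """Map a log URL on the FTP site to the same relative path under storage_root.

    Both arguments are illustrative; the notes do not name the real roots.
    """
    # Keep only the path part of the URL and make it relative.
    relative = urlparse(ftp_url).path.lstrip("/")
    return str(PurePosixPath(storage_root) / relative)
```

For example, `mirrored_path("ftp://ftp.example.org/pub/builds/log.txt.gz", "/data/logs")` keeps the `pub/builds/log.txt.gz` part of the path and places it under the storage root.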
What do we want?
- we have a (proposed) schema
- we have a (proposed) REST interface
- (we should put this on a wiki page and move towards finalization)
Process:
- we supply a Python script (e.g. logparser)
- it is invoked on every log file
- output == what we want
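A minimal sketch of the Process steps, assuming logs are plain-text `*.txt` files under one directory and that failures appear on lines shaped like `TEST-UNEXPECTED-FAIL | test | message`. The regex, the record fields, and the function names are assumptions for illustration; they are not the actual logparser code or the proposed schema.

```python
import json
import re
from pathlib import Path

# Assumed failure-line shape; the real parser rules are not given in these notes.
FAILURE_RE = re.compile(
    r"TEST-UNEXPECTED-(?P<status>\w+) \| (?P<test>[^|]+) \| (?P<message>.*)"
)


def parse_log(text):
    """Return one record (a dict) per unexpected-result line in a raw log."""
    records = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = FAILURE_RE.search(line)
        if match:
            records.append({
                "line": lineno,
                "status": match.group("status"),
                "test": match.group("test").strip(),
                "message": match.group("message").strip(),
            })
    return records


def process_tree(root):
    """Invoke parse_log on every *.txt file under root, yielding (path, records)."""
    for path in sorted(Path(root).rglob("*.txt")):
        yield str(path), parse_log(path.read_text(errors="replace"))


def dump_records(root):
    """Serialize each file's result as one JSON string (hypothetical output form)."""
    return [
        json.dumps({"log": path, "failures": records})
        for path, records in process_tree(root)
    ]
```

Calling `dump_records("some/log/dir")` returns one JSON string per log file, which a downstream step could then hand to storage.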