Firefox/Input/Data: Difference between revisions

Fix parser link
Line 11: Line 11:
The data is a UTF-8 encoded Unicode stream. Lines (= records) are separated by LF (newline, U+000A). There are no header/title records. Fields (= columns) are separated by TAB (U+0009), so TAB and LF characters inside fields need escaping: each is preceded by a backslash (U+005C). This in turn means that backslashes inside fields are themselves escaped with a backslash.
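
The escaping rules described above can be sketched in Python as follows. This is a minimal illustration, not the code used by the linked parser or exporter: the function names `escape_field` and `parse_records` are hypothetical, and the sketch assumes that a backslash always makes the next character literal and that a dangling backslash at the end of the input is silently dropped.

```python
def escape_field(value):
    """Escape one field: backslash first, then TAB and LF (hypothetical helper)."""
    return (
        value.replace("\\", "\\\\")   # backslash -> backslash backslash
             .replace("\t", "\\\t")   # TAB       -> backslash TAB
             .replace("\n", "\\\n")   # LF        -> backslash LF
    )


def parse_records(text):
    """Yield each record as a list of unescaped fields (hypothetical helper).

    Works character by character, because an escaped LF inside a field
    must not be mistaken for a record separator.
    """
    record, field, escaped = [], [], False
    for ch in text:
        if escaped:
            # Assumption: whatever follows a backslash is taken literally.
            field.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == "\t":
            # Unescaped TAB ends the current field.
            record.append("".join(field))
            field = []
        elif ch == "\n":
            # Unescaped LF ends the current field and the record.
            record.append("".join(field))
            field = []
            yield record
            record = []
        else:
            field.append(ch)
    # Emit a final record that is not terminated by LF.
    if field or record:
        record.append("".join(field))
        yield record
```

For example, `"\t".join(escape_field(f) for f in fields) + "\n"` would produce one record, and passing that string to `parse_records` should give back the original list of fields. Decoding the byte stream as UTF-8 before parsing is left to the caller.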


* [https://github.com/michaelku/grouper-worker/blob/488f1385fe5a1865cfc423ce7bec25237b150bca/src/main/java/org/mozilla/grouper/input/TsvReader.java Example FSM] to parse input data
* [https://github.com/mozilla-metrics/grouperfish/blob/master/src/main/java/com/mozilla/grouperfish/input/TsvReader.java Example FSM] to parse input data
* [https://github.com/fwenzel/reporter/blob/master/apps/api/cron.py Python exporter] that generates the input data

