Address Book: Difference between revisions
Revisions compared: Andrew8192 (→Links to information: fix dead links; redirects to Internet Archive copies of pages) and Andrew8192 (cleanup)
Revision as of 21:02, 12 May 2014
Deconstructing the Thunderbird Address Book
There exists surprisingly little documentation on the format of the Thunderbird Address Book. Thunderbird stores Address Book Data (.mab files) and Mail Folder Summaries (.msf files) in a textual database format called "Mork", designed by David McCusker <davidmc@netscape.com>. Mork, unfortunately, is not a friendly format, and David is no longer working on it (nor does he care, apparently).
- Consider some of the comments found on the net:
- "When I opened my abook.mab file in vi, my heart sank. Inside, was an opaque mass of hex digits, parentheses, brackets and braces. I quickly threw my hands in the air and decided to hit Google up for some answers on how to process this stuff."
- "It is impossible for non-Mozilla programs to extract data from (the History or Address Book) because it uses Mork, which is — and I do not use these words lightly — the single most braindamaged file format that I have ever seen in my nineteen year career."
- "I have tried to write a parser for Mork in Perl, and it will never work right. The depths of depravity to which this format sinks are too great."
- "The original author not only hasn't worked on it in a long while, but doesn't care about it. He also admits that it is undocumented, and that he was never asked for such."
- "In brief, let's count its (Mork's) sins:
  - Two different numerical namespaces that overlap.
  - It can't decide what kind of character-quoting syntax to use: Backslash? Hex encoding with dollar-sign?
  - C++ line comments are allowed sometimes, but sometimes // is just a pair of characters in a URL.
  - It goes to all this serious compression effort (two different string-interning hash tables) and then writes out Unicode strings without using UTF-8: writes out the unpacked wchar_t characters!
  - Worse, it hex-encodes each wchar_t with a 3-byte encoding, meaning the file size will be 3x or 6x (depending on whether wchar_t is 2 bytes or 4 bytes.)
  - It masquerades as a "textual" file format when in fact it's just another binary-blob file, except that it represents all its magic numbers in ASCII. It's not human-readable, it's not hand-editable, so the only benefit there is to the fact that it uses short lines and doesn't use binary characters is that it makes the file bigger. Oh wait, my mistake, that isn't actually a benefit at all."
 
 
Suffice it to say, Mork is not a human-friendly format. See the Examples of various Address Book formats section for more information.
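To make the quoting complaint above concrete, here is a minimal Python sketch of undoing the two escape styles mentioned in the comments: backslash quoting and dollar-sign hex encoding. It is not a Mork parser. The function name `decode_mork_value`, the treatment of backslash-newline as a line continuation, and the default UTF-8 charset are all assumptions made for illustration; real files may need a different charset.

```python
import re

# Assumed escape forms inside a Mork value (illustrative, not a full grammar):
#   backslash + newline  -> line continuation, dropped
#   backslash + any char -> that char taken literally
#   $XX (two hex digits) -> the byte with that value
_ESCAPE = re.compile(rb"\\\r?\n|\\(.)|\$([0-9A-Fa-f]{2})", re.DOTALL)


def decode_mork_value(raw: bytes, encoding: str = "utf-8") -> str:
    """Undo backslash and $XX escapes, then decode the bytes as text."""

    def replace(match):
        if match.group(1) is not None:
            # Backslash-quoted character: keep it as-is.
            return match.group(1)
        if match.group(2) is not None:
            # $XX: turn the two hex digits into a single byte.
            return bytes([int(match.group(2), 16)])
        # Backslash-newline continuation: contributes nothing.
        return b""

    return _ESCAPE.sub(replace, raw).decode(encoding, errors="replace")


if __name__ == "__main__":
    print(decode_mork_value(b"caf$C3$A9"))  # café
    print(decode_mork_value(b"a\\)b"))      # a)b
```

Even this small piece has to decide which escape wins when the two styles meet (here, a backslash before `$` makes the dollar sign literal), which hints at why the commenters found the format so hard to parse.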
The good news
The good news is that the Thunderbird Address Book importers do a fairly good job of importing the well-documented LDIF format. There also exists a web-based PHP script to convert vCards to LDIF format (http://labs.brotherli.ch/vcfconvert/). I've done some basic testing exporting vCards from the OS X Address Book, converting them to LDIF, then importing them into Thunderbird, and have not run into any serious problems, though some data are lost along the way. Thunderbird can also import from CSV format, though this process is not nearly as easy (from the user perspective) as importing from LDIF.
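For readers who want to produce LDIF themselves rather than go through a converter, the following Python sketch writes one contact as an LDIF entry. The function names, the choice of attributes (`givenName`, `sn`, `cn`, `mail`), and the list of `objectclass` values are assumptions modeled on common LDAP person entries, not a documented Thunderbird requirement. Values that are not plain ASCII are base64-encoded with the `::` separator, as LDIF requires.

```python
import base64


def _ldif_line(attr: str, value: str) -> str:
    """Format one attribute line, base64-encoding values LDIF cannot hold raw."""
    safe = (
        value.isascii()
        and not value.startswith((" ", ":", "<"))
        and not value.endswith(" ")
        and not any(c in value for c in "\r\n\0")
    )
    if safe:
        return f"{attr}: {value}"
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"{attr}:: {encoded}"


def contact_to_ldif(given_name: str, surname: str, email: str) -> str:
    """Return a single LDIF entry, terminated by a blank line."""
    full_name = f"{given_name} {surname}".strip()
    lines = [
        # Assumed DN layout; commas inside names are not escaped in this sketch.
        _ldif_line("dn", f"cn={full_name},mail={email}"),
        "objectclass: top",
        "objectclass: person",
        "objectclass: organizationalPerson",
        "objectclass: inetOrgPerson",
        _ldif_line("givenName", given_name),
        _ldif_line("sn", surname),
        _ldif_line("cn", full_name),
        _ldif_line("mail", email),
    ]
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(contact_to_ldif("Ada", "Lovelace", "ada@example.com"), end="")
```

Concatenating the output of `contact_to_ldif` for several contacts gives a file that can be tried with the LDIF importer. The sketch does not fold long lines or escape special characters in the DN, so it is only a starting point.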
External links
Some links to documentation or information about the Mork format:
- The Bugzilla Bug, from which some of the above comments are taken.
- The Hard Way, from which other comments above are taken. (archived)
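- A brief primer (https://web.archive.org/web/20080201004840/http://www.mozilla.org/mailnews/arch/mork/primer.txt) on the Mork text format. (archived)
- The Mozilla vCard Project (http://vcard.mozdev.org/), stalled since 2005.
- Wikipedia article on Mork (http://en.wikipedia.org/wiki/Mork_%28file_format%29)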