PDF.js/EMBED

EMBED JPEG Marker for Adobe CMYK Images

(Draft Recommendation)


Background

This page is a specification for the custom EMBED JPEG marker. Most JPEG images store color information in the YCbCr color model. To carry information based on other color models (e.g. RGB or CMYK), format vendors have to add extra information via JPEG extensions, namely application markers.

Adobe introduced such extensions via the APP14 'Adobe' marker (see Supporting the DCT Filters in PostScript Level 2, Technical Note #5116, Paragraph 18). The specification was not clear about the meaning of the CMYK component values: does 0 represent the absence of an ink or its full presence (CMYK is a subtractive model)? From http://www.jpegcameras.com/libjpeg/libjpeg-3.html:

"... it appears that Adobe Photoshop writes inverted data in CMYK JPEG files: 0 represents 100% ink coverage, rather than 0% ink as you'd expect. This is arguably a bug in Photoshop, but if you need to work with Photoshop CMYK files, you will have to deal with it in your application. We cannot "fix" this in the library by inverting the data during the CMYK<=>YCCK transform, because that would break other applications, notably Ghostscript. Photoshop versions prior to 3.0 write EPS files containing JPEG-encoded CMYK data in the same inverted-YCCK representation used in bare JPEG files, but the surrounding PostScript code performs an inversion using the PS image operator. I am told that Photoshop 3.0 will write uninverted YCCK in EPS/JPEG files, and will omit the PS-level inversion. (But the data polarity used in bare JPEG files will not change in 3.0.) In either case, the JPEG library must not invert the data itself, or else Ghostscript would read these EPS files incorrectly."

In a nutshell, Photoshop saves standalone JPEG images in the inverted-YCCK representation, whereas JPEG images embedded in PDF or EPS files are saved in the normal-YCCK representation. Popular libraries such as libjpeg, which supports the inverted-YCCK color model, do not plan to change this behavior and rely on readers/viewers to perform the conversion if necessary.
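The difference between the two polarities can be illustrated with a minimal sketch. It assumes already-decoded 8-bit CMYK samples, and the helper name invert_cmyk is hypothetical, not part of libjpeg or any viewer:

```python
def invert_cmyk(samples: bytes) -> bytes:
    """Flip the polarity of 8-bit CMYK samples: each value v becomes 255 - v."""
    return bytes(255 - v for v in samples)


# One pixel in the inverted (Photoshop-style) polarity, where 0 means 100% ink:
# full cyan coverage and no magenta, yellow, or black.
photoshop_pixel = bytes([0, 255, 255, 255])

# The same pixel in the normal polarity, where 255 means 100% ink.
normal_pixel = invert_cmyk(photoshop_pixel)
```

Applying the function twice returns the original samples, so a reader that guesses the polarity wrong produces a negative-looking image rather than losing data.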

Most web browsers use libjpeg as their JPEG decoding backend. That allows them to display CMYK files saved directly from Photoshop. However, images extracted from a container file (e.g. PDF) will not be displayed properly. Also, some operating systems, such as Mac OS, render any CMYK image as if it were in the normal-YCCK representation.

Introducing the EMBED marker

Let's introduce a custom marker that is placed into the JPEG data after it is extracted from the container file. It indicates to the viewer/browser that the image was originally present as part of a container file. The JPEG X'FFEC' marker (the APP12 marker) will have the following format:

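  • two bytes of the marker header: 0xFF, 0xEC;
  • two bytes of the marker data length: 0x06 for the six-byte payload (JPEG segment length fields conventionally also count their own two bytes, in which case the encoded value is 0x00, 0x08);
  • six bytes of the ASCII string 'EMBED' with trailing zero: 0x45, 0x4D, 0x42, 0x45, 0x44, 0x00.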

Those bytes can be inserted at any place where the JPEG format allows an application marker.
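As an illustration, the marker could be built, inserted, and detected roughly as follows. This is a sketch, not the implementation from the bug. It assumes the length field counts its own two bytes, as in other JPEG segments, so it is written as 0x00, 0x08, and it places the marker directly after the SOI marker as one of the allowed positions. The function names are hypothetical.

```python
import struct

SOI = b"\xff\xd8"             # JPEG start-of-image marker
APP12 = b"\xff\xec"           # marker code used for EMBED
EMBED_PAYLOAD = b"EMBED\x00"  # 0x45 0x4D 0x42 0x45 0x44 0x00


def build_embed_segment() -> bytes:
    # Assumption: the length covers the two length bytes plus the payload.
    length = 2 + len(EMBED_PAYLOAD)
    return APP12 + struct.pack(">H", length) + EMBED_PAYLOAD


def insert_embed_marker(jpeg: bytes) -> bytes:
    """Return a copy of `jpeg` with the EMBED segment right after SOI."""
    if not jpeg.startswith(SOI):
        raise ValueError("not a JPEG stream: missing SOI marker")
    return SOI + build_embed_segment() + jpeg[len(SOI):]


def has_embed_marker(jpeg: bytes) -> bool:
    """Walk the header segments and report whether EMBED is present."""
    if not jpeg.startswith(SOI):
        return False
    pos = len(SOI)
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker in (0xDA, 0xD9):  # SOS or EOI: no more header segments
            break
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if length < 2:              # malformed length field, stop scanning
            break
        payload = jpeg[pos + 4:pos + 2 + length]
        if marker == 0xEC and payload == EMBED_PAYLOAD:
            return True
        pos += 2 + length
    return False
```

A container reader would call insert_embed_marker on the extracted stream before handing it to the decoder. The scanner is deliberately simple and does not handle fill bytes between segments. For files that start with a JFIF APP0 segment, placing EMBED after that segment may be the safer choice.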

Work in Progress

The implementation and example files can be found in bug 674619.