Revision as of 06:06, 19 June 2011
PDF.js is an HTML5-based Portable Document Format renderer.
Project Manager: Pascal Finette
Developers: Andreas Gal (part-time), Chris Jones (part-time), Vivien Nicolas (part-time), Shaon Barman
Repository: https://github.com/andreasgal/pdf.js
IRC: #pdfjs on irc.mozilla.org
Milestone: Big-splash demo
Will probably be a pixel-perfect rendering of the TraceMonkey paper, with nontrivial UI (i.e. eye candy).
Pixel-perfect rendering
- Type1 fonts
- Bitmaps and SMask blending
- canvas.setDash()
- even-odd fills
- axial shading
- TTF fonts (pass the sanitizer)
Non-trivial UI
- zooming
- pre-rendering pages
- "continuous" scrolling
Milestone: PDF.js Firefox extension 1.0
Minimum Feature Set
Schedule
(TODO)
Backend
- zooming
  - the general idea is that the UI will set a zoom factor, say 200%
  - we'll redraw the canvas, but with a scale transform to 2x, and a translation set to move the content we want to fill the screen to the top-left
- draw subpage
- SVG backend
- linearization
  - byte range requests
- hyperlinks (hash URLs, intra-doc links)
- perf (use workers for some stuff?)
- color spaces (big, pervasive)
- build something like gecko's display list, for hit testing
  - click-on-link (easy)
  - text selection (hard)
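A rough sketch of the zooming idea from the list above: turn the UI's zoom factor into a scale plus a translation that moves the content of interest to the top-left. The helper names `zoomTransform` and `applyTransform` are made up for illustration and are not part of pdf.js.

```javascript
"use strict";

// Hypothetical helper: compute the canvas transform for a zoom factor
// (in percent) so that the page point (focusX, focusY) lands at the
// canvas origin (top-left).
function zoomTransform(zoomPercent, focusX, focusY) {
  var scale = zoomPercent / 100;
  return {
    scale: scale,
    // The translation is in device pixels, i.e. applied after scaling.
    translateX: -focusX * scale,
    translateY: -focusY * scale
  };
}

// Map a page-space point through the transform. This mirrors what
// ctx.setTransform(scale, 0, 0, scale, translateX, translateY) does
// to coordinates on a 2D canvas context.
function applyTransform(t, x, y) {
  return {
    x: x * t.scale + t.translateX,
    y: y * t.scale + t.translateY
  };
}

var t = zoomTransform(200, 50, 80);
var origin = applyTransform(t, 50, 80);
var nearby = applyTransform(t, 60, 90);
```

On a real canvas, the page would then be redrawn after calling `ctx.setTransform(t.scale, 0, 0, t.scale, t.translateX, t.translateY)`.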
UI
- animations (page flip, etc.)
- hyperlinks
- page transitions
- dual-page display
- page-transition animations
- <s>pan/zoom/next/prev gestures</s> (Edit: Felipe Gomes and cjones discussed a better way to support these, but it will require new web APIs)
Platform
- TextMetrics.maxHeight (to compute more accurate bounding boxes; can approximate without this, though)
- implement text selection in SVG documents
- (determine extent of SVG a11y implementation, if any)
Testing
- reftest-style harness, compare hand-written PDF commands to hand-written canvas (?)
- compare to poppler output, keep list of differences
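One possible core for a reftest-style harness is a pixel comparison between a rendered page and a reference image. The sketch below assumes both images are available as flat RGBA arrays of equal size (for example the `.data` of `ctx.getImageData()`). The function name and the `tolerance` parameter are illustrative assumptions.

```javascript
"use strict";

// Count the pixels that differ between two RGBA buffers of equal size.
// `tolerance` is a hypothetical per-channel fuzz allowance.
function countDifferentPixels(actual, expected, tolerance) {
  if (actual.length !== expected.length) {
    throw new Error("image sizes differ");
  }
  var different = 0;
  for (var i = 0; i < actual.length; i += 4) {
    for (var c = 0; c < 4; c++) {
      if (Math.abs(actual[i + c] - expected[i + c]) > tolerance) {
        different++;
        break; // count each pixel at most once
      }
    }
  }
  return different;
}

// Two 2-pixel images: the first pixel is off by 1 in red,
// the second is off by 9 in blue.
var reference = [255, 0, 0, 255, 0, 0, 0, 255];
var rendered = [254, 0, 0, 255, 0, 0, 9, 255];

var strictDiff = countDifferentPixels(rendered, reference, 0);
var fuzzyDiff = countDifferentPixels(rendered, reference, 1);
```

A "list of differences" against another renderer could record the nonzero counts per page.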
Analysis
- dump stream info
- dump font info
- dump raster image info
Big project: Color spaces
Approach: map input color values (fillcolor, strokecolor etc.) to output color space. Map input bitmaps to output space with SVG color-matrix filter/WebGL shader program/hand-written JS as available. Problem: will this work correctly for interpolated color values, like intermediate colors in a gradient, and other computed values like the result of composition operators? Does canvas need color-space support? Do we care enough? (What do other PDF renderers do?)
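To make the interpolation question concrete, here is a small sketch using a naive DeviceCMYK to RGB formula. This is an assumption for illustration only, not ICC-accurate and not necessarily what pdf.js would use. Because the mapping is not linear, mapping the endpoints and then interpolating gives a different midpoint than interpolating in the input space and then mapping.

```javascript
"use strict";

// Naive DeviceCMYK -> RGB mapping, all components in [0, 1].
// Illustrative assumption only; real color management is more involved.
function cmykToRgb(c, m, y, k) {
  return [(1 - c) * (1 - k), (1 - m) * (1 - k), (1 - y) * (1 - k)];
}

// Component-wise linear interpolation between two color tuples.
function lerp(a, b, t) {
  return a.map(function (v, i) {
    return v + (b[i] - v) * t;
  });
}

var start = [0, 0, 0, 0]; // no ink: white
var end = [1, 0, 0, 1];   // full cyan plus full black

// Interpolate in CMYK, then map the midpoint to RGB.
var mixedThenMapped = cmykToRgb.apply(null, lerp(start, end, 0.5));

// Map the endpoints to RGB, then interpolate (what an RGB gradient does).
var mappedThenMixed = lerp(cmykToRgb.apply(null, start),
                           cmykToRgb.apply(null, end), 0.5);
```

Here the two midpoints differ in the red channel (0.25 versus 0.5), so a gradient converted only at its endpoints would not match one computed in the source color space.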
Big project: Hyperlinks
- Parse link data from PDF
- Add UI to highlight/set cursor on link hover
- Implement "go to point X in page Y" interface in backend
- Figure out encoding scheme for absolute links, e.g. http://foo.com/bar.pdf#[encoded link]
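As a starting point for the encoding question, one could put a "go to point X in page Y" destination into the URL fragment. The `page`/`x`/`y` key names and both function names below are hypothetical; the wiki does not settle on a scheme.

```javascript
"use strict";

// Hypothetical fragment scheme: #page=<n>&x=<x>&y=<y>
function encodeDest(dest) {
  return "#page=" + dest.page + "&x=" + dest.x + "&y=" + dest.y;
}

// Parse a fragment produced by encodeDest back into a destination object.
function decodeDest(hash) {
  var dest = {};
  hash.replace(/^#/, "").split("&").forEach(function (pair) {
    var parts = pair.split("=");
    dest[parts[0]] = Number(parts[1]);
  });
  return dest;
}

var url = "http://foo.com/bar.pdf" + encodeDest({ page: 3, x: 72, y: 540 });
var dest = decodeDest(url.slice(url.indexOf("#")));
```

The decoded object is what a backend "go to" interface could consume. Intra-document links could use the same fragment without the absolute part.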
Big project: SVG backend
Most of SVG maps well to PDF (was influenced by?). There are existing PDF->SVG translators. Perf is the biggest concern. We want to build the SVG document in the background, without affecting main-thread interactivity. The way to do that is by building the document with a Web Worker thread. The problem is, Workers don't have access to any DOM APIs. We'll probably need to build the document as a string in the background, then send it over to the main thread for parsing.
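A minimal sketch of the "build a string in the background" idea: a pure function that serializes drawing operations to SVG markup without touching any DOM API, so it could run inside a Worker. The `ops` format and function names are invented for this example.

```javascript
"use strict";

// Escape text so it is safe inside XML content and attribute values.
function escapeXml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build an SVG document as a plain string. `ops` is a hypothetical,
// heavily simplified list of drawing operations.
function buildSvgString(width, height, ops) {
  var parts = ['<svg xmlns="http://www.w3.org/2000/svg" width="' + width +
               '" height="' + height + '">'];
  ops.forEach(function (op) {
    if (op.type === "rect") {
      parts.push('<rect x="' + op.x + '" y="' + op.y + '" width="' +
                 op.width + '" height="' + op.height + '"/>');
    } else if (op.type === "text") {
      parts.push('<text x="' + op.x + '" y="' + op.y + '">' +
                 escapeXml(op.text) + "</text>");
    }
  });
  parts.push("</svg>");
  return parts.join("");
}

var svg = buildSvgString(100, 50, [
  { type: "text", x: 10, y: 20, text: "a < b" }
]);

// In a worker, the string would be handed over with postMessage(svg).
// The main thread could then parse it, for example with
// new DOMParser().parseFromString(svg, "image/svg+xml").
```

Parsing still happens on the main thread in this design, so its cost would need measuring separately.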
Big project: Text selection
- Option 1: In SVG backend
  - Draw to canvas first. On first selection, switch to SVG-rendered content.
  - Let Gecko do all text selection in SVG document
- Option 2: In canvas backend
  - Build data structure representing text drawn to screen (e.g., display list/BSP/etc.). For best results, collapse adjacent and same-height/width "text runs".
  - Walk data structure and compute textruns at a particular point and/or within a bounding box
  - Add UI for "highlighted" text above PDF and saving selected text to clipboard
  - Corner cases: clipped text, occluded text, non-white backgrounds, non-black text
  - Maybe: render without display-list building first, then on first selection re-interpret PDF to build display list. Or pre-build display list in the background.
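For the canvas-backend option, the data structure could start out as a flat list of text runs that is scanned linearly. The sketch below assumes a run is a record `{ x, y, width, height, text }`. All names are illustrative, and a BSP or similar index could replace the linear scan later.

```javascript
"use strict";

// Collapse runs that sit on the same line (same y and height) and touch
// horizontally. Assumes runs arrive in drawing order, left to right.
function collapseRuns(runs) {
  var out = [];
  runs.forEach(function (run) {
    var last = out[out.length - 1];
    if (last && last.y === run.y && last.height === run.height &&
        last.x + last.width === run.x) {
      last.width += run.width;
      last.text += run.text;
    } else {
      out.push({ x: run.x, y: run.y, width: run.width,
                 height: run.height, text: run.text });
    }
  });
  return out;
}

// Hit test: all runs whose bounding box contains the point.
function runsAtPoint(runs, px, py) {
  return runs.filter(function (r) {
    return px >= r.x && px < r.x + r.width &&
           py >= r.y && py < r.y + r.height;
  });
}

// Selection: all runs whose bounding box overlaps the given box.
function runsInBox(runs, box) {
  return runs.filter(function (r) {
    return r.x < box.x + box.width && box.x < r.x + r.width &&
           r.y < box.y + box.height && box.y < r.y + r.height;
  });
}

var displayList = collapseRuns([
  { x: 0, y: 0, width: 30, height: 10, text: "Trace" },
  { x: 30, y: 0, width: 40, height: 10, text: "Monkey" },
  { x: 0, y: 12, width: 25, height: 10, text: "JIT" }
]);

var hit = runsAtPoint(displayList, 50, 5);
var selected = runsInBox(displayList, { x: 0, y: 8, width: 10, height: 6 });
```

This sketch ignores the corner cases listed above (clipped or occluded text), which would need extra per-run state.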
Big project: Accessibility
Kind of like text selection, except there's no web-visible accessibility API we could hook with canvas. So:
- Somehow detect that a11y is enabled, permanently switch to SVG backend
- Let Gecko implement a11y interfaces
(Possibly) Big project: Vertical text
Somewhat pervasive mode switch in text-drawing code. Is it just a matter of transform hackery to put glyphs in the right place, or do we need canvas support? Canvas support might be a big project.
Utils
To uncompress a PDF
- install pdftk (http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/)
- run: pdftk foo.pdf output uncompressed.foo.pdf uncompress
Coding Style
- add a "use strict"; statement (exactly that!) to the top of your JS files
- 2 spaces for indentation. (sbarman: it seems like it's 4 currently in pdf.js) (cjones: we're going to fix pdf.js after Type1 fonts merge)
- Line breaks are free (I promise); don't hesitate to use them to separate logical blocks inside your functions.
- Adding a toString method to an object to print information about that object to the console is helpful when debugging.
- Be sure to declare a variable with 'var' before using it; you don't want to be hurt by random variables living in the global scope.
- Files are named like_this.js.
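A short file following the conventions above might look like the sketch below. `PageBox` and `totalArea` are made-up names used only to show the style (strict mode, 2-space indentation, `var` declarations, blank lines between logical blocks, and a `toString` for debugging).

```javascript
"use strict";

// Hypothetical object, used only to illustrate the conventions.
function PageBox(x, y, width, height) {
  this.x = x;
  this.y = y;
  this.width = width;
  this.height = height;
}

// A toString makes console output readable while debugging.
PageBox.prototype.toString = function () {
  return "PageBox(" + this.x + ", " + this.y + ", " +
         this.width + "x" + this.height + ")";
};

function totalArea(boxes) {
  var total = 0;
  var count = boxes.length;

  for (var i = 0; i < count; i++) {
    var box = boxes[i];
    total += box.width * box.height;
  }

  return total;
}

var boxes = [new PageBox(0, 0, 10, 20), new PageBox(5, 5, 2, 3)];
var description = String(boxes[0]);
var area = totalArea(boxes);
```

Following the naming rule, such a file could be saved as page_box.js.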
Useful resources:
- https://developer.mozilla.org/En/Developer_Guide/Coding_Style#General_Practices
- https://developer.mozilla.org/En/Developer_Guide/Coding_Style#JavaScript_Practices
- https://developer.mozilla.org/En/Developer_Guide/Coding_Style#Naming_and_Formatting_code
Also some particular points (sentence stolen from https://developer.mozilla.org/en/JavaScript_style_guide)
- Don't use object methods and properties more than you have to. It is often faster to store the result in a temporary variable.
If you have to do DOM manipulations (hopefully not!):
- Don't call getAttribute to see if an attribute exists, call hasAttribute instead.
- Prefer to loop through childNodes rather than using first/lastChild with next/previousSibling. But prefer hasChildNodes() to childNodes.length > 0. Similarly prefer document.getElementsByTagName(aTag).item(0) != null to document.getElementsByTagName(aTag).length > 0.
Review (aka pull-request) policy
NB: this isn't being enforced yet
- New code has to pass all tests (FORTHCOMING)
- New code can't regress performance on (TBD) as measured by (TBD), unless it implements a new feature major enough to justify a temporary perf regression. This is up to common sense.
- Major new features should have architectural review from (TBD). Less major patches can be reviewed by (TBD).