Gecko:PortlandRendering
Revision as of 02:29, 27 November 2014
What Gecko does
- On every paint, on the main thread:
  - Build a fresh display list (tree) for the entire window (nsFrame::BuildDisplayList)
  - Analyze the display list (FrameLayerBuilder)
    - Build layers for some items
      - Complex heuristics to choose resolution
    - Accumulate other items into PaintedLayers (buffers)
      - Simple PaintedLayers (solid colors, single image) are optimized to ColorLayer/ImageLayer
      - Items are assigned to PaintedLayers based on "animated geometry roots" (items are placed in the same PaintedLayer when they move together via scrolling etc.)
      - Choose resolutions for PaintedLayers
    - Recycle existing layers when possible
    - Deem some layers "inactive": we still create them, but rasterize and composite them into PaintedLayers on the CPU
    - Track changes to items in PaintedLayers to compute precise invalid areas (display-list-based invalidation, DLBI)
  - Paint invalid areas of PaintedLayers (rasterizing on the main thread)
  - Send all layer changes to the compositor
- Pro: Very flexible layer assignment (full CSS compliance)
- Pro: Highly optimized layer tree (memory usage and component-alpha avoidance)
- Pro: Very precise invalidation; minimal invalidation cost during reflow
- Con: high overhead for small paints (MazeSolver), scaling up as displayports grow
- Con: can trigger invalidation of scrolled content
- Con: complex
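To make the DLBI step above a bit more concrete, here is a minimal Python sketch of invalidating by diffing display items. It is not the FrameLayerBuilder algorithm: `Rect`, `Item`, the string `key` and the `content` field are all hypothetical stand-ins, and the invalid area is returned as a plain list of rectangles rather than a real region type.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Rect:
    x: int
    y: int
    w: int
    h: int


@dataclass(frozen=True)
class Item:
    key: str       # hypothetical stable identity, e.g. "frame42/background"
    bounds: Rect   # where the item paints within its PaintedLayer
    content: str   # stand-in for anything that affects the item's pixels


def invalid_rects(old_items, new_items):
    """Diff two display lists and return the rects that need repainting.

    An item that was added, removed, moved or changed invalidates both its
    old bounds and its new bounds; unchanged items invalidate nothing.
    """
    old = {item.key: item for item in old_items}
    new = {item.key: item for item in new_items}
    rects = set()
    for key in old.keys() | new.keys():
        before, after = old.get(key), new.get(key)
        if before == after:
            continue  # identical geometry and content: nothing to repaint
        if before is not None:
            rects.add(before.bounds)
        if after is not None:
            rects.add(after.bounds)
    return sorted(rects)


old_list = [
    Item("a/background", Rect(0, 0, 10, 10), "red"),
    Item("b/text", Rect(20, 0, 10, 10), "hello"),
]
new_list = [
    Item("a/background", Rect(0, 0, 10, 10), "red"),   # unchanged
    Item("b/text", Rect(25, 0, 10, 10), "hello"),      # moved
    Item("c/border", Rect(0, 20, 5, 5), "black"),      # added
]
damage = invalid_rects(old_list, new_list)
```

The point of the sketch is the trade-off listed above: the result is precise (the unchanged background is never repainted), but computing it requires both complete display lists, which is where the overhead for small paints comes from.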
What Blink does (in Gecko terms)
- At frame construction time, assign layers to frames
- During reflow and at other times, accumulate invalid regions via explicit invalidation
- On every paint, on the main thread:
  - Repaint invalid areas using Skia
- On another Skia thread:
  - Rasterize paint commands
- Pro: low overhead for small paints
- Pro: no invalidation of scrolled content
- Pro: simpler
- Con: unfixable CSS rendering bugs
- Con: less efficient layer trees generated
- Con: less precise invalidation in some situations
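As a rough illustration of explicit invalidation, the Python sketch below has layout code call `invalidate()` on a layer whenever something changes, and has the paint step consume whatever was accumulated. The `Layer` class and the choice to accumulate a single bounding box are assumptions made for the example, not a description of Blink's actual data structures.

```python
class Layer:
    """Hypothetical layer that was assigned to a frame at construction time."""

    def __init__(self, name):
        self.name = name
        self.dirty = None  # accumulated bounding box (x0, y0, x1, y1), or None

    def invalidate(self, x, y, w, h):
        """Called explicitly during reflow etc. whenever pixels may change."""
        box = (x, y, x + w, y + h)
        if self.dirty is None:
            self.dirty = box
        else:
            d = self.dirty
            self.dirty = (
                min(d[0], box[0]),
                min(d[1], box[1]),
                max(d[2], box[2]),
                max(d[3], box[3]),
            )

    def take_dirty(self):
        """At paint time: return the area to repaint and reset the state."""
        dirty, self.dirty = self.dirty, None
        return dirty


layer = Layer("content")
layer.invalidate(0, 0, 10, 10)      # small change in the top-left corner
layer.invalidate(90, 90, 10, 10)    # small change in the bottom-right corner
to_repaint = layer.take_dirty()
nothing_left = layer.take_dirty()
```

Per-paint cost here depends only on what was invalidated, which matches the "low overhead for small paints" point. With the bounding-box assumption, two 10x10 changes in opposite corners dirty a 100x100 area, which is one way invalidation can end up less precise than a display-list diff.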
Where Blink is going: "Slimming Paint"
- At paint time, generate display list and send it to the compositor
  - Actually a flat list with start/end markers for container items, in paint order
  - Generic "ContentDisplayItem" contains an SkPicture with drawing commands
  - Lists split by type with indices into the ContentDisplayItem list to track scope of effects
  - Incremental list updates will be supported
  - List includes hints from layout
- In compositor
  - Group display items into layers
  - Render
  - Invalidation calculation; still crude compared to DLBI
  - Since compositor makes layerization decisions, it handles inactive layers
- Optimizing SkPicture for small lists of drawing commands
  - Each SkPicture has relatively high overhead (88/136 bytes)
  - May end up not using SkPicture
- Pro: may have low overhead for small paints
  - Incremental update of display lists is a struggle.
  - You can skip processing frame subtrees that are stacking contexts and have no invalid frames
  - You can skip processing of display items that don't intersect invalid frames
- Pro: more efficient layer trees
- Pro: CSS compliance
- Con: moving towards Gecko's current scheme in complexity
- Con: moves work from the layout thread to the compositor thread. This could be especially bad in multiprocess
http://dev.chromium.org/blink/slimming-paint
https://docs.google.com/document/d/1L6vb9JEPFoyt6eNjVla2AbzSUTGyQT93tQKgE3f1EMc/edit#
https://docs.google.com/presentation/d/1zpGlx75eTNILTGf3s_F6cQP03OGaN2-HACsZwEobMqY/view#slide=id.g40cfae859_045
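The "flat list with start/end markers" idea can be sketched in a few lines of Python. The tuple encoding (`"begin"`, `"end"`, `"draw"`) and the effect names are invented for the example and are not Slimming Paint's real item types; the sketch only shows how a consumer such as the compositor could recover the scope of each effect from a flat list in paint order.

```python
def effect_scopes(display_list):
    """Return (draw_item, active_effects) pairs for a flat display list.

    display_list is a sequence of (kind, name) tuples in paint order, where
    kind is "begin" or "end" for container markers and "draw" for content.
    """
    stack = []   # effects whose "begin" marker has been seen but not closed
    scoped = []
    for kind, name in display_list:
        if kind == "begin":
            stack.append(name)
        elif kind == "end":
            if not stack or stack[-1] != name:
                raise ValueError(f"unbalanced end marker: {name}")
            stack.pop()
        elif kind == "draw":
            scoped.append((name, tuple(stack)))
        else:
            raise ValueError(f"unknown item kind: {kind}")
    if stack:
        raise ValueError(f"unclosed containers: {stack}")
    return scoped


flat_list = [
    ("draw", "background"),
    ("begin", "opacity"),
    ("draw", "text"),
    ("begin", "clip"),
    ("draw", "image"),
    ("end", "clip"),
    ("end", "opacity"),
    ("draw", "border"),
]
scopes = effect_scopes(flat_list)
```

A flat encoding like this is easy to send across threads and to patch incrementally, but anything that needs the tree structure (such as grouping items into layers) has to rebuild it with a stack on the receiving side.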
Where Gecko should go (speculative proposal)
- Recap: our main problems are:
  - Performance, especially for small incremental updates
  - Complexity
  - Invalidation of painted layers
- Identify container layers during frame construction and store this as frame state (like old Blink)
  - Potential problem with merging display items for continuation frames of the same element
    - Not too difficult: associate the state with the first-in-flow
  - I don't think making container-layerization decisions in the compositor makes sense
- Track layer activity status in the frame tree too
  - Computing layer activity in the compositor makes more sense, but I still think it's OK not to do it there
  - will-change anti-abuse measures require some work
- Need to eliminate the BasicLayers component-alpha-avoidance flattening pass
- With this change, painting and the remaining layerization problems can be handled separately per active container layer
- Set invalid bits on frames when repainting is needed (can be coarse)
- At paint time, on the main thread, identify active container layers whose contents need to be updated. For each one:
  - Its child active container layers can be obtained from the frame tree, so we only care about descendant content without its own active container layer
  - For each relevant animated geometry root, split its coordinate space into tiles!
  - Create an invalid region
    - For every invalid frame, the tiles of its animated geometry root that it intersects are marked for update
    - The invalid region is the union of all invalid tiles across all animated geometry roots
  - Walk the frame tree to build a display list for the region
    - Only the container layer's frame subtree
    - Skipping frames entirely outside the region or with their own container layers
  - Maintain invalidation state for each PaintedLayer x tile combination (PaintedTile)
  - Walk the display list to assign each display item to a PaintedLayer
    - And run DLBI at the same time
    - But limit invalidation to the tiles marked for update
  - Repaint the invalid area of each PaintedLayer
- Pro: Preserves flexible, optimized layer assignment and precise invalidation
- Pro: Layer assignment stays on the content thread, which seems simpler and more scalable
- Pro: Can be implemented by evolving current code incrementally
- Pro: More modular design than the current code
- Pro: Painting overhead proportional to the number of invalid tiles
- Con: compared to Slimming Paint, less parallelism (busier main thread)
- How much work to offload from the main thread is a key issue. Offloading frees up the main thread for parallelism but has more overhead (both in performance and in code).
  - Currently I feel that keeping the compositor thread lean is very important, and that a three-thread design would add even more overhead and complexity.
  - I'm comfortable with carrying on doing display list and layer building on the main thread.
  - Maybe offload rasterization to another thread, but that's compatible with this.
- Needs a cool name
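The tile-marking step of the proposal can be sketched as follows. This is only an illustration in Python: the 256-pixel tile size, the string ids for animated geometry roots, and the `(root, x, y, w, h)` tuple format are all assumptions, and frame rectangles are taken to be already expressed in the coordinate space of their animated geometry root.

```python
TILE_SIZE = 256  # assumed tile edge length in pixels


def tiles_for_rect(x, y, w, h, tile=TILE_SIZE):
    """Return the set of (tx, ty) tile coordinates that a rect intersects."""
    if w <= 0 or h <= 0:
        return set()
    first_tx, first_ty = x // tile, y // tile
    last_tx, last_ty = (x + w - 1) // tile, (y + h - 1) // tile
    return {
        (tx, ty)
        for tx in range(first_tx, last_tx + 1)
        for ty in range(first_ty, last_ty + 1)
    }


def invalid_tiles(invalid_frames, tile=TILE_SIZE):
    """Mark tiles for update, grouped by animated geometry root.

    invalid_frames is an iterable of (root_id, x, y, w, h) tuples, one per
    frame whose invalid bit is set. The overall invalid region is the union
    of the marked tiles across all roots.
    """
    marked = {}
    for root_id, x, y, w, h in invalid_frames:
        marked.setdefault(root_id, set()).update(tiles_for_rect(x, y, w, h, tile))
    return marked


frames = [
    ("root", 10, 10, 20, 20),      # fits inside tile (0, 0)
    ("root", 250, 0, 10, 10),      # straddles tiles (0, 0) and (1, 0)
    ("scrollbox", 0, 300, 5, 5),   # a separately scrolling root, tile (0, 1)
]
marked = invalid_tiles(frames)
tile_count = sum(len(tiles) for tiles in marked.values())
```

Because the invalid bits can be coarse, rounding out to whole tiles loses little, and the later steps (display list building, DLBI, repainting) only have to look at `tile_count` tiles rather than the whole window.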