Gecko:PortlandRendering: Difference between revisions

From MozillaWiki
Revision as of 02:29, 27 November 2014

What Gecko does

  • On every paint, on the main thread:
    • Build fresh display list (tree) for entire window (nsFrame::BuildDisplayList)
    • Analyze display list (FrameLayerBuilder)
      • Build layers for some items
      • Complex heuristics to choose resolution
      • Accumulate other items into PaintedLayers (buffers)
      • Simple PaintedLayers (solid colors, single image) optimized to ColorLayer/ImageLayer
      • Items assigned to PaintedLayers based on "animated geometry roots" (items are placed in the same PaintedLayer when they move together via scrolling etc.)
      • Choose resolutions for PaintedLayers
      • Recycle existing layers when possible
      • Deem some layers "inactive", in which case we create them but rasterize and composite them into PaintedLayers using the CPU
    • Track changes to items in PaintedLayers to compute precise invalid areas (DLBI)
    • Paint invalid areas of PaintedLayers (rasterizing on main thread)
    • Send all layer changes to compositor
  • Pro: Very flexible layer assignment (full CSS compliance)
  • Pro: Highly optimized layer tree (memory usage and component-alpha avoidance)
  • Pro: Very precise invalidation; minimal invalidation cost during reflow
  • Con: high overhead for small paints (e.g. MazeSolver), scaling up as displayports grow
  • Con: can trigger invalidation of scrolled content
  • Con: complex
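
The DLBI step above can be illustrated with a rough sketch: diff the display items of two consecutive paints and collect the rectangles that need repainting. This is only an illustration, not Gecko code. The `Item` tuple, its `key` scheme, and `compute_invalid_rects` are hypothetical names, and real DLBI works with regions and per-item geometry rather than plain rectangles.

```python
from typing import List, NamedTuple, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


class Item(NamedTuple):
    key: str      # hypothetical stable identity, e.g. frame plus item type
    bounds: Rect  # where the item paints
    content: str  # stand-in for anything else that affects rendering


def compute_invalid_rects(old: List[Item], new: List[Item]) -> List[Rect]:
    """Return the rectangles to repaint when going from `old` to `new`."""
    old_by_key = {item.key: item for item in old}
    new_keys = {item.key for item in new}
    invalid: List[Rect] = []

    for item in new:
        prev = old_by_key.get(item.key)
        if prev is None:
            # Item was added: repaint where it now appears.
            invalid.append(item.bounds)
        elif prev != item:
            # Item moved or changed: repaint the old area, and the new
            # area too if it differs from the old one.
            invalid.append(prev.bounds)
            if item.bounds != prev.bounds:
                invalid.append(item.bounds)

    for item in old:
        if item.key not in new_keys:
            # Item was removed: repaint what it used to cover.
            invalid.append(item.bounds)

    return invalid
```

Unchanged items contribute nothing, which is what makes the invalidation precise. The cost is that both lists have to exist in full on every paint, which matches the "high overhead for small paints" con.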

What Blink does (in Gecko terms)

  • At frame construction time, assign layers to frames
  • During reflow and at other times, accumulate invalid regions via explicit invalidation
  • On every paint, on the main thread:
    • Repaint invalid areas using Skia
  • On another Skia thread:
    • Rasterize paint commands
  • Pro: low overhead for small paints
  • Pro: no invalidation of scrolled content
  • Pro: simpler
  • Con: unfixable CSS rendering bugs
  • Con: less efficient layer trees generated
  • Con: less precise invalidation in some situations
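
As a minimal sketch of the explicit-invalidation model described above: each layer accumulates an invalid area as callers report damage, and the paint step consumes it. The `Layer` class and its methods are hypothetical, and accumulating into a single bounding rectangle is an assumption made here to keep the example short; it is not a claim about how Blink stores invalid regions.

```python
from typing import Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), half-open


def union(a: Rect, b: Rect) -> Rect:
    """Bounding box of two rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


class Layer:
    """Hypothetical layer that is assigned to a frame up front."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.invalid: Optional[Rect] = None

    def invalidate(self, rect: Rect) -> None:
        # Called explicitly during reflow, style changes, and so on.
        self.invalid = rect if self.invalid is None else union(self.invalid, rect)

    def take_invalid(self) -> Optional[Rect]:
        # Called at paint time: return the area to repaint and reset it.
        rect, self.invalid = self.invalid, None
        return rect
```

Paint cost here depends only on how much was invalidated, not on the size of the window. The trade-off is precision: two small invalidations in opposite corners of a layer collapse into one large rectangle in this sketch.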

Where Blink is going: "Slimming Paint"

  • At paint time, generate display list and send it to the compositor
    • Actually a flat list with start/end markers for container items, in paint order
    • Generic "ContentDisplayItem" contains an SkPicture with drawing commands
    • Lists split by type with indices into the ContentDisplayItem list to track scope of effects
    • Incremental list updates will be supported
    • List includes hints from layout
  • In compositor
    • Group display items into layers
    • Render
    • Invalidation calculation; still crude compared to DLBI
    • Since compositor makes layerization decisions, it handles inactive layers
  • Optimizing SkPicture for small lists of drawing commands
    • Each SkPicture has relatively high overhead (88/136 bytes)
    • May end up not using SkPicture
  • Pro: may have low overhead for small paints
    • Incremental update of display lists is a struggle.
    • You can skip processing frame subtrees that are stacking contexts and have no invalid frames
    • You can skip processing of display items that don't intersect invalid frames
  • Pro: more efficient layer trees
  • Pro: CSS compliance
  • Con: moving towards Gecko's current scheme in complexity
  • Con: moves work from layout thread to compositor thread. This could be bad, especially in multiprocess
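
To make the "flat list with start/end markers" idea concrete, here is a small sketch that rebuilds the nesting from such a list using a stack. The `("begin" | "end" | "draw", name)` entry format and the `nest` function are invented for this illustration and do not reflect Blink's actual display item types.

```python
from typing import Any, List, Tuple

Entry = Tuple[str, str]  # (kind, name) with kind in {"begin", "end", "draw"}


def nest(flat: List[Entry]) -> List[Any]:
    """Turn a flat, paint-ordered list into nested (name, children) containers."""
    root: List[Any] = []
    stack = [root]  # stack[-1] is the child list of the innermost open container
    for kind, name in flat:
        if kind == "begin":
            children: List[Any] = []
            stack[-1].append((name, children))
            stack.append(children)
        elif kind == "end":
            if len(stack) == 1:
                raise ValueError("end marker without matching begin: " + name)
            stack.pop()
        elif kind == "draw":
            stack[-1].append(name)
        else:
            raise ValueError("unknown entry kind: " + kind)
    if len(stack) != 1:
        raise ValueError("unclosed container at end of list")
    return root
```

The flat form is cheap to append to and to send across threads, and a consumer such as the compositor can recover the scope of each effect in a single pass, as above.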

http://dev.chromium.org/blink/slimming-paint
https://docs.google.com/document/d/1L6vb9JEPFoyt6eNjVla2AbzSUTGyQT93tQKgE3f1EMc/edit#
https://docs.google.com/presentation/d/1zpGlx75eTNILTGf3s_F6cQP03OGaN2-HACsZwEobMqY/view#slide=id.g40cfae859_045

Where Gecko should go (speculative proposal)

  • Recap: our main problems are:
    • Performance, especially for small incremental updates
    • Complexity
    • Invalidation of painted layers
  • Identify container layers during frame construction and store this as frame state (like old Blink)
    • Potential problem with merging display items for continuation frames of the same element
    • Not too difficult, associate state with first-in-flow
    • I don't think making container-layerization decisions in the compositor makes sense
    • Track layer activity status in frame tree too
      • Computing layer activity in the compositor makes more sense, but I still think it's OK not to do it there
      • will-change anti-abuse measures require some work
      • Need to eliminate BasicLayers component-alpha-avoidance flattening pass
    • With this change, painting and remaining layerization problems can be handled separately per active container layer
  • Set invalid bits on frames when repainting is needed (can be coarse)
  • At paint time, on the main thread, identify active container layers whose contents need to be updated. For each one:
    • Its child active container layers can be obtained from the frame tree, so we only care about descendant content without its own active container layer
    • For each relevant animated geometry root, split its coordinate space into tiles!
    • Create an invalid region
      • For every invalid frame, the tiles of its animated geometry root that it intersects are marked for update
      • The invalid region is the union of all invalid tiles across all animated geometry roots
    • Walk the frame tree to build a display list for the region
      • Only the container layer's frame subtree
      • Skipping frames entirely outside the region or with their own container layers
    • Maintain invalidation state for each PaintedLayer x tile combination (PaintedTile)
    • Walk the display list to assign each display item to a PaintedLayer
      • And run DLBI at the same time
      • But limit invalidation to the tiles marked for update
    • Repaint the invalid area of each PaintedLayer
  • Pro: Preserves flexible, optimized layer assignment and precise invalidation
  • Pro: Layer assignment on the content thread, which seems simpler and more scalable
  • Pro: Can be implemented by evolving current code incrementally
  • Pro: More modular design than the current code
  • Pro: Painting overhead proportional to the number of invalid tiles
  • Con: compared to Slimming Paint, less parallelism (busier main thread)
  • How much work to offload from the main thread is a key issue. Offloading frees up the main thread for parallelism but has more overhead (both in performance and in code).
    • Currently I feel keeping the compositor thread lean is very important, and a three-thread design would be even more overhead and complexity.
    • I'm comfortable with carrying on doing DL and Layer building on the main thread.
    • Maybe offload rasterization to another thread, but that's compatible with this.
  • Needs a cool name
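
The tile-marking step of the proposal can be sketched roughly as follows. Everything here is an assumption made for illustration: the 256-pixel tile size, the `(agr_id, rect)` input format, and the function names. Each rectangle is taken to be already expressed in the coordinate space of its animated geometry root.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)
Tile = Tuple[int, int]            # (tile column, tile row)

TILE_SIZE = 256  # assumed tile size in pixels


def tiles_for_rect(rect: Rect, tile: int = TILE_SIZE) -> Set[Tile]:
    """Return every tile that the rectangle intersects."""
    x, y, w, h = rect
    if w <= 0 or h <= 0:
        return set()
    first_col, first_row = x // tile, y // tile
    last_col, last_row = (x + w - 1) // tile, (y + h - 1) // tile
    return {
        (col, row)
        for col in range(first_col, last_col + 1)
        for row in range(first_row, last_row + 1)
    }


def mark_invalid_tiles(
    invalid_frames: Iterable[Tuple[str, Rect]], tile: int = TILE_SIZE
) -> Dict[str, Set[Tile]]:
    """Map each animated geometry root to the set of its tiles marked for update."""
    marked: Dict[str, Set[Tile]] = defaultdict(set)
    for agr_id, rect in invalid_frames:
        marked[agr_id] |= tiles_for_rect(rect, tile)
    return dict(marked)
```

Because invalid frames only set tile bits, invalidation during reflow can stay coarse and cheap, and the later display list walk is bounded by the number of marked tiles rather than by the size of the window. The sketch keeps tiles grouped per animated geometry root; combining them into one invalid region would additionally require mapping each root's tiles into a common coordinate space.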