Gecko:PortlandRendering


What Gecko does

  • On every paint, on the main thread:
    • Build fresh display list (tree) for entire window (nsFrame::BuildDisplayList)
    • Analyze display list (FrameLayerBuilder)
      • Build layers for some items
      • Complex heuristics to choose resolution
      • Accumulate other items into PaintedLayers (buffers)
      • Simple PaintedLayers (solid colors, single image) optimized to ColorLayer/ImageLayer
      • Items assigned to PaintedLayers based on "animated geometry roots" (items placed in the same PaintedLayer when they move together via scrolling etc)
      • Choose resolutions for PaintedLayers
      • Recycle existing layers when possible
      • Deem some layers "inactive": we still create them, but rasterize and composite them into PaintedLayers on the CPU
    • Track changes to items in PaintedLayers to compute precise invalid areas (DLBI)
    • Paint invalid areas of PaintedLayers (rasterizing on main thread)
    • Send all layer changes to compositor
  • Pro: Very flexible layer assignment (full CSS compliance)
  • Pro: Highly optimized layer tree (memory usage and component-alpha avoidance)
  • Pro: Very precise invalidation; minimal invalidation cost during reflow
  • Con: high overhead for small paints (e.g. MazeSolver), scaling up as displayports grow
  • Con: can trigger invalidation of scrolled content
  • Con: complex
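
To make the DLBI step above concrete, here is a minimal sketch of display-list-based invalidation. It assumes each display item has a stable key and a bounding rect; the names Rect and diff_display_items are hypothetical, and the real implementation also has to notice content changes that leave the bounds untouched.

```python
from typing import Dict, List, NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def diff_display_items(old: Dict[str, Rect], new: Dict[str, Rect]) -> List[Rect]:
    """Return the rects to repaint, given item bounds from the previous and current paint.

    Hypothetical helper: items are matched by a stable key, which is an
    assumption of this sketch rather than a description of Gecko's code.
    """
    invalid: List[Rect] = []
    for key, old_bounds in old.items():
        new_bounds = new.get(key)
        if new_bounds is None:
            # Item disappeared: repaint where it used to be.
            invalid.append(old_bounds)
        elif new_bounds != old_bounds:
            # Item moved or resized: repaint both the old and the new area.
            invalid.append(old_bounds)
            invalid.append(new_bounds)
    for key, new_bounds in new.items():
        if key not in old:
            # Item appeared: repaint its area.
            invalid.append(new_bounds)
    return invalid
```

Items whose bounds are unchanged contribute nothing, which is why the invalid area can stay small even though the whole display list is rebuilt on every paint.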

What Blink does (in Gecko terms)

  • At frame construction time, assign layers to frames
  • During reflow and at other times, accumulate invalid regions via explicit invalidation
  • On every paint, on the main thread:
    • Repaint invalid areas using Skia
  • On another Skia thread:
    • Rasterize paint commands
  • Pro: low overhead for small paints
  • Pro: no invalidation of scrolled content
  • Pro: simpler
  • Con: unfixable CSS rendering bugs
  • Con: less efficient layer trees generated
  • Con: less precise invalidation in some situations
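
As an illustration of accumulating invalid regions via explicit invalidation, the sketch below keeps one pending invalid rect per layer. The Layer class and the choice of a single bounding rect are assumptions made for brevity, not a description of Blink's data structures; the bounding rect also shows how this style can be less precise than DLBI.

```python
from typing import Optional, Tuple

Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), hypothetical representation


def union(a: Rect, b: Rect) -> Rect:
    """Smallest rect containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


class Layer:
    """A layer assigned up front that accumulates explicit invalidations."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.invalid: Optional[Rect] = None

    def invalidate(self, rect: Rect) -> None:
        # Called during reflow and at other times; just grows the pending area.
        self.invalid = rect if self.invalid is None else union(self.invalid, rect)

    def take_invalid(self) -> Optional[Rect]:
        # Called at paint time: return the pending area and reset it.
        rect, self.invalid = self.invalid, None
        return rect
```

Two small invalidations at opposite corners, (0, 0, 10, 10) and (90, 90, 100, 100), accumulate to (0, 0, 100, 100), so the repaint covers far more than actually changed.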

Where Blink is going: "Slimming Paint"

  • At paint time, generate display list and send it to the compositor
    • Actually a flat list with start/end markers for container items, in paint order
    • Generic "ContentDisplayItem" contains an SkPicture with drawing commands
    • Lists split by type with indices into the ContentDisplayItem list to track scope of effects
    • Incremental list updates will be supported
    • List includes hints from layout
  • In compositor
    • Group display items into layers
    • Render
    • Invalidation calculation; still crude compared to DLBI
    • Since compositor makes layerization decisions, it handles inactive layers
  • Optimizing SkPicture for small lists of drawing commands
    • Each SkPicture has relatively high overhead (88/136 bytes)
    • May end up not using SkPicture
  • Pro: may have low overhead for small paints
    • However, incremental update of display lists is a struggle.
    • You can skip processing frame subtrees that are stacking contexts and have no invalid frames
    • You can skip processing of display items that don't intersect invalid frames
  • Pro: more efficient layer trees
  • Pro: CSS compliance
  • Con: moving towards Gecko's current scheme in complexity
  • Con: moves work from layout thread to compositor thread. This could be bad especially in multiprocess
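
The flat list with start/end markers can be sketched as follows. The tuple encoding and the content_scopes helper are hypothetical; the point is only that a consumer can recover which container items enclose each content item with a single pass and a stack.

```python
from typing import List, Tuple

# Each entry is (kind, name). "begin"/"end" bracket a container item and
# "content" is a leaf holding drawing commands. This encoding is an assumption.
Item = Tuple[str, str]


def content_scopes(items: List[Item]) -> List[Tuple[str, Tuple[str, ...]]]:
    """Pair every content item with the containers open around it, in paint order."""
    stack: List[str] = []
    scopes: List[Tuple[str, Tuple[str, ...]]] = []
    for kind, name in items:
        if kind == "begin":
            stack.append(name)
        elif kind == "end":
            if not stack or stack[-1] != name:
                raise ValueError(f"unbalanced end marker: {name}")
            stack.pop()
        elif kind == "content":
            scopes.append((name, tuple(stack)))
        else:
            raise ValueError(f"unknown item kind: {kind}")
    if stack:
        raise ValueError(f"unclosed containers: {stack}")
    return scopes
```

For a list that draws a background, then text and a clipped image inside an opacity container, then a border, the image comes back scoped to ("opacity", "clip") while the background and border have an empty scope.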

http://dev.chromium.org/blink/slimming-paint
https://docs.google.com/document/d/1L6vb9JEPFoyt6eNjVla2AbzSUTGyQT93tQKgE3f1EMc/edit#
https://docs.google.com/presentation/d/1zpGlx75eTNILTGf3s_F6cQP03OGaN2-HACsZwEobMqY/view#slide=id.g40cfae859_045

Where Gecko should go (speculative proposal)

  • Recap: our main problems are:
    • Performance, especially for small incremental updates
    • Complexity
    • Invalidation of scrolled layers
  • Identify container layers during frame construction and store this as frame state (like old Blink)
    • Potential problem with merging display items for continuation frames of the same element
    • Not too difficult: associate the state with the first-in-flow
    • I don't think making container-layerization decisions in the compositor makes sense
    • Track layer activity status in frame tree too
      • Computing layer activity in the compositor makes more sense, but I still think it's OK not to do it there
      • will-change anti-abuse measures require some work
      • Need to eliminate BasicLayers component-alpha-avoidance flattening pass
    • With this change, painting and remaining layerization problems can be handled separately per active container layer
  • Set invalid bits on frames when repainting is needed (can be coarse)
  • At paint time, on the main thread, identify active container layers whose contents need to be updated. For each one:
    • Its child active container layers can be obtained from the frame tree, so we only care about descendant content without its own active container layer
    • For each relevant animated geometry root, split its coordinate space into tiles!
    • Create an invalid region
      • For every invalid frame, the tiles of its animated geometry root that it intersects are marked for update
      • The invalid region is the union of all invalid tiles across all animated geometry roots
    • Walk the frame tree to build a display list for the region
      • Only the container layer's frame subtree
      • Skipping frames with their own container layers or that are entirely outside the dirty region (and will stay outside regardless of any async geometry changes!)
    • Maintain invalidation state for each PaintedLayer x tile combination (PaintedTile)
    • Walk the display list to assign each display item to a PaintedLayer
      • And run DLBI at the same time
      • But limit invalidation to the tiles marked for update
    • Repaint the invalid area of each PaintedLayer
  • Pro: Preserves flexible, optimized layer assignment and precise invalidation
  • Pro: Layer assignment on the content thread --- seems simpler, more scalable
  • Pro: Can be implemented by evolving current code incrementally
  • Pro: More modular design than the current code
  • Pro: Painting overhead proportional to the number of invalid tiles
  • Con: compared to Slimming Paint, less parallelism (busier main thread)
  • How much work to offload from the main thread is a key issue. Offloading frees up the main thread for parallelism but has more overhead (both in performance and in code).
    • Currently I feel keeping the compositor thread lean is very important, and a three-thread design would be even more overhead and complexity.
    • I'm comfortable with carrying on doing DL and Layer building on the main thread.
    • Maybe offload rasterization to another thread, but that's compatible with this.
  • Needs a cool name
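
The tile-marking step of the proposal might look roughly like this. The 256-pixel tile size, the string names for animated geometry roots and both function names are assumptions for illustration; frame rects are taken to be already expressed in the coordinate space of their animated geometry root.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

TILE = 256  # hypothetical tile size in pixels

Tile = Tuple[int, int]            # (column, row) in the root's tile grid
Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def tiles_for_rect(x: int, y: int, w: int, h: int, tile: int = TILE) -> Set[Tile]:
    """Tiles that a rect intersects; an empty rect touches no tiles."""
    if w <= 0 or h <= 0:
        return set()
    tx0, ty0 = x // tile, y // tile
    tx1, ty1 = (x + w - 1) // tile, (y + h - 1) // tile
    return {(tx, ty) for tx in range(tx0, tx1 + 1) for ty in range(ty0, ty1 + 1)}


def invalid_tiles(invalid_frames: Iterable[Tuple[str, Rect]]) -> Dict[str, Set[Tile]]:
    """Map each animated geometry root to the set of its tiles marked for update."""
    marked: Dict[str, Set[Tile]] = defaultdict(set)
    for root, rect in invalid_frames:
        marked[root] |= tiles_for_rect(*rect)
    return dict(marked)
```

The invalid region is then the union of the marked tiles across all roots, and the painting cost scales with the number of marked tiles rather than with the size of the window. A 10x10 frame at (250, 250) straddles a tile corner and therefore marks four tiles, which shows how coarse the frame-level invalid bits are allowed to be before DLBI narrows things down within those tiles.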