| Line 87: | Line 87: | ||
|Optimizations discussed below for deferred GPUs; the only difference is that there is no need for front-to-back sorting, as HSR is efficiently handled by hardware. | |Optimizations discussed below for deferred GPUs; the only difference is that there is no need for front-to-back sorting, as HSR is efficiently handled by hardware. | ||
|} | |} | ||
In a '''tbd-rast''' GPU, when geometry is submitted, vertex shaders are run and the resulting triangles are clipped, but instead of proceeding further down the pipeline as they would in an immediate renderer, the triangles are only recorded in tile-specific triangle lists. The actual rasterization of the triangles in each tile is delayed until the frame needs to be resolved, hence the name: ''tile-based deferred rasterization''. Deferring rasterization until all the triangles in a given tile are known allows tbd-rast GPUs to achieve higher efficiency, if only through higher cache coherency of framebuffer accesses: in practice, the tile size is small enough that the framebuffer tile fits in cache memory, which considerably reduces framebuffer memory bandwidth. There are probably further gains, although they depend on GPU specifics. For example, deferred rendering may allow GPUs to sort primitives by texture, achieving higher texture cache coherency. | |||
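The binning step described above can be illustrated with a minimal sketch. It is not how any particular GPU implements it: the tile size, the function name <code>bin_triangles</code>, and the use of bounding boxes to decide which tiles a triangle touches are all assumptions made for illustration.

```python
import math
from collections import defaultdict

TILE = 16  # hypothetical tile size in pixels; real tile sizes vary by GPU


def bin_triangles(triangles, width, height, tile=TILE):
    """Record each screen-space triangle in the list of every tile that its
    bounding box overlaps. Nothing is rasterized at this stage."""
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    bins = defaultdict(list)
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Convert the bounding box to tile coordinates, clamped to the screen.
        # This is conservative: a tile may be listed although the triangle
        # itself does not cover any of its pixels.
        tx0 = max(math.floor(min(xs)) // tile, 0)
        tx1 = min(math.floor(max(xs)) // tile, tiles_x - 1)
        ty0 = max(math.floor(min(ys)) // tile, 0)
        ty1 = min(math.floor(max(ys)) // tile, tiles_y - 1)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[(tx, ty)].append(index)
    return bins


# Two triangles on a 64x64 framebuffer, given as lists of (x, y) vertices.
triangles = [
    [(1, 1), (10, 1), (1, 10)],
    [(10, 10), (40, 10), (10, 40)],
]
bins = bin_triangles(triangles, 64, 64)

# Deferred step: only when the frame is resolved is each tile processed,
# with its complete triangle list known up front.
for tile_xy in sorted(bins):
    print(tile_xy, bins[tile_xy])
```

Because every tile is processed with its full triangle list in hand, all framebuffer accesses for that tile can stay in a small on-chip buffer until the tile is finished.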
All of the above also applies to '''tbd-hsr''' GPUs such as PowerVR's, which are similar to '''tbd-rast''' GPUs except for an additional optimization that they perform automatically: when a '''tbd-hsr''' GPU is about to start rasterizing the triangles in a given tile, it first identifies, for each fragment, which primitives may be visible; see Section 4.4 in [http://www.imgtec.com/powervr/insider/docs/POWERVR%20Series5%20Graphics.SGX%20architecture%20guide%20for%20developers.1.0.8.External.pdf this PowerVR document]. In practice, this means that a '''tbd-hsr''' GPU is equally efficient regardless of the ordering of opaque primitives, whereas other types of GPUs perform better if opaque geometry is submitted in front-to-back order. | |||
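The effect of submission order can be shown with a toy model that counts how many fragments end up being shaded. This is only a sketch under stated assumptions: both function names are hypothetical, a smaller depth value is assumed to mean "nearer", and the model ignores everything about real hardware except the order in which visibility is decided.

```python
def shaded_with_early_z(fragments):
    """Model of a GPU with an early depth test but no HSR: a fragment is
    shaded if it passes the depth test at the moment it is submitted."""
    depth = {}
    shaded = 0
    for pixel, z in fragments:  # submission order matters here
        if z < depth.get(pixel, float("inf")):
            depth[pixel] = z
            shaded += 1
    return shaded


def shaded_with_hsr(fragments):
    """Model of per-tile hidden surface removal: visibility is resolved for
    all opaque fragments first, then only the nearest one per pixel is shaded."""
    nearest = {}
    for pixel, z in fragments:
        if z < nearest.get(pixel, float("inf")):
            nearest[pixel] = z
    return len(nearest)


# Three opaque layers covering the same 2x2 block of pixels.
pixels = [(x, y) for y in range(2) for x in range(2)]
layers = [0.2, 0.5, 0.8]  # assumed convention: smaller depth is nearer
front_to_back = [(p, z) for z in layers for p in pixels]
back_to_front = [(p, z) for z in reversed(layers) for p in pixels]

print(shaded_with_early_z(front_to_back))  # 4
print(shaded_with_early_z(back_to_front))  # 12
print(shaded_with_hsr(front_to_back))      # 4
print(shaded_with_hsr(back_to_front))      # 4
```

In this model the early-Z renderer shades three times as many fragments when the layers arrive back to front, while the HSR model shades each pixel once in either order.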
= Performance implications of "deferred" = | = Performance implications of "deferred" = | ||