|
|
| Line 94: |
Line 94: |
| = Performance implications of "deferred" = | | = Performance implications of "deferred" = |
|
| |
|
| == Draw-calls are very expensive, can only do 50 per frame to get 60 FPS == | | == Draw-calls are expensive == |
|
| |
|
| ...At least on ARM Mali, as ARM said in a session at GDC. They said the reason is that this GPU does "deferred rasterization" and somehow this makes each draw-call very expensive. Need to read ARM documentation carefully to make sense of this and understand to what extent that applies to other GPUs.
| | == Anything that can force immediate framebuffer resolving is expensive == |
|
| |
|
| === Corollary: we should batch draw-calls. That would mean grouping small textures into bigger textures (texture atlases), which could be filled in with glTexSubImage2D. The idea was suggested by ARM people, so it's not crazy. | | === Framebuffer bindings are expensive === |
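As a rough illustration of the texture-grouping idea in the corollary, here is a minimal sketch of how small textures might be assigned slots inside one bigger texture. Nothing here comes from ARM or from existing code: `Slot`, `pack_shelves` and `uv_rect` are hypothetical names, and the simple left-to-right "shelf" layout is only one assumed packing strategy. Each slot's `x`, `y`, `w`, `h` would be what gets passed as the `xoffset`, `yoffset`, `width`, `height` arguments of glTexSubImage2D when uploading that image into the atlas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """Hypothetical record: where one small texture lives inside the atlas."""
    x: int
    y: int
    w: int
    h: int


def pack_shelves(sizes, atlas_w, atlas_h):
    """Place (w, h) rectangles left to right on horizontal shelves.

    Assumed strategy for illustration only; real packers are smarter.
    Raises ValueError if a rectangle does not fit in the atlas.
    """
    slots = []
    x = y = shelf_h = 0
    for w, h in sizes:
        if w > atlas_w:
            raise ValueError("rectangle wider than atlas")
        if x + w > atlas_w:
            # Current shelf is full: start a new one below it.
            y += shelf_h
            x = 0
            shelf_h = 0
        if y + h > atlas_h:
            raise ValueError("atlas is full")
        slots.append(Slot(x, y, w, h))
        x += w
        shelf_h = max(shelf_h, h)
    return slots


def uv_rect(slot, atlas_w, atlas_h):
    """Normalized (u0, v0, u1, v1) texture coordinates for a slot."""
    return (
        slot.x / atlas_w,
        slot.y / atlas_h,
        (slot.x + slot.w) / atlas_w,
        (slot.y + slot.h) / atlas_h,
    )
```

With all images in one atlas, quads that previously needed one texture bind and one draw-call each could share a single bind and be drawn together, using the `uv_rect` coordinates to select their sub-image.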
|
| |
|
| == Multiple passes with FBOs introduce stalls == | | === glCopyTexImage2D is expensive === |
|
| |
|
| Traditional GPU wisdom says that glReadPixels is evil because it introduces stalls. Deferred GPU wisdom says that any multi-pass rendering using an FBO as an intermediate surface also introduces stalls, because it acts as a barrier that limits how much rendering can be deferred. On the other hand, MRTs (multiple render targets) are said to be deferred-friendly.
| | == Anything that can force saving/restoring framebuffer memory is expensive == |
| | |
| | === Always call glClear immediately after glBindFramebuffer === |
| | |
| | == Overdraw is still expensive on tbd-rast GPUs == |
|
| |
|
| = How bad are we currently? = | | = How bad are we currently? = |