Platform/GFX/DeferredGPUs: Difference between revisions

Line 94: Line 94:
= Performance implications of "deferred" =
= Performance implications of "deferred" =


== Draw-calls are very expensive, can only do 50 per frame to get 60 FPS ==
== Draw-calls are expensive ==


...at least on ARM Mali, according to ARM in a session at GDC. Their stated reason is that this GPU does "deferred rasterization", which somehow makes each draw-call very expensive. We need to read the ARM documentation carefully to make sense of this and to understand to what extent it applies to other GPUs.
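Taking the figure of 50 draw-calls per frame at 60 FPS at face value, the implied cost per draw-call can be worked out directly. The sketch below is only that arithmetic; the function name is hypothetical, and it assumes (pessimistically) that draw-calls consume the entire frame time.

```c
/* Hypothetical helper, for illustration only.
 * Returns the time available per draw-call, in milliseconds, if
 * `calls_per_frame` draw-calls must fit in one frame at `fps`.
 * Assumption: draw-calls are the only cost in the frame. */
static double draw_call_budget_ms(double fps, int calls_per_frame)
{
    double frame_ms = 1000.0 / fps;      /* about 16.67 ms at 60 FPS */
    return frame_ms / calls_per_frame;   /* about 0.33 ms for 50 calls */
}
```

Under that assumption, `draw_call_budget_ms(60.0, 50)` gives roughly a third of a millisecond per draw-call, which is a useful number to compare against measurements on real hardware.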
== Anything that can force immediate framebuffer resolving is expensive ==


=== Corollary: we should batch draw-calls ===

That would mean grouping textures into bigger textures, which would be done with glTexSubImage2D. The idea was suggested by ARM people, so it's not crazy.
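The bookkeeping behind grouping textures into a bigger texture could look roughly like the following. This is a minimal sketch, not existing code: the type and function names are hypothetical, and it assumes the simplest possible layout, a grid of equal-sized cells. Only the placement and texture-coordinate math is shown, so that it runs without a GL context.

```c
/* All names here are hypothetical, for illustration only. */
typedef struct { int x, y, w, h; } AtlasRect;     /* texel rectangle in the big texture */
typedef struct { float u0, v0, u1, v1; } UVRect;  /* normalized texture coordinates */

/* Place tile number `index` in a big texture laid out as a grid of
 * equal-sized cells, filled row by row.
 * Returns 0 on success, -1 if the tile does not fit. */
static int atlas_place(int atlas_w, int atlas_h, int cell_w, int cell_h,
                       int index, AtlasRect *out)
{
    if (cell_w <= 0 || cell_h <= 0 || index < 0)
        return -1;
    int cols = atlas_w / cell_w;
    int rows = atlas_h / cell_h;
    if (cols <= 0 || rows <= 0 || index >= cols * rows)
        return -1;
    out->x = (index % cols) * cell_w;
    out->y = (index / cols) * cell_h;
    out->w = cell_w;
    out->h = cell_h;
    return 0;
}

/* Convert a texel rectangle into the texture coordinates that a quad
 * sampling this tile would use instead of the usual 0..1 range. */
static UVRect atlas_uv(int atlas_w, int atlas_h, const AtlasRect *r)
{
    UVRect uv;
    uv.u0 = (float)r->x / (float)atlas_w;
    uv.v0 = (float)r->y / (float)atlas_h;
    uv.u1 = (float)(r->x + r->w) / (float)atlas_w;
    uv.v1 = (float)(r->y + r->h) / (float)atlas_h;
    return uv;
}

/* With a GL context, the tile's pixels would then be uploaded into the
 * rectangle with something like:
 *   glTexSubImage2D(GL_TEXTURE_2D, 0, r.x, r.y, r.w, r.h,
 *                   GL_RGBA, GL_UNSIGNED_BYTE, pixels);
 * (format and type are assumptions). */
```

Quads whose tiles live in the same big texture can then share one texture binding and be submitted in a single draw-call, each using its own `UVRect`. A real implementation would also need to handle tiles of different sizes and filtering bleed between neighbouring tiles.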
=== Framebuffer bindings are expensive ===


== Multiple passes with FBOs introduce stalls ==
=== glCopyTexImage2D is expensive ===


Traditional GPU wisdom says that glReadPixels is evil because it introduces stalls. Deferred GPU wisdom says that any multi-pass rendering using an FBO as an intermediate surface also introduces stalls, because it puts a barrier on how much rendering can be deferred. On the other hand, MRTs (multiple render targets) are said to be deferred-friendly.
== Anything that can force saving/restoring framebuffer memory is expensive ==
 
=== Always call glClear immediately after glBindFramebuffer ===
 
== Overdraw is still expensive on tbd-rast GPUs ==


= How bad are we currently? =
= How bad are we currently? =