Platform/GFX/DeferredGPUs: Difference between revisions

599 bytes removed, 9 April 2013

@@ Line 94: / Line 94: @@
 = Performance implications of "deferred" =
-== Draw-calls are very expensive, can only do 50 per frame to get 60 FPS ==
+== Draw-calls are expensive ==
-...At least on ARM Mali, as ARM said in a session at GDC. They said the reason is that this GPU does "deferred rasterization" and somehow this makes each draw-call very expensive. Need to read ARM documentation carefully to make sense of this and understand to what extent that applies to other GPUs.
+== Anything that can force immediate framebuffer resolving is expensive ==
-=== Corollary: we should batch draw-calls. That would mean that we group textures into bigger textures. That would be done by glTexSubImage2D. The idea was suggested by ARM people, so it's not crazy.
+=== Framebuffer bindings are expensive ===
-== Multiple passes with FBOs introduce stalls ==
+=== glCopyTexImage2D is expensive ===
-Traditional GPU wisdom says that glReadPixels is evil because it introduces stalls. Deferred GPU wisdom says that any multi-pass rendering using an FBO as an intermediate surface also introduces stalls, because it places a barrier on how much rendering can be deferred. On the other hand, MRTs (multiple render targets) are said to be deferred-friendly.
+== Anything that can force saving/restoring framebuffer memory is expensive ==
+=== Always call glClear immediately after glBindFramebuffer ===
+== Overdraw is still expensive on tbd-rast GPUs ==
 = How bad are we currently? =