Even though graphics are ultimately displayed as a two-dimensional array of pixels, the common approach is to describe them three-dimensionally with polygons. Most graphics processors start rendering the stream of polygon data sent by the T&L unit as soon as it arrives, despite that data being unordered. Because of the nature of 3D, polygons can sit behind and be obscured by other polygons, and rendering them straight from an unordered stream produces pixels that are drawn over others which had already been drawn, wasting the work spent producing them.
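As a rough illustration of that waste, here is a minimal, hypothetical sketch (not PowerVR code) that simulates an immediate-mode style renderer on a tiny framebuffer, using axis-aligned rectangles in place of real polygons, and counts how much shading work ends up overwritten by closer surfaces arriving later in the stream:

```python
# Hypothetical illustration: count overdraw when unordered "polygons" are
# shaded immediately in arrival order, as in an immediate-mode renderer.
import random

WIDTH, HEIGHT = 16, 16

def random_rect():
    # Axis-aligned rectangle with a random depth, standing in for a polygon.
    x0, y0 = random.randrange(WIDTH), random.randrange(HEIGHT)
    x1, y1 = random.randrange(x0, WIDTH), random.randrange(y0, HEIGHT)
    return (x0, y0, x1, y1, random.random())  # last field = depth

depth_buffer = [[float("inf")] * WIDTH for _ in range(HEIGHT)]
shaded = 0      # pixels we spent work shading
wasted = 0      # shaded pixels later replaced by a closer surface

for x0, y0, x1, y1, z in (random_rect() for _ in range(50)):
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            if z < depth_buffer[y][x]:              # closer than what is there
                if depth_buffer[y][x] != float("inf"):
                    wasted += 1                     # earlier shading thrown away
                depth_buffer[y][x] = z
                shaded += 1                         # immediate mode: shade right away

print(f"shaded {shaded} pixels, {wasted} wasted ({wasted / shaded:.0%} overdraw)")
```

Every "wasted" pixel represents shading and texturing work, and the memory traffic behind it, that never contributes to the final image.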
To avoid this unnecessary work, an approach fundamentally suited to the problem must be used. The frontmost, visible pixels have to be identified before any drawing takes place, but this is an involved procedure; it wouldn't be completed in time unless it could be done very quickly.
Perhaps surprisingly, the factor that most limits performance in computing is not calculation speed but how slowly data can be moved around. There are hard limits on data transfer rates, so the key to good performance is minimizing the need to access information that resides outside the chip.
Therefore, to identify the visible pixels quickly enough, the procedure needs to be executed within the processor's core. However, the amount of memory that can fit inside a core is not nearly large enough to hold all of the graphics data.
The job has to be handled in separate pieces, so the target space, the full area of the screen, must be split up into tiles small enough to process on-chip.
Determining which tile (or tiles) each piece of polygon data belongs to is not immediately possible, because the incoming information is unordered. So the stream of graphics data first needs to be fully gathered and sorted into lists that correspond to the appropriate tiles of screen area.
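A minimal sketch of this binning step is given below. The names, tile size, and screen size are illustrative assumptions, and a simple bounding-box test stands in for an exact triangle-tile coverage test:

```python
# Hypothetical sketch of tile binning: sort an unordered triangle stream
# into per-tile lists using each triangle's screen-space bounding box.
TILE = 32                      # tile size in pixels (illustrative value)
WIDTH, HEIGHT = 640, 480       # screen size (illustrative value)
TILES_X = (WIDTH + TILE - 1) // TILE
TILES_Y = (HEIGHT + TILE - 1) // TILE

# One (initially empty) list per screen tile.
tile_lists = [[[] for _ in range(TILES_X)] for _ in range(TILES_Y)]

def bin_triangle(tri_id, verts):
    """verts: three (x, y) screen-space vertex positions."""
    xs = [x for x, _ in verts]
    ys = [y for _, y in verts]
    # Clamp the bounding box to the screen, then append the triangle to
    # the list of every tile its box overlaps.
    tx0 = max(0, int(min(xs)) // TILE)
    tx1 = min(TILES_X - 1, int(max(xs)) // TILE)
    ty0 = max(0, int(min(ys)) // TILE)
    ty1 = min(TILES_Y - 1, int(max(ys)) // TILE)
    for ty in range(ty0, ty1 + 1):
        for tx in range(tx0, tx1 + 1):
            tile_lists[ty][tx].append(tri_id)

# A triangle straddling the corner of four tiles lands in all four lists.
bin_triangle(0, [(30.0, 30.0), (40.0, 28.0), (34.0, 40.0)])
print(sum(len(l) for row in tile_lists for l in row))   # -> 4
```

Once every polygon in the scene has been binned this way, each tile's list is a complete, self-contained description of everything that can affect that patch of screen.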
After the scene has been compiled and split into tile lists small enough to fit within the graphics core, processing can be fast enough to determine, for each pixel, only the surface that is actually visible. The image can be rendered from there, and only the final pixels ever need to be drawn.
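Continuing the sketch with hypothetical names, per-tile processing of this kind resolves visibility for the whole tile first, entirely in on-chip memory, and only then spends any shading work, once per pixel:

```python
# Hypothetical sketch of per-tile deferred shading: find the closest surface
# at every pixel of a tile, then shade each pixel exactly once.
def render_tile(tile_polys, tile_w, tile_h, depth_at, shade):
    """tile_polys: the polygons binned to this tile.
    depth_at(poly, x, y) -> depth, or None if the polygon misses the pixel.
    shade(poly, x, y)    -> final color for the pixel."""
    framebuffer = [[None] * tile_w for _ in range(tile_h)]
    for y in range(tile_h):
        for x in range(tile_w):
            # Pass 1: visibility only, using on-chip tile memory.
            nearest, nearest_z = None, float("inf")
            for poly in tile_polys:
                z = depth_at(poly, x, y)
                if z is not None and z < nearest_z:
                    nearest, nearest_z = poly, z
            # Pass 2: shade only the single visible surface, so no texturing
            # or blending work is ever spent on an obscured pixel.
            if nearest is not None:
                framebuffer[y][x] = shade(nearest, x, y)
    return framebuffer
```

Only the finished pixels of each tile then need to be written out to external memory, which is where the savings described next come from.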
By rendering the final image mostly from within the graphics core, the severest limitation on computing performance, dependence on external memory, is addressed. External data traffic is minimized, which allows the system to use less expensive memory types, making it more cost effective, and also draws less battery or socket power. Rendering takes place at the core's high internal precision regardless of the external framebuffer format, which raises image quality for tasks like color blending and adds flexibility for object depth sorting. Extra samples of the image can be taken for anti-aliasing without requiring any more framebuffer memory. And overall operation becomes more efficient, since the data being processed keeps a high degree of locality.
There is more overhead in chip logic to implement this tile-based deferred rendering process, but the fillrate saved by never having to overdraw obscured pixels means fewer pixel pipelines and texture mapping units are needed to achieve the same effective performance, thereby counteracting the overhead in chip size.
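A rough worked example of that trade-off, with purely illustrative figures rather than PowerVR's own numbers:

```python
# Hypothetical figures only: raw fillrate needed for the same visible-pixel
# throughput with and without overdraw of obscured pixels.
overdraw = 2.5            # assumed average depth complexity of a scene
visible_mpix_per_s = 100  # assumed visible pixels to deliver, in millions/s

immediate_mode_fill = visible_mpix_per_s * overdraw   # hidden pixels drawn too
deferred_fill = visible_mpix_per_s                    # only visible pixels drawn

print(f"immediate-mode renderer: {immediate_mode_fill:.0f} Mpix/s raw fillrate")
print(f"tile-based deferred renderer: {deferred_fill:.0f} Mpix/s for the same image")
```

Under those assumed figures, the deferred design needs well under half the raw pixel throughput, which is budget that can be spent on fewer, smaller pipelines instead.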
PowerVR is a designer of graphics architectures of this type. Its mobile graphics accelerator, MBX, was released in 2003 and is the solution chosen by almost every major chip maker in the world; it has been driving the visuals of several products over the past year and will be behind many future cell phones, PDAs, in-car GPS/infotainment systems, and other embedded and mobile devices.