Specifications
On the chip, the shader units are organized in three SIMD groups of 16 processors each, for a total of 48 processors. Each processor consists of a 5-wide vector unit (5 FP32 ALUs in total) that can serially execute up to two instructions per cycle (a multiply and an addition), so each of the 48 processors can perform 10 floating-point operations per cycle. All processors in a SIMD group execute the same instruction, so up to three instruction threads can be in execution simultaneously.
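The arithmetic in the paragraph above can be checked with a short sketch. Python is used purely for illustration; the constant names are hypothetical, and the numbers are the ones quoted in the text.

```python
# Figures taken from the paragraph above; names are illustrative only.
SIMD_GROUPS = 3
PROCESSORS_PER_GROUP = 16
ALUS_PER_PROCESSOR = 5       # 5-wide vector unit
OPS_PER_ALU_PER_CYCLE = 2    # one multiply and one addition

# Total shader processors on the chip.
processors = SIMD_GROUPS * PROCESSORS_PER_GROUP

# Floating-point operations one processor can issue per cycle.
ops_per_processor = ALUS_PER_PROCESSOR * OPS_PER_ALU_PER_CYCLE

# Floating-point operations the whole shader array can issue per cycle.
ops_per_cycle = processors * ops_per_processor

# Each SIMD group runs one instruction stream at a time.
concurrent_threads = SIMD_GROUPS

print(processors, ops_per_processor, ops_per_cycle, concurrent_threads)
```

This gives 48 processors, 10 operations per processor per cycle, and 480 operations per cycle across the chip.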
500 MHz, 10 MiB daughter embedded DRAM (eDRAM) framebuffer (256 GB/s) on a 90 nm process[citation needed].
The NEC-designed eDRAM die includes additional logic (192 parallel pixel processors) for color, alpha compositing, Z/stencil buffering, and anti-aliasing, called "Intelligent Memory", which gives developers 4-sample anti-aliasing at very little performance cost.
105 million transistors [2]
8 render output units (ROPs)
Maximum pixel fillrate: 16 gigasamples per second using 4X multisample anti-aliasing (MSAA), or 32 gigasamples per second using Z-only operation; 4 gigapixels per second without MSAA (8 ROPs × 500 MHz)
Maximum Z sample rate: 8 gigasamples per second (2 Z samples × 8 ROPs × 500 MHz), 32 gigasamples per second using 4X anti-aliasing (2 Z samples × 8 ROPs × 4X AA × 500 MHz)[1]
Maximum anti-aliasing sample rate: 16 gigasamples per second (4 AA samples × 8 ROPs × 500 MHz)[1]
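The ROP-derived rates listed above all follow the same pattern: a per-ROP, per-clock throughput multiplied by 8 ROPs and the 500 MHz clock. The following is a minimal sketch of that arithmetic; the function and variable names are hypothetical, and only the factors quoted in the list are used.

```python
CLOCK_HZ = 500_000_000  # 500 MHz, as quoted in the list
ROPS = 8                # render output units


def rop_rate(per_rop_per_clock: int) -> int:
    """Peak rate per second for a given per-ROP, per-clock throughput."""
    return per_rop_per_clock * ROPS * CLOCK_HZ


pixel_fill = rop_rate(1)        # 1 pixel per ROP per clock, no MSAA
aa_samples = rop_rate(4)        # 4 AA samples per ROP per clock
z_samples = rop_rate(2)         # 2 Z samples per ROP per clock
z_samples_4x = rop_rate(2 * 4)  # 2 Z samples × 4X AA per ROP per clock

for name, value in [
    ("pixel fill", pixel_fill),
    ("AA samples", aa_samples),
    ("Z samples", z_samples),
    ("Z samples, 4X AA", z_samples_4x),
]:
    print(f"{name}: {value // 10**9} billion per second")
```

The results (4, 16, 8, and 32 billion per second) match the quoted figures.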
500 MHz parent GPU on a 90 nm, 65 nm, or 45 nm TSMC process, with a total of 232 million transistors
48 floating-point vector processors for shader execution, divided into three dynamically scheduled SIMD groups of 16 processors each.[3]
Unified shading architecture (each pipeline is capable of running either pixel or vertex shaders)
10 FP ops per vector processor per cycle (5 fused multiply-add)
Maximum vertex count: 6 billion vertices per second ((48 shader vector processors × 2 ops per cycle × 500 MHz) / 8 vector ops per vertex) for simple transformed and lit polygons
Maximum polygon count: 500 million triangles per second[3]
Maximum shader operations: 96 billion shader operations per second (3 shader pipelines × 16 processors × 4 ALUs × 500 MHz)
240 GFLOPS (48 processors × 10 FP ops per cycle × 500 MHz)
MEMEXPORT shader function
16 texture filtering units (TF) and 16 texture addressing units (TA)
16 filtered samples per clock
Maximum texel fillrate: 8 gigatexels per second (16 textures × 500 MHz)
16 unfiltered texture samples per clock
Maximum dot product operations: 24 billion per second
Support for a superset of the DirectX 9.0c API (DirectX Xbox 360) and Shader Model 3.0+
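The shader and texture throughput figures in the list above can be reproduced from the quoted factors. This is a sketch only, with hypothetical variable names. The list gives no derivation for the dot-product figure; the sketch assumes one dot product per processor per cycle, which happens to match the quoted 24 billion.

```python
CLOCK_HZ = 500_000_000  # 500 MHz parent GPU clock
PROCESSORS = 48         # 3 SIMD groups × 16 processors

# Peak floating-point rate: 10 FP ops per processor per cycle.
flops = PROCESSORS * 10 * CLOCK_HZ

# Vertex rate: 2 ops per processor per cycle, 8 vector ops per vertex.
vertices_per_s = PROCESSORS * 2 * CLOCK_HZ // 8

# Shader operations, using the factors quoted in the list.
shader_ops_per_s = 3 * 16 * 4 * CLOCK_HZ

# Texel fillrate: 16 filtered samples per clock.
texels_per_s = 16 * CLOCK_HZ

# ASSUMPTION (not stated in the list): one dot product per processor per cycle.
dot_products_per_s = PROCESSORS * 1 * CLOCK_HZ

print(f"{flops // 10**9} GFLOPS")
print(f"{vertices_per_s // 10**9} billion vertices per second")
print(f"{shader_ops_per_s // 10**9} billion shader operations per second")
print(f"{texels_per_s // 10**9} gigatexels per second")
print(f"{dot_products_per_s // 10**9} billion dot products per second")
```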
Cooling: Both the GPU and the CPU of the console have heatsinks. The CPU's heatsink uses heat pipes to conduct heat from the CPU to the fins of the heatsink. The heatsinks are actively cooled by a pair of 60 mm exhaust fans. The newer XCGPU chipset redesign, featured in the Xbox 360 S, integrates the CPU (Xenon) and GPU (Xenos) into one chip, which is actively cooled by a single heatsink rather than two.