Here's a write-up I did a while ago that I'll just paste here:
My background is in VLSI design, not graphics programming, but here goes my analysis of the chip- and system-level design.
1) There are low- to mid-range video cards that come in both a DDR3 and a GDDR5 version, which provides a direct comparison point. At the very low end, I found benchmarks showing that the GDDR5 version is only 10-20% faster. In mid-to-high-range cards, however, the GDDR5 version can be almost 50% faster.
http://ht4u.net/reviews/2012/msi_ra...39.php&usg=ALkJrhi1G4TxkhzXnvN1ZfRJ3KdXukbpQQ
This makes sense: in low-end cards, the GPU does not have enough processing power to be significantly bottlenecked by memory bandwidth, but in faster cards it definitely can be.
So bandwidth is critical for GPU tasks. That's why high-end video cards use 384-bit wide interfaces to memory while CPU memory interfaces are only 64 bits wide (per channel)! It certainly is not cheap to dedicate that many IO pins to memory, so they do it for a good reason.
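To put some rough numbers on that, here's a quick back-of-envelope calculation in Python. The per-pin data rates I plugged in (1.6 Gbps for DDR3-1600, 5-6 Gbps for GDDR5) are representative figures for these memory types, not measurements from any particular card:

```python
# Peak theoretical bandwidth = (bus width in bits / 8) * per-pin data rate.
# The data rates below are representative figures picked for illustration.

def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps_per_pin: float) -> float:
    """Peak memory bandwidth in GB/s."""
    return bus_width_bits / 8 * data_rate_gbps_per_pin

configs = {
    "DDR3-1600, 128-bit bus (low-end card)":        (128, 1.6),
    "GDDR5 @ 5.0 Gbps, 128-bit bus (same card)":    (128, 5.0),
    "GDDR5 @ 6.0 Gbps, 384-bit bus (high-end)":     (384, 6.0),
    "DDR3-1600, dual-channel 64-bit (desktop CPU)": (2 * 64, 1.6),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gb_s(width, rate):.1f} GB/s")
```

Same bus width, roughly 3x the bandwidth going from DDR3 to GDDR5, and a wide high-end GPU bus sits an order of magnitude above a dual-channel CPU setup. That's the gap those extra IO pins are buying.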
Memory bandwidth and latency are not too critical for most CPU tasks, though. For the PC builders out there: you can find benchmarks comparing different memory timings and speeds, and in most cases you'd be better off buying a faster video card instead of spending the money on better RAM.
2) GDDR5 having much higher latency than DDR3 is a myth that keeps getting perpetuated with no source to back it up. Go look up datasheets of the actual chips and you'll see that the absolute latency has stayed at roughly 10 ns since DDR1. Because data rates keep increasing, the latency measured in clock cycles has gone up, but the absolute latency has barely moved. Anyone who wants to argue with me should dig through datasheets to back their claims up.
From Wikipedia: DDR3 PC3-12800 runs its IO clock at 800 MHz with a typical CAS latency of 8 cycles, so the absolute latency is 8 / 800 MHz = 10 ns. DDR2 PC2-6400 runs its IO clock at 400 MHz with a CAS latency of 4 cycles, which is also 4 / 400 MHz = 10 ns.
Here's a typical GDDR5 chip datasheet:
http://www.hynix.com/datasheet/pdf/graphics/H5GQ1H24AFR(Rev1.0).pdf
The table on page 43 shows CAS latency vs. frequency. GDDR5 data rates are 4x the memory clock, so at a typical 5.0 Gbps output data rate the memory clock runs at 1.25 GHz (page 6), and at that speed the chip supports a CAS latency of 15 cycles. That works out to 15 / (1.25 GHz) = 12 ns.
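As a sanity check on the arithmetic, here's the same cycles-to-nanoseconds conversion for all three parts in a few lines of Python, using the CL/clock pairs quoted above:

```python
# Absolute latency = CAS latency in clock cycles / clock frequency.
# DDR2/DDR3 count CAS cycles against the IO clock; GDDR5 counts them against
# the memory (CK) clock, which is one quarter of the per-pin data rate.

parts = {
    "DDR2 PC2-6400  (CL 4  @ 400 MHz)":  (4, 400e6),
    "DDR3 PC3-12800 (CL 8  @ 800 MHz)":  (8, 800e6),
    "GDDR5 5.0 Gbps (CL 15 @ 1.25 GHz)": (15, 1.25e9),
}

for name, (cl_cycles, clock_hz) in parts.items():
    print(f"{name}: {cl_cycles / clock_hz * 1e9:.1f} ns")  # 10.0, 10.0, 12.0
```

So GDDR5 comes out a couple of nanoseconds worse at the CAS level, not the several-times-worse latency the myth would have you believe.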
3) The Xbox One has an additional on-chip SRAM cache to improve its bandwidth. However, Microsoft needed to dedicate extra silicon area and power budget to the cache and the cache controller. That is definitely a big cost-adder in terms of yield and power budget, though probably not as big as using GDDR5 chips. Chips these days are limited only by the amount of power they can dissipate, and everything is a trade-off: if the designer adds complexity in one area, it has to come out of another. So Microsoft spent some of its power budget on implementing a cache, while Sony could spend it on more GPU cores instead. And it shows.
Who knows how well the Xbox One's cache system will work at closing the gap with the PS4's bandwidth advantage, but it is certainly not going to be _faster_ or simpler. When you're streaming in the huge textures needed for next-gen 720p-to-1080p graphics, a 32MB cache is not big enough to provide that extra bandwidth all the time.
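To make the 32MB point concrete, here's a rough sizing sketch. The render-target layout below is my own assumption of a plausible deferred-rendering setup, purely for illustration, not anything confirmed about actual Xbox One titles:

```python
# Rough sizing of a 1080p frame's render targets against a 32 MiB on-chip buffer.
# The target formats and counts are assumptions (a typical deferred-rendering
# G-buffer), purely for illustration.

WIDTH, HEIGHT = 1920, 1080
ESRAM_MIB = 32

render_targets = [
    ("albedo (RGBA8)",                   4),  # bytes per pixel
    ("normals (RGBA16F)",                8),
    ("depth/stencil (D24S8)",            4),
    ("HDR light accumulation (RGBA16F)", 8),
]

total_mib = sum(bpp * WIDTH * HEIGHT for _, bpp in render_targets) / 2**20
print(f"Render targets alone: {total_mib:.1f} MiB against a {ESRAM_MIB} MiB budget")
# -> about 47 MiB, over budget before any texture data is even considered.
```

Even before a single texture is fetched, a full set of 1080p render targets can overflow 32MB, so developers will have to juggle what lives in the fast on-chip SRAM and what spills out to the slower DDR3.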
Also, since the PS4 has more GPU power, it will definitely need all the bandwidth it can get.