
A comment on XDR and Direct RAMBUS from a locked thread

Panajev2001a

GAF's Pleasant Genius
This guy says very little of actual substance in a great many words, and he gets it wrong quite a bit of the time.

You can summarize his post with: theoretical numbers are not matched by sustained performance in basically any architecture.

However the memory has a very high latency, I believe it was around 60ns though I could be wrong, more than that the architecture also incurs an additional latency on accessing

With only two devices ( each channel has a single DRAM chip ) and an embedded memory controller on the EE, the latency is quite a bit lower than what you had with the fucked-up RDRAM memory controller on your PC.

Alpha went with RDRAM for EV7, and the Alpha guys knew what they were doing.

Bus Width: 32bit
Clock Rate: 1600mhz
Phases: 4
Channels: 2
Size: 128MByte

Formula: 32*1600mhz*4*2/8=51.2GB/sec

The clock-rate is 400 MHz ( base clock ) and is multiplied on the DRAM chip and the memory controller ASIC by a PLL to 1600 MHz: from there we sample data on both edges of the clock ( falling and rising ).

The clock outside of the DRAM chips and the memory controller does not rise to such speed.

By his spiel, the number of phases per clock of XDR would still be 2.
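To spell out the clock chain, here is a small Python sketch using only the figures above. The constant names are mine, and the x4 PLL multiplier is simply what takes 400 MHz to 1600 MHz.

```python
# Figures taken from the post; the names are illustrative only.
BASE_CLOCK_MHZ = 400   # external (base) clock
PLL_MULTIPLIER = 4     # 400 MHz -> 1600 MHz inside the DRAM chip and the controller
EDGES_PER_CYCLE = 2    # data is sampled on the rising and the falling edge

internal_clock_mhz = BASE_CLOCK_MHZ * PLL_MULTIPLIER

# Transfers per second on each data pin, in millions (MT/s).
transfers_per_pin_mts = internal_clock_mhz * EDGES_PER_CYCLE

# Bits moved per pin for every cycle of the external 400 MHz clock.
bits_per_external_clock = transfers_per_pin_mts // BASE_CLOCK_MHZ

print(internal_clock_mhz, transfers_per_pin_mts, bits_per_external_clock)
```

Counted against the 1600 MHz internal clock you get 2 bits per cycle; counted against the 400 MHz clock that actually runs outside the chips you get 8.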


32bit wide bus that is clocked very high at 1.6Ghz (1600mhz) and can actually transfer 2 bits of information on both edges of the clock cycle (this is basically Quad Data Rate RAM)

The bus is not clocked at 1600 MHz, but at 400 MHz.

Also, no... this is Octal Data Rate not QDR.

The physical external clock for XDR at a 3.2 GHz data signalling rate is 400 MHz, and it is against that clock that we take the ODR definition.

( 400 MHz * 8 bits per clock * 64 bits data bus ) / 8 bits per byte = 25.6 GB/s
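The same arithmetic as a minimal Python sketch, in case someone wants to plug in other numbers. The function name and parameter names are made up for illustration, and GB here means 10^9 bytes.

```python
def peak_bandwidth_gb_s(external_clock_mhz, bits_per_clock_per_pin, bus_width_bits):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bits_per_second = external_clock_mhz * 1e6 * bits_per_clock_per_pin * bus_width_bits
    return bits_per_second / 8 / 1e9  # 8 bits per byte, then bytes -> GB

# XDR as described in the post: 400 MHz external clock, octal data rate, 64-bit data bus.
xdr_gb_s = peak_bandwidth_gb_s(400, 8, 64)
print(f"{xdr_gb_s:.1f} GB/s")
```

This is a peak figure only; it says nothing about sustained performance or latency.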


, originally they was quoting a 25.6GB/sec memory bandwidth but they decided they can seperate this into 2 channels much like RDRAM and thus this is where you get the 51.2GB/sec memory bandwidth estimate. However again the memory has a very high latency of around 60ns (this was stated in that document) and because of the very same reasons why the PS2 memory had latency issues that will reoccur in the PS3, and this is also why the majority of the benefits of the extra bandwidth will disappear.

The original 25.6 GB/s was already achieved with two channels, as the original Rambus documentation explained: each "channel" had two chips and each chip had a 16-bit interface ( 2 bytes ), for a total of 32 bits per channel, and two channels make for a 64-bit memory controller.

RAMBUS always made the example that with 3.2 GHz of data signalling rate you needed a 64 bits memory controller.

Do you want more ? Increase the data signalling rate to 6.4 GHz or increase the data bus in 1 byte/8 bits increments... go for a 128 bits/256 data pins configuration.
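A rough Python sketch of those scaling options. The helper name `bandwidth_gb_s` is hypothetical, and the configurations beyond 3.2 GHz x 64 bits are just the "what if" cases mentioned above, not announced products.

```python
def bandwidth_gb_s(data_rate_gbit_per_pin, bus_width_bits):
    """Peak bandwidth in GB/s from the per-pin data rate and the data bus width."""
    return data_rate_gbit_per_pin * bus_width_bits / 8

# Bus width of the original example: 2 channels x 2 chips per channel x 16 bits per chip.
base_width_bits = 2 * 2 * 16

# (per-pin data rate in Gbit/s, data bus width in bits)
configs = [(3.2, base_width_bits), (6.4, base_width_bits), (3.2, 128), (6.4, 128)]
results = {cfg: bandwidth_gb_s(*cfg) for cfg in configs}

for (rate, width), bw in results.items():
    print(f"{rate} Gbit/s per pin x {width} bits -> {bw:.1f} GB/s")
```

Doubling either the signalling rate or the bus width doubles the peak number, which is why 51.2 GB/s does not require any change in the definition of a "channel".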

If he says that XDR and Direct RDRAM have exactly the same latency issue, then he has not done his homework: having separate address and data busses in XDR, instead of multiplexing addresses and data ( and control ) on the same physical bus as Direct RDRAM does, will change the data latency.

Also, with the memory controller integrated once again in the CPU the latency will be kept under control ( see Opteron/Athlon 64's memory latency vs Athlon XP's memory latency ).

It is not Sony's fault that the earlier Intel chipsets for Direct RDRAM were a bit rushed and that the Pentium 4 is such a latency-troubled architecture ( thanks to the focus on pure clock-rate, a speed-racer design as they call it ).

A problem with RAMBUS in the PC space was also that people wanted quite a few devices per channel: Direct RDRAM has far fewer problems with one device per channel, where its serial nature does not bite you in the ass.

Look at a PC with Direct RDRAM: say 512 MB of RAM using 512 Mbits modules.

You have 4 modules per channel in a dual-channel setup, or 8 on the one channel if you are running a single-channel solution.
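The count works out as in this quick Python sketch. The function name is made up, and it assumes the capacity is split evenly across channels.

```python
def parts_per_channel(total_mbytes, part_mbits, channels):
    """How many memory parts hang off each channel, assuming an even split."""
    part_mbytes = part_mbits // 8               # 512 Mbit -> 64 MB per part
    total_parts = total_mbytes // part_mbytes   # 512 MB / 64 MB -> 8 parts
    return total_parts // channels

dual_channel = parts_per_channel(512, 512, 2)
single_channel = parts_per_channel(512, 512, 1)
print(dual_channel, single_channel)
```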

XDR uses point to point data links with the memory devices attached and address data has its own bus so that you do not have to inter-leave/multiplex address and data bus on the same lines.

Direct RDRAM uses a serialized bus structure which means that each channel has the same data bus for all the attached memory devices:

Mem controller <---------> dev 1 <--------> dev 2 <-----------> dev 3 <----------->dev 4

You want data from the 4th device ? You have to send the address through the same path that you use to send addresses and receive data to and from device 1: this means that while you physically send an address to device 4 you cannot receive data from device 1 ( you can of course packetize stuff intelligently in the memory controller to transfer addresses destined to multiple devices one after the other in a small burst, but still you would want to have separate busses ).

Comparison:

rambus1-fig1.gif


Add to that the fact that we have to send addresses over the data bus as well and you can get an idea of why the latency is higher than on XDR.


See on Direct RDRAM:

rambus1-fig2.gif


http://realworldtech.com/page.cfm?ArticleID=RWT110799000000&p=2

Only one device can send data at any time on the shared bus.

While on XDR:

1_multipart_xF8FF_5_XDR_sys_top_lg.jpg


You have your own bus for Address and Control and a separate bus for data: each memory device/chip has its own data bus so you could technically receive/send data to each XDR memory chip at basically the same time.

On PS2 though you have this situation:

dev 1 <---------> Memory Controller <--------> dev 2

This structure minimizes the problems due to the nature of Direct RDRAM.

Sure, data, address and control commands are still on the same bus, but each memory chip is independent and can work with the memory controller separately.
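To put rough numbers on the three layouts discussed above, here is a toy Python model. Everything in it is an assumption made for illustration: the function names, the "bus slot" unit, and the 1 address slot / 4 data slots per request. It ignores DRAM access time and real protocol details, and it models the shared bus the way this post describes it, so take it as a sketch of the argument and not as a simulation of the real parts.

```python
def shared_bus_slots(devices, addr_slots, data_slots):
    """One bus carries addresses and data for every device: all traffic is serialized."""
    return devices * (addr_slots + data_slots)

def split_bus_slots(devices, addr_slots, data_slots):
    """Shared address/control bus plus a private data link per device.

    Addresses go out one after another; each device starts its data transfer
    as soon as its own address has been sent, overlapping with the others.
    """
    finish_times = [(i + 1) * addr_slots + data_slots for i in range(devices)]
    return max(finish_times)

def parallel_channels_slots(total_devices, channels, addr_slots, data_slots):
    """Several independent shared-bus channels running side by side."""
    per_channel = -(-total_devices // channels)  # ceiling division
    return shared_bus_slots(per_channel, addr_slots, data_slots)

ADDR_SLOTS, DATA_SLOTS = 1, 4  # hypothetical costs, not measured values

pc_style = shared_bus_slots(4, ADDR_SLOTS, DATA_SLOTS)             # 4 devices, one shared bus
xdr_style = split_bus_slots(4, ADDR_SLOTS, DATA_SLOTS)             # 4 devices, own data links
ps2_style = parallel_channels_slots(2, 2, ADDR_SLOTS, DATA_SLOTS)  # 1 device on each of 2 channels

print(pc_style, xdr_style, ps2_style)
```

With these made-up costs, one read from each device takes 20 slots on the shared bus, 8 with separate data links, and 5 in the two-channel, one-device-per-channel layout.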
 