AMD RDNA5 rumors point to AT0 flagship GPU with 512-bit memory bus, 96 Compute Units

Wolzard

Member
GPU     | Shader Arrays (SA) | Shader Engines (SE) | Compute Units (CU) | Memory Controller (UMC)
AT0 GPU | 8                  | 16 (2 per SA)       | 96 (6 per SE)      | 16 (32-bit?)
AT2 GPU | 4                  | 8 (2 per SA)        | 40 (5 per SE)      | 6 (32-bit?)
AT3 GPU | 2                  | 4 (2 per SA)        | 24 (6 per SE)      | 8 (16-bit?)
AT4 GPU | 1                  | 2 (2 per SA)        | 12 (6 per SE)      | 4 (16-bit?)

AT0 is said to feature 96 Compute Units, organized into 8 Shader Arrays. Each Shader Array would contain 2 Shader Engines (16 in total), with every Shader Engine including 6 Compute Units.
AMD-RDNA5-AT0-KEPLER.jpg

The AT2 GPU is said to feature 40 Compute Units, with each Shader Engine containing 5 CUs. Its UMC (memory controller) count is estimated at six, suggesting a 192-bit memory bus, while AT0 is expected to use a 512-bit bus.
AMD-RDNA5-AT2-KEPLER.jpg


AT3, meanwhile, is rumored to have 24 Compute Units. Interestingly, it would have more UMCs than AT2, which may come from AT3 and AT4 being designed for LPDDR5X memory, which requires more controllers. AT4 could have 12 CUs, with each memory controller on these two chips believed to be 16 bits wide.
AMD-RDNA5-AT3-KEPLER.jpg

AMD-RDNA5-AT4-KEPLER.jpg
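
The bus widths quoted in the article follow from multiplying the rumored UMC count by the guessed width of each controller. A minimal sketch of that arithmetic, assuming every controller on a die has the same width; the names `RUMORED_UMCS` and `bus_width_bits` are made up for illustration, and the per-controller widths are the table's own question-marked guesses (with the 40 CU die taken to be AT2, as in the article text):

```python
# Rumored UMC counts and per-controller widths in bits.
# These are the table's guesses (marked "?"), not confirmed specifications.
RUMORED_UMCS = {
    "AT0": (16, 32),
    "AT2": (6, 32),
    "AT3": (8, 16),
    "AT4": (4, 16),
}


def bus_width_bits(umc_count: int, bits_per_umc: int) -> int:
    """Total memory bus width, assuming all controllers are equally wide."""
    return umc_count * bits_per_umc


BUS_WIDTHS = {die: bus_width_bits(*cfg) for die, cfg in RUMORED_UMCS.items()}

for die, width in BUS_WIDTHS.items():
    print(f"{die}: {width}-bit")
```

Under these assumptions AT3 ends up with more controllers than AT2 but a narrower total bus, because its controllers are half as wide.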


https://videocardz.com/newz/amd-rdn...-gpu-with-512-bit-memory-bus-96-compute-units
 
Kepler called it a "schizo post".
I don't know what he means by that.

And Videocardz has it mixed up.
It's 2 Shader Arrays per Shader Engine and 6 CUs per Shader Array.
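
A quick sanity check of that hierarchy, as a rough sketch: it assumes the table's first column is really the Shader Engine count and that the per-array CU counts are the ones the table lists. The names `RUMORED_LAYOUT` and `total_cus` are hypothetical.

```python
SHADER_ARRAYS_PER_ENGINE = 2  # per the corrected hierarchy in this post

# die: (shader engines, CUs per shader array), all rumored values
RUMORED_LAYOUT = {
    "AT0": (8, 6),
    "AT2": (4, 5),
    "AT3": (2, 6),
    "AT4": (1, 6),
}


def total_cus(engines: int, cus_per_array: int,
              arrays_per_engine: int = SHADER_ARRAYS_PER_ENGINE) -> int:
    """CU count for a Shader Engine -> Shader Array -> CU hierarchy."""
    return engines * arrays_per_engine * cus_per_array


TOTAL_CUS = {die: total_cus(*cfg) for die, cfg in RUMORED_LAYOUT.items()}

for die, cus in TOTAL_CUS.items():
    print(f"{die}: {cus} CUs")
```

The totals come out the same as in the article's table, so only the labels of the two middle levels are swapped.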
 
What happened to the 152 CU AT0?
After reading through the AnandTech forum's RDNA 5 / UDNA (CDNA Next) speculation thread, I saw a post that says, "1 RDNA 4 WGP == 1 RDNA 5 CU".

So it seems that each CU in the diagram represents a WGP, but in RDNA5 the WGP (the 2 CUs) is probably now a single CU that can act as either 1 CU or 2 CUs.

That's how I read it now, based on AT2: RDNA always pairs CUs in a WGP, but AT2 has an odd count of 5 CUs.

I wonder if KeplerL2 can share some more on RDNA5 CUs.
 
Again, without clockspeed you can't even say that.
Kepler said he doesn't expect Magnus to be clocked below 3 GHz, and that is a 68 CU part that is likely power constrained. In addition, RDNA 2 and RDNA 3 saw a clock speed variance of ~10% or less between the high and low end parts, and we would expect roughly the same clock speed boost for going from N4P to N3P. So even in the worst case, we would expect clock speeds to be around where the 9070 XT already is, assuming the product is not heavily power constrained.

Now you're probably going to say that the architectural changes with RDNA 5 may make the chip clock lower, which is possible, but I don't recall that ever happening to any significant degree on AMD cards. The rumoured IPC boost is only 5%-10%, which would be wiped out by significantly slower clocks.
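
To put a number on "wiped out", here is a rough sketch. It assumes per-CU throughput scales simply as IPC times clock speed, which is a simplification and not something the rumors state; `break_even_clock_drop` is a made-up helper name.

```python
def break_even_clock_drop(ipc_gain: float) -> float:
    """Fractional clock reduction that exactly cancels a given IPC gain.

    Assumes throughput ~ IPC * clock, so we solve
    (1 + ipc_gain) * (1 - drop) = 1 for drop.
    """
    return 1.0 - 1.0 / (1.0 + ipc_gain)


for gain in (0.05, 0.10):
    drop = break_even_clock_drop(gain)
    print(f"IPC +{gain:.0%} is cancelled by a clock drop of {drop:.1%}")
```

Under that assumption, a 5% IPC gain is cancelled by a clock drop of roughly 4.8%, and a 10% gain by roughly 9.1%.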
 
It is really weird to have a 96 CU GPU, then nothing, and then a 40 CU GPU. They should have 84, 72, or 64 CU GPUs at least.
The chart goes from AT0 to AT2, so there is a missing AT1.

Edit: If they don't want to do a unique die for an 80 class card, they could just do a cut-down version of AT0 as well. The CUs are arranged in groups of 12, so a cut-down version would presumably have a CU count that is a multiple of 12 as well. 72 (12x6) would be perfect, but probably too good.
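
To make the multiples-of-12 idea concrete, a small sketch that lists the possible cut-down CU counts. It assumes whole groups of 12 are disabled at a time, which is my guess rather than anything confirmed; `cut_down_options` is a made-up helper name.

```python
FULL_AT0_CUS = 96   # rumored full-die CU count
CU_GROUP_SIZE = 12  # assumed granularity: CUs disabled in whole groups of 12


def cut_down_options(full_cus: int, group_size: int) -> list[int]:
    """CU counts reachable by disabling one or more whole groups."""
    total_groups = full_cus // group_size
    return [groups * group_size for groups in range(total_groups - 1, 0, -1)]


AT0_OPTIONS = cut_down_options(FULL_AT0_CUS, CU_GROUP_SIZE)
print(AT0_OPTIONS)
```

With that granularity, 84, 72 and 60 are all reachable, while 64 is not.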
 
Kepler said he doesn't expect Magnus to be clocked below 3 GHz, and that is a 68 CU part that is likely power constrained. In addition, RDNA 2 and RDNA 3 saw a clock speed variance of ~10% or less between the high and low end parts, and we would expect roughly the same clock speed boost for going from N4P to N3P. So even in the worst case, we would expect clock speeds to be around where the 9070 XT already is, assuming the product is not heavily power constrained.

Now you're probably going to say that the architectural changes with RDNA 5 may make the chip clock lower, which is possible, but I don't recall that ever happening to any significant degree on AMD cards. The rumoured IPC boost is only 5%-10%, which would be wiped out by significantly slower clocks.

I just wanted you to be explicit about your assumptions. Your take is much more reasonable when fleshed out the way you have here. Thanks.
 
After reading through the AnandTech forum's RDNA 5 / UDNA (CDNA Next) speculation thread, I saw a post that says, "1 RDNA 4 WGP == 1 RDNA 5 CU".

So it seems that each CU in the diagram represents a WGP, but in RDNA5 the WGP (the 2 CUs) is probably now a single CU that can act as either 1 CU or 2 CUs.

That's how I read it now, based on AT2: RDNA always pairs CUs in a WGP, but AT2 has an odd count of 5 CUs.

I wonder if KeplerL2 can share some more on RDNA5 CUs.
MI400 deprecated the CU/WGP distinction; now only CU mode is supported, with WGP-sized structures.
 
It is really weird to have a 96 CU GPU, then nothing, and then a 40 CU GPU. They should have 84, 72, or 64 CU GPUs at least.
They already have 4 chips, which is double what they have now. They're probably very confident in the 40 CU part being capable of competing with the 6070. The 96 CU part will be cut down in 3 ways, so that's 3 cards from 1 chip, like the 7900 XTX.
 
MI400 deprecated the CU/WGP distinction; now only CU mode is supported, with WGP-sized structures.
Is this only for CDNA5, or does it also apply to RDNA5?

Just wondering cause MLiD doubled down on his leak.

I would assume both are correct: the full AT0 die would be 96/192 if we take the CU/WGP distinction into account.
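
The 96/192 figure is just the 2:1 ratio applied to the full die. A tiny sketch, assuming the "1 RDNA 4 WGP == 1 RDNA 5 CU" claim quoted earlier in the thread holds and that a WGP is two old-style CUs; `legacy_cu_equivalent` is a made-up helper name.

```python
CUS_PER_RDNA4_WGP = 2  # RDNA pairs two CUs into one WGP


def legacy_cu_equivalent(rdna5_cus: int) -> int:
    """RDNA 4-style CU count, assuming 1 RDNA 5 CU == 1 RDNA 4 WGP."""
    return rdna5_cus * CUS_PER_RDNA4_WGP


print(legacy_cu_equivalent(96))  # full AT0 die in old-style CUs
```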
 
They already have 4 chips, which is double what they have now. They're probably very confident in the 40 CU part being capable of competing with the 6070. The 96 CU part will be cut down in 3 ways, so that's 3 cards from 1 chip, like the 7900 XTX.
I'm interested in knowing in what ways the AT0 chip will be cut down. I guessed that they could do a 60 CU card or a 72 CU card, but I have no real idea.
 