Microsoft Xbox Series X's AMD Architecture Deep Dive at Hot Chips 2020

M1chl

mU23LQCtEe9ePVZe3DpU7S-650-80.jpg.webp


So Zen 2 with 8MB of L3 cache, which is not that much. The CPU is probably around the level of the AMD Ryzen 7 PRO 4750G.

qxkCgRwh6Y5oJ5wLDtgA9L-650-80.jpg.webp


fqvK7bgMNGxQdNKNnHKZHQ-1366-80.png


Seems like it's not going to be cheap:

u6DpdwnZaBFNAYjjjMijVN-650-80.jpg


Seems like MS have some Audio secret sauce too:
GkrDv6ja9TiDYz3incXznN-970-80.jpg.webp


More slides here:
 
Can't.....help......but......feel.....like.....posting.....a.....Craig......meme.......ughhhhh!!!

For real though, I'm interested to see someone break the presentation down into layman's terms so we can understand the gist of the power :messenger_ok:
 
Interesting that they are still using PCIe Gen 4 but cut down to 8 lanes instead of the full-fat 16 lanes. I think most people assumed they were going with PCIe Gen 3 x16.

For comparison, the 3700X has a total L3 cache of 32MB (16MB per CCX).
 
I would love to see how this differs from the PS5.

Well at least on the audio we know PS5's Tempest Engine is a reworked CU so roughly 285 GFLOPs raw performance there and able to take up to 20 GB/s of system bandwidth

OTOH Series X's audio solution being equivalent to One X CPU would put it at around 148 GFLOPs if this chart is to be believed. That's raw numbers, anyway. They say it's actually greater though, so it would be at least 150 GFLOPs of audio performance here, and then you also need to take architectural gains into account. I don't know the IPC gains Zen 2 has over Jaguar, but assuming it's same as RDNA1 over GCN, that'd be a 50% IPC gain, so it would be equivalent to 222 - 225 GFLOPs of the One X CPU for their audio solution (on the low end, it could be more than that in actuality).

So not quite as capable as Sony's but not far off, either. They're fairly close to one another, and that's just from what I could analyze very quickly.
 
The main question is : Can it run Resident Evil VIII at 4k?

I refuse to believe there is a computer on the earth that can run it above 15fps, we've seen all the evidence we need to see to say the PS5 is shite. Case. Closed.

Until the XSX comes out of course, then we'll see what 120hz Resident Evil looks like. Probably 240-360hz, given the awesome power of the SX.
 
Interesting that they are still using PCIe Gen 4 but cut down to 8 lanes instead of the full-fat 16 lanes. I think most people assumed they were going with PCIe Gen 3 x16.

There's no need, NVMe SSDs are only 4 lanes each and there are up to two of them in the system. But yes, they potentially could in future use faster SSDs but they could end up being bottlenecked by other bits of the hardware.
 
Interesting that they are still using PCIe Gen 4 but cut down to 8 lanes instead of the full-fat 16 lanes. I think most people assumed they were going with PCIe Gen 3 x16.

For comparison, the 3700X has a total L3 cache of 32MB (16MB per CCX).

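PCIe 4.0 makes better sense because it doubles the per-lane transfer rate, 16 GT/s instead of Gen 3's 8 GT/s, so 8 lanes of Gen 4 carry about the same bandwidth as 16 lanes of Gen 3. The encoding is the same 128b/130b scheme in both; the higher-overhead 8b/10b encoding was Gen 1 and Gen 2.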
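For anyone who wants to check the lane math, here is a rough sketch using the published per-lane transfer rates and line encodings for each PCIe generation. The function and table names are my own, and the result is raw line rate in one direction, so real-world throughput after protocol overhead will be lower.

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency by PCIe generation.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),  # 128b/130b encoding
}


def pcie_bandwidth_gbps(gen: int, lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s, ignoring protocol overhead."""
    transfer_rate, efficiency = PCIE_GENERATIONS[gen]
    return transfer_rate * efficiency * lanes / 8  # 8 bits per byte


if __name__ == "__main__":
    for gen, lanes in [(3, 16), (4, 8), (4, 4)]:
        print(f"Gen {gen} x{lanes}: {pcie_bandwidth_gbps(gen, lanes):.2f} GB/s")
```

By this estimate Gen 4 x8 and Gen 3 x16 land on the same figure, and a single x4 NVMe link on Gen 4 gets half of that.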
 
Hopefully Sony do this deep dive sometime next week. I can't see it happening until September now though, with Gamescom so close. I hate having no E3 and I hate covid. We would normally know everything by now.
 
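PCIe 4.0 makes better sense because it doubles the per-lane transfer rate, 16 GT/s instead of Gen 3's 8 GT/s, so 8 lanes of Gen 4 carry about the same bandwidth as 16 lanes of Gen 3. The encoding is the same 128b/130b scheme in both; the higher-overhead 8b/10b encoding was Gen 1 and Gen 2.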

Yeah, I think most people saw that the peak I/O numbers were around the cap of PCIe Gen 3 NVMe drives, so they assumed that must be the interface being used. But it makes more sense for them to go with Gen 4, especially considering Zen 2 supports Gen 4 directly from the CPU.

Anyone know what the difference is between the scalable data fabric and the Infinity Fabric we normally see on Zen 2 CPU block diagrams? Or is it just a different name for the same thing?
 
Well at least on the audio we know PS5's Tempest Engine is a reworked CU so roughly 285 GFLOPs raw performance there and able to take up to 20 GB/s of system bandwidth

OTOH Series X's audio solution being equivalent to One X CPU would put it at around 148 GFLOPs if this chart is to be believed. That's raw numbers, anyway. They say it's actually greater though, so it would be at least 150 GFLOPs of audio performance here, and then you also need to take architectural gains into account. I don't know the IPC gains Zen 2 has over Jaguar, but assuming it's same as RDNA1 over GCN, that'd be a 50% IPC gain, so it would be equivalent to 222 - 225 GFLOPs of the One X CPU for their audio solution (on the low end, it could be more than that in actuality).

So not quite as capable as Sony's but not far off, either. They're fairly close to one another, and that's just from what I could analyze very quickly.

Can you show us where you got this? Been ten minutes searching and it seems like you pulled numbers from nowhere and they all landed conveniently near each other.
 
It doesn't bode well for pricing when in a tech deep dive for the damn SoC they're damage controlling costs......

The optimist in me hopes it is just their way of explaining only a 2X increase in RAM and other Moore's Law brick walls. At least Flash looks like it should decrease in price YoY.
 
Did anyone spot the "coherency" mention in the chip diagram? I thought that was a unique thing to PS5 but obviously not.
 
-The GPU slides confirm it's using a Mesh Shading Geometry Engine. So yeah, it is of course still using a Geometry Engine, if that wasn't clear before. But it seems maybe the mesh shading functions have been integrated into it?

-VRS giving between 10% - 30% performance gains

-SFS has Tile Residency and Tile Request maps

-SFS can give up to 60% I/O and memory footprint savings

-GPU has custom RT and Ray-Triangle units. These appear to be custom hardware added to the GPU, outside of the CUs (I'm speculating).

-Shaders can run in parallel for BVH traversals, material shading etc.

-ML Inference Acceleration (additional bits of hardware on the GPU from the sounds of it)

Overall not quite as detailed on some things as I'd like but it does give some new bits of info and a bit of clarification on other things.

Did anyone spot the "coherency" mention in the chip diagram? I thought that was a unique thing to PS5 but obviously not.

Where was that mentioned? I was skimming through these pretty quick so might've missed the mention.

Can you show us where you got this? Been ten minutes searching and it seems like you pulled numbers from nowhere and they all landed conveniently near each other.

In Road to PS5, Cerny said the Tempest Engine is about equivalent to a single CU. PS5 is 10.275 TFLOPs across 36 CUs, and 10.275 TFLOPs / 36 = roughly 285 GFLOPs per CU. So that's roughly what the Tempest Engine is. They also said it can consume about 20 GB/s of memory bandwidth.
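A quick sketch of where the per-CU number comes from, if anyone wants to reproduce it. It assumes the usual RDNA figures of 64 shader lanes per CU and 2 FLOPs per lane per clock (one fused multiply-add), plus the 36 CU / 2.23 GHz PS5 numbers used in this thread. The function names are just illustrative.

```python
def gpu_tflops(cus: int, clock_ghz: float,
               lanes_per_cu: int = 64, flops_per_lane: int = 2) -> float:
    """Peak FP32 TFLOPs: CUs x lanes x FLOPs per lane per clock x clock (GHz)."""
    return cus * lanes_per_cu * flops_per_lane * clock_ghz / 1000


def per_cu_gflops(total_tflops: float, cus: int) -> float:
    """Split a total TFLOPs figure evenly across the CUs, in GFLOPs."""
    return total_tflops * 1000 / cus


if __name__ == "__main__":
    ps5 = gpu_tflops(cus=36, clock_ghz=2.23)
    print(f"PS5 peak: {ps5:.3f} TFLOPs")
    print(f"Per CU:   {per_cu_gflops(10.275, 36):.0f} GFLOPs")
```

This only gives the raw throughput of one standard CU at that clock. How the reworked Tempest CU actually compares is not something the arithmetic can tell you.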

The Jaguar FLOPs I had to go to an older GAF thread to find, I'm taking that person's numbers at face value but they break down One X's CPU FLOPs to about 148 GFLOPs. However MS have already used "raw" numbers before kind of underplaying their GPU capabilities (e.g Series X "2x GPU of One X", doesn't account for architecture changes, which actually puts it much higher than "just" 2x One X's GPU). RDNA1 IPC over GCN was 50%, RDNA2 IPC over RDNA1 is roughly 25%.

Assuming Zen 2's gains over the Jaguar architecture at least mirror RDNA1 over GCN, that puts Series X's audio solution at around 222 GFLOPs - 225 GFLOPs at least. However, I also remember MS saying they had "4x CPU performance" for Series X over One X and XBO. So as an extreme example, that would put Series X's audio solution at the equivalent of 592 GFLOPs of One X's CPU cluster. That might be the extreme end though, and I'm not nearly as sure about that figure as I am about the 222 GFLOPs - 225 GFLOPs one.

So another way of seeing the audio solutions would be PS5 as 285 GFLOPs of RDNA2 equivalent and Series X as 222 GFLOPs (or 225 GFLOPs) equivalent of GCN. However, it may be closer to 592 GFLOPs of GCN equivalent, taking MS's statement of Series X's CPU being "4x" that of One X's into account.
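The scaling above is just a baseline multiplied by an assumed uplift, which can be sketched like this. The 148 GFLOPs One X CPU baseline is the unverified figure from the older thread, and both uplift factors (1.5x and 4x) are assumptions from this post, not measured numbers.

```python
ONE_X_CPU_GFLOPS = 148.0  # unverified figure taken from an older thread


def scaled_equivalent(baseline_gflops: float, uplift: float) -> float:
    """Scale a baseline GFLOPs figure by an assumed architectural uplift."""
    return baseline_gflops * uplift


if __name__ == "__main__":
    # Assumed 50% uplift, applied to the raw and rounded-up baselines.
    print(scaled_equivalent(ONE_X_CPU_GFLOPS, 1.5))
    print(scaled_equivalent(150.0, 1.5))
    # Assumed "4x CPU performance" uplift as the extreme case.
    print(scaled_equivalent(ONE_X_CPU_GFLOPS, 4.0))
```

The spread between the low and high results shows how much the estimate depends on which uplift you pick.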
 
So they have an audio chip we didn't know about?

Other than that, sounds like some actual GPU customization, specifically regarding ray tracing?

Someone summarize what was unknown before vs. known plz lol
 
-The GPU slides confirm it's using a Mesh Shading Geometry Engine. So yeah, it is of course still using a Geometry Engine, if that wasn't clear before. But it seems maybe the mesh shading functions have been integrated into it?

-VRS giving between 10% - 30% performance gains

-SFS has Tile Residency and Tile Request maps

-SFS can give up to 60% I/O and memory footprint savings

-GPU has custom RT and Ray-Triangle units. These appear to be custom hardware added to the GPU, outside of the CUs (I'm speculating).

-Shaders can run in parallel for BVH traversals, material shading etc.

-ML Inference Acceleration (additional bits of hardware on the GPU from the sounds of it)

Overall not quite as detailed on some things as I'd like but it does give some new bits of info and a bit of clarification on other things.
Have you looked at the presentation slides on Tom's Hardware? There are more of them, but I did not want to litter the thread with pictures.
 
Now we can officially definitively calculate the raytracing performance of Xbox Series X


fqvK7bgMNGxQdNKNnHKZHQ-1366-80.png



XSX - 4 x 52 x 1.825 = 379.6 billion ray-triangle intersections per second (4 per CU per clock x 52 CUs x 1.825 GHz)

compared to PS5

PS5 - 4 x 36 x 2.23 = 321.12 billion per second
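The same formula as a small sketch, to make the units explicit: operations per CU per clock, times CU count, times clock in GHz gives billions per second. Applying the same 4 per CU per clock to PS5 is an assumption of this comparison, and the function name is made up for illustration.

```python
def peak_intersections_per_second(ops_per_cu_per_clock: int,
                                  cus: int, clock_ghz: float) -> float:
    """Peak intersection tests in billions per second.

    ops/CU/clock x CUs x GHz (billions of clocks per second).
    """
    return ops_per_cu_per_clock * cus * clock_ghz


if __name__ == "__main__":
    xsx = peak_intersections_per_second(4, 52, 1.825)
    ps5 = peak_intersections_per_second(4, 36, 2.23)  # assumes the same 4 ops/CU/clock
    print(f"XSX: {xsx:.2f} billion/s")
    print(f"PS5: {ps5:.2f} billion/s")
```

These are peak rates at full clock, so they say nothing about sustained ray tracing performance in a real workload.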
 
Is it normal to get this breakdown this early, or is Microsoft super confident with what they have?
I would say it's pretty late. This presentation could have come out much sooner if not for Covid, because the numbers have been known for a long time.
 
Now we can officially definitively calculate the raytracing performance of Xbox Series X


fqvK7bgMNGxQdNKNnHKZHQ-1366-80.png



XSX - 4 x 52 x 1.825 = 379.6 billion ray-triangle intersections per second (4 per CU per clock x 52 CUs x 1.825 GHz)

compared to PS5

PS5 - 4 x 36 x 2.23 = 321.12 billion per second


I could have sworn the darker spots in the middle of each green piece were moving back and forth for a second there lol
 
Damn, MS did put some friggin' amazing effort into designing this console. Kudos to them. Now put the same energy into creating exclusive games (I know they're doing that right now).
 
Have you looked at the presentation slides on Tom's Hardware? There are more of them, but I did not want to litter the thread with pictures.

Yeah, I looked through all the slides. There's stuff in there I don't understand of course, and some stuff I skipped through without even realizing (like T-Cake mentioning they brought up "coherency" in one of the slides).
 
Can anyone explain in layman's terms why one would go with 52 CUs at a lower clock speed rather than 36 CUs at higher clocks? Die size, yields and cost are the first topic of discussion in the slides, which signals their importance. What are the tradeoffs in both cases?
 
Well at least on the audio we know PS5's Tempest Engine is a reworked CU so roughly 285 GFLOPs raw performance there and able to take up to 20 GB/s of system bandwidth

OTOH Series X's audio solution being equivalent to One X CPU would put it at around 148 GFLOPs if this chart is to be believed. That's raw numbers, anyway. They say it's actually greater though, so it would be at least 150 GFLOPs of audio performance here, and then you also need to take architectural gains into account. I don't know the IPC gains Zen 2 has over Jaguar, but assuming it's same as RDNA1 over GCN, that'd be a 50% IPC gain, so it would be equivalent to 222 - 225 GFLOPs of the One X CPU for their audio solution (on the low end, it could be more than that in actuality).

So not quite as capable as Sony's but not far off, either. They're fairly close to one another, and that's just from what I could analyze very quickly.
We know how flops translate to image. Is there even a way to translate flops to sound.... "performance"? Or "quality"?
How different is 100gflops? (or 300? Or 500?)
To me it's a unit really hard to interpret in real life usage.
 
Have you looked at the presentation slides on Tom's Hardware? There are more of them, but I did not want to litter the thread with pictures.

Yeah I looked through them all, but did quick-scan a LOT for the moment. Some of the stuff mentioned I'm expecting others to catch or clarify.

We know how flops translate to image. Is there even a way to translate flops to sound.... "performance"? Or "quality"?
How different is 100gflops? (or 300? Or 500?)
To me it's a unit really hard to interpret in real life usage.

Higher-quality sounds and better clarity would be my go-tos. More crispness, and such.

Honestly except for audiophiles I don't think many will necessarily notice or care about the audio gains next-gen brings if you aren't a game dev. But I'm sure they will still be beneficial all around.
 