Next-Gen PS5 & XSX |OT| Console tEch threaD

This compression stuff needs further examination. Sony claims in excess of 20GB/s under high compression, so is the 9GB/s figure an average or the minimum? Also, I'm not certain what the 50% means from MS: average? Minimum? Max?

No clear cut apples vs apples here.
Have MS confirmed the 4.8 figure is only for BCPack, or is this an assumption on your end? Because I don't recall them specifying what compression tools (or combinations thereof) are providing some of those numbers.
Good point, BCPack is definitely included, which explains the 50% compression rate. Though it may be in combination with Zlib?
False.

It can't compress data 50% better than Kraken.
It is Kraken 30-40% vs BCPack 50%.
It is more like 10-20% better at best.
The 6GB/s is a theoretical max output from the XSX's hardware decompression block, just like 22GB/s is a theoretical max throughput of the PS5's hardware decompression unit.

Yes... hence 2.4GB/s->4.8GB/s
There might be edge cases where it goes over 4.8GB/s, just like the PS5 might go over 9GB/s, but these are the typical figures we are working with, not theoretical peaks.
Wrong again.

You try to downplay PS5 compression while at the same time overestimating the Xbox's.

There is no 7-9GB/s... stop making things up... it is 8-9GB/s, which is what Sony devs got in typical cases.
MS's 4.8GB/s figure already includes the use of BCPack... Zlib can't reach 50% compression, sorry.

That 6GB/s is the same as the PS5's 22GB/s... a best-case scenario with a very specific type of data.
The last few pages have been one big mess. People are confusing decompression speed with compression rates and missing a lot of info regarding how these consoles really work.

Both Sony and MS have a decompression block. Sony has a single-purpose decompression block which specializes in Kraken, while MS has two different decompression methods in their block: one is Zlib and the other is BCPack. There are two different metrics when we talk about these blocks, and they are confusing because both translate to GB/s. The first is how small the decompression method can make a piece of data. For instance, if you take a 10MB image and compress it with Kraken, it should result in roughly 6MB on average, while if you compress it using BC (not BCPack, because we don't have BCPack numbers yet) it would be around 5MB. Because the data becomes smaller, it can transfer faster. If we've compressed a 10MB file on one end to 5MB, transmitted it, and on the other side we decompress it again, we can transfer it in half the time, so that's where numbers like 8GB/s or 4.8GB/s are coming from. The other metric which also translates to GB/s is how fast the decompressor can decompress the data. If we look at our 5MB compressed file, decompressing it will take time. So for instance, if it takes 5 seconds to decompress the 5MB back to its original 10MB, our decompressor has a 2MB/s decompression rate. The decompression rate depends on four things - the compression algorithm (BCPack, Zlib, maybe Kraken?), the data type (image, sound, text?), the specifics of the data within its type (is the image full of black blobs or is it very noisy?) and how powerful the hardware that decompresses the data is (are we using a 2700K or a 9700K?).
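To put the two metrics side by side, here's a tiny Python sketch using only the illustrative numbers from the paragraph above (the 10MB/5MB file and the XSX's 2.4GB/s raw speed) - nothing measured, just the arithmetic:
Code:
# Two different GB/s figures fall out of the same compressed file.
raw_ssd_speed = 2.4      # GB/s the drive physically reads (XSX raw figure)
original_size = 10.0     # MB, the uncompressed texture
compressed_size = 5.0    # MB after compression (the illustrative ~2:1 example)

# Metric 1: effective transfer rate - smaller data means the same drive
# delivers more *uncompressed* megabytes per second.
ratio = original_size / compressed_size           # 2.0
effective_transfer = raw_ssd_speed * ratio        # 4.8 GB/s
print(effective_transfer)

# Metric 2: decompression rate - how fast the hardware turns compressed bytes
# back into usable ones (5 seconds for 10MB -> 2MB/s, as in the example).
decompression_rate = original_size / 5.0          # 2 MB/s
print(decompression_rate)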

So as you can see, there isn't really a fixed speed we can latch onto and say "PS5 SSD is 129.2525% faster"; every single frame, the data that flows off the SSD will be compressed with a different efficiency, a different speed. Even before I get into numbers, I think it's pretty obvious that the PS5 solution is faster - they just have too much raw bandwidth for MS to keep up with, so it doesn't really matter how anyone manipulates the numbers, even though we are talking about a very ambiguous field that's hard to slap a single number on.

So after my extremely long opening, I want to refer to what MS and Sony have said and how it probably translates to the real world. Sony is using Kraken; developers can use whatever compression they like, but if they want the PS5's decompression block to decompress the data instead of the CPU, they will use Kraken. Every compression method has edge cases, so a figure like "22GB/s" shouldn't be taken as anything other than a figure that Sony and Oodle got here and there when Kraken hit some perfect piece of data that got compressed by 75%, which quadruples the bandwidth as a result. Cerny told us the typical compression will result in around 8-9 GB/s. It means that some data will run through the bus at 6GB/s, other data at 14GB/s, but in the end, the average is around 8-9 GB/s.

MS, on the other hand, is a bit more all over the place; we have a few figures to talk about. The first one is "Over 6GB/s" - that's how fast the decompressor block can decompress data. It means that, in average usage, the decompressor block will decompress data at over 6GB/s. It doesn't mean the data was compressed 2.5x so the SSD bandwidth will hit 6GB/s+. It's like saying "my 9700K can decompress this file at 6GB per second" - it doesn't tell us anything about the size or how well compressed the file is (was the 10MB file compressed to 9MB? To 3MB? Who knows?), just how fast the hardware can decompress it. It also means that sometimes the block will decompress at 4GB/s and at other times 10GB/s; it depends on the data type and the other variables we talked about three paragraphs ago.

The second number MS gave us was 4.8GB/s - that's the equivalent of Cerny's 8-9 GB/s figure. That number is a combination of the speed of the decompression block (which acts as some form of ceiling on the whole decompression process), Zlib (the compression method all the data which isn't textures will be compressed with) and BCPack (the compression method all the textures will be compressed with). If we flatten the discussion to "BCPack can do 50%", then 4.8GB/s is basically a combination of the raw transfer rate of the SSD, the X% of the data which will use Zlib and the Y% of the data which will use BCPack. The reason MS's number is 2x the raw bandwidth isn't because BC has a 50% average compression ratio, but because some data will use Zlib, most data will use BCPack, and BCPack has better compression on average than BC's 50%.

So to sum it up, MS is expecting that on average data will be compressed by around 50%, yielding 4.8GB/s, while Sony is expecting that on average data will be compressed by 31-39%. It doesn't mean BCPack has 50% compression; it actually means it has better than 50%, because after you average it out with Zlib, which is sub-30% compression on average, we get 50% compression. The 6GB/s figure MS has thrown around should be discarded from this discussion, because it doesn't tell us much and we don't even have a number for the PS5's decompression block. BC can have edge cases that hit over 75% compression and BCPack is more efficient than BC, which means that some rare data on XSX will transfer at well over 10GB/s, but using that number would be bullshit, just like the 22GB/s number from an edge case on PS5.
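If you want to see how those percentages turn into the quoted GB/s figures, here's a rough sketch; the texture/non-texture split and the per-codec ratios on the XSX side are pure guesses for illustration, not anything MS has published:
Code:
def effective_bandwidth(raw_gbps, compressed_fraction):
    """Raw drive speed divided by how small the data became."""
    return raw_gbps / compressed_fraction

# Sony's typical case: data compressed by 31-39%, i.e. 61-69% of original size.
print(effective_bandwidth(5.5, 0.69))   # ~8.0 GB/s
print(effective_bandwidth(5.5, 0.61))   # ~9.0 GB/s

# XSX: a made-up mix of 70% BCPack textures and 30% Zlib general data.
texture_share, general_share = 0.70, 0.30
bcpack_fraction = 0.42    # hypothetical: BCPack shrinks textures to 42% of original
zlib_fraction = 0.72      # hypothetical: Zlib shrinks the rest to 72% of original
overall = texture_share * bcpack_fraction + general_share * zlib_fraction   # ~0.51
print(effective_bandwidth(2.4, overall))   # ~4.7 GB/s, in the ballpark of the quoted 4.8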

Such a long post, which can be summed up in one simple line: XSX has 4.8GB/s typical bandwidth while PS5 has 8-9 GB/s typical bandwidth. Everything else is just noise.
 
Unfortunately for Sony, this variable frequency feature isn't very marketable and not easy to understand, so of course trolls are trying to spin the following narratives:
  • the CPU and GPU can't reach peak frequencies at the same time;
  • the GPU (or CPU) can't reach the max frequency when its utilization is at 100%;
  • The GPU lowers the CPU frequency to steal power and reach higher frequencies.
All of which are false. People need to understand that Power utilization ≠ Frequency ≠ Percentage of GPU/CPU used.
 
XSX has 44% more RT units, but each RT unit on PS5 does 22% more work at any given time. This closes the gap to 18% (same as compute).
Dropping the resolution by 18% should free enough resources for PS5 to match the XSX output.
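A quick sanity check of that arithmetic, using the publicly stated unit counts and peak clocks (52 CUs at 1.825GHz vs 36 CUs at up to 2.23GHz - treat the PS5 number as its peak, since its clock is variable):
Code:
xsx_cus, xsx_clock = 52, 1.825   # GHz
ps5_cus, ps5_clock = 36, 2.23    # GHz (peak)

more_units = xsx_cus / ps5_cus - 1                                # ~0.44 -> 44% more units
more_work_each = ps5_clock / xsx_clock - 1                        # ~0.22 -> each PS5 unit ~22% faster
overall_gap = (xsx_cus * xsx_clock) / (ps5_cus * ps5_clock) - 1   # ~0.18 -> ~18% overall

print(f"{more_units:.0%}, {more_work_each:.0%}, {overall_gap:.0%}")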

I've been saying for ages that I think RT will run better on the PS5 because the logic favours higher clocks, and after writing down the words Mark used about RT, there is a quote that should certainly give pause to the idea that slow and wide is better.

Mark Cerny
"While the intersection engine is processing the requested ray-triangle ray-box intersections the shaders are free to do other work."
"Having said that, the ray tracing instruction is pretty memory intensive, so it's a good mix with logic heavy code."

Logic-heavy code reads to me as code that runs longer (more cycles), rather than fewer cycles - which is what would suit wider and slower.

The memory-intensive part also couples with his explanation of why he prefers faster clocks. When you scale the PS5 clock against the XSX clock, the faster L1 and L2 caches and command buffer on logic-heavy code - as opposed to lookup-heavy code, which is probably fine with memory being the clock-difference percentage of cycles further away - make it start to look like the PS5 GPU has the edge on RT (IMHO).

",there are a lot of other units and those other units all run faster when the GPU frequency is higher. At 33% higher frequency rasterization goes 33% faster."
"Processing the command buffer goes that much faster."
"The L2 and other caches have that much higher bandwidth, and so on."
"About the only downside is that system memory is 33% further away - in terms more cycles."
"But the large number of benefits more than counterbalanced that..."
 
SSD SSD SSD SSD....

The generation of:

PSSD5.

XSSDX.

:messenger_beaming:
 
This has come up a few times regarding the PS5. Even Jim mentioned a while back their biggest surprises were still under wraps.

What could it possibly be?



If I was to guess I would say it was most likely stuff related to the OS/user experience given Matt's previous role. It could be many different things though (Games, specs, other).

Show the games, Sony! Please.....
 
I've been saying for ages that I think RT will run better on the PS5 because the logic favours higher clocks, and after writing down the words Mark used about RT, there is a quote that should certainly give pause to the idea that slow and wide is better.

RT performance scales with TF; the intersection engine is inside the CU and follows the same clock speed. In addition, RT is also shader intensive, so it likes more TF, and RT is extremely parallel, which means it doesn't care much whether there are more CUs or a higher clock speed. On top of all of that, RT also loves memory bandwidth.

We don't know how each company has implemented or customized things, so it's too early to say (for instance, does XSX have 4MB of L2 or 5MB? If it follows the RDNA rules then PS5 has 4MB of L2 and XSX has 5MB, but we don't know how it works on RDNA2 and we certainly don't know what customization MS and Sony did), but if I had to guess, XSX has the RT advantage. On the other hand, because IMO games on PS5 will run at a lower resolution anyway (weaker GPU and less memory bandwidth), and RT scales really well with resolution, in the end both consoles will output the same regarding RT IMO, just at different resolutions.
 
"22GB/s" shouldn't be taken as anything other than a figure that Sony and Oodle got here and there when Kraken had hit some perfect piece of data
The first one is "Over 6GB/s", that's how fast the decompressor block can decompress data. It means that, on the average usage, the decompressor block will decompress data at over 6GB/s.
BC can have edge cases that hit over 75% compression and BCPack is more efficient than BC, which means that some rare data on XSX will transfer at well over 10GB/s, but using that number will be bullshit
Agree with most of what you've said; this part I'd like to go over:
Based on the wording used, I think both figures (over 6GB/s & 22GB/s) are the physical limit for the respective decompression blocks in super idealistic scenarios, i.e. the max amount of data they can decompress/output in one second. MS said over 6GB/s, so I think it's safe to assume under 7GB/s but higher than 6GB/s. For simplicity's sake let's round it up to 6.5GB/s.

My question is: how does this hypothetical 10GB/s number of yours work, considering it surpasses the physical limit (6.5GB/s) of the decompression unit?
So to sum it up, MS is expecting that on average data will be compressed around 50%, yielding 4.8GB/s while Sony is expecting that on average data will be compressed by 31% - 39%. It doesn't mean BCPack has 50% compression, it actually means it has better than 50% because after you average it out with Zlib which is sub-30% compression on average, we get 50% compression.
This is what I was wondering yesterday: how do we know if MS's 4.8GB/s is a mix of texture and general data (Zlib + BCPack) or just BCPack, to paint their solution in a better light?
 
The last few pages have been one big mess. People are confusing decompression speed with compression rates and missing a lot of info regarding how these consoles really work.

That is good for those who want to read it all.
Just to add: PS5 can use Zlib or Kraken for compression.
 
The last few pages have been one big mess. People are confusing decompression speed with compression rates and missing a lot of info regarding how these consoles really work.

Thanks DrKeo, sums everything up quite nicely.
 
RT performance scales with TF; the intersection engine is inside the CU and follows the same clock speed. In addition, RT is also shader intensive, so it likes more TF, and RT is extremely parallel, which means it doesn't care much whether there are more CUs or a higher clock speed. On top of all of that, RT also loves memory bandwidth.



Excellent post and it clears up a lot of the confusion over the decompression blocks.

Hopefully those who still struggle take the time to read your explanation as it's very well thought out.
 
Agree with most of what you've said; this part I'd like to go over:
Based on the wording used, I think both figures (over 6GB/s & 22GB/s) are the physical limit for the respective decompression blocks in super idealistic scenarios, i.e. the max amount of data they can decompress in one second. MS said over 6GB/s, so I think it's safe to assume under 7GB/s but higher than 6GB/s. For simplicity's sake let's round it up to 6.5GB/s.

My question is: how does this hypothetical 10GB/s number of yours work, considering it surpasses the physical limit (6.5GB/s) of the decompression unit?
MS actually didn't call the "over 6GB/s" figure a maximum; here's the quote:
Our second component is a high-speed hardware decompression block that can deliver over 6GB/s. This is a dedicated silicon block that offloads decompression work from the CPU and is matched to the SSD so that decompression is never a bottleneck.
When Cerny talked about the 22GB/s, he talked about the throughput of the SSD in a very specific edge case. On MS's side, they are talking about the "over 6GB/s" regarding the power of the decompression block. The decompression rate also changes depending on the data. So IMO, from how they worded it, the decompression block probably hits high highs when it gets data that is easy to work with and low lows when it gets data that is not, and in the end it operates at an average of ~6GB/s. It doesn't mean data will flow from the SSD at 6GB/s, just that the decompression block receives the data on the other side of the pipe and decompresses it at 6GB/s. So certain data might flow at 7GB/s with the decompressor block at 9GB/s, so the end result is 7GB/s; or, on the other hand, data might flow at 5GB/s but the decompressor block will only work at 3.5GB/s, so the end result will be 3.5GB/s.
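Put another way, for any given chunk of data the end-to-end rate is just the slower of the two stages; a one-liner using the numbers from the examples above:
Code:
def end_to_end(data_flow_gbps, decompressor_gbps):
    # whichever stage is slower bounds the whole pipe
    return min(data_flow_gbps, decompressor_gbps)

print(end_to_end(7.0, 9.0))   # 7.0 - the decompressor keeps up, the flow is the limit
print(end_to_end(5.0, 3.5))   # 3.5 - the decompressor is the bottleneck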

MS didn't really do a deep dive or tell us about super high-speed edge cases; Cerny did, but as an anecdote. It doesn't really matter, because an edge case is, well, an edge case, which sounds cool when you are impressing people with how amazing your SSD is, but it's not really a figure we should use or care about. Even with 4.8GB/s and 8GB/s, Sony still has almost a 70% advantage, which is HUGE, and the real numbers might even be higher.

This is what I was wondering yesterday: how do we know if MS's 4.8GB/s is a mix of texture and general data (Zlib + BCPack) or just BCPack, to paint their solution in a better light?
BC has 50% average texture compression, but BCPack has a higher compression ratio than that (this guy works on the XSX tools design team):


So if MS wanted to bullshit numbers using BCPack alone, they would have used more than 4.8GB/s.
 
RT performance scales with TF; the intersection engine is inside the CU and follows the same clock speed. In addition, RT is also shader intensive, so it likes more TF, and RT is extremely parallel, which means it doesn't care much whether there are more CUs or a higher clock speed. On top of all of that, RT also loves memory bandwidth.

I think the XSX memory setup is going to be a real problem for efficiently scheduling RT workloads compared to the PS5, combined with inferior I/O for getting the acceleration structures into RAM and the logic-heavy work the WGPs will do for RT (going by Mark's words); then factor in a faster command processor that will also help the shaders and get through redundant work faster. Mark also implied hundreds of millions of rays had a modest impact on the GPU in the title he said he saw (GI, shadows and reflections with complex animation), with only full RT (path tracing) at billions being left on his slide. The XSX not managing Minecraft RT at a locked 1080p60 with very simple geometry gives an idea of how things might shape up on the XSX side.
 
This has come up a few times regarding the PS5. Even Jim mentioned a while back their biggest surprises were still under wraps.

What could it possibly be?


I've said this before many times.

There are other reasons for Sony's radio silence and slow drip-feeding of info on the PS5, and it has nothing to do with yet-to-be-released PS4 titles like The Last of Us 2 or Ghost of Tsushima.

Also, devs are very quiet regarding anything PS5. Sony are holding all their cards mighty close to their chest, and I suspect this is going to result in a massive blowout. The question is when this will happen, because at the moment all I want to see are the games. I'm so over the tech talk now. The last few pages have been nothing but an SSD wankfest, which has made me so physically ill that I've stayed away from this thread.
 
DrKeo

Good point indeed, compression speed is very different from decompression speed... decompression is generally faster and is what MS and Sony are referring to when they say 4.8GB/s and 8-9GB/s... that is reasonable because devs can take all the time they need to compress the data that gets saved to the SSD.

Now decompression is the critical point, because it needs to be done at render time.

Probably both Kraken and BCPack can compress far better, but then the decompression pass would take too much time when you need the data ready to be used... so what dictates how much compression devs will use is how much time they have to decompress, and therefore how fast the data can be decompressed.
 
The XSX not managing Minecraft RT at a locked 1080p60 with very simple geometry gives an idea of how things might shape up on the XSX side.


That makes me worry about the future of Ray Tracing on these systems. While it is impressive that it's implemented in Minecraft, Minecraft isn't a very advanced game to begin with. What I want to see is a demo like that for a much more advanced game. I'm thinking about Gears 6 running at 4K 60FPS with all the ray tracing features in it. Unfortunately I don't believe these consoles can handle something like that.
 
 
That makes me worry about the future of Ray Tracing on these systems. While it is impressive that it's implemented in Minecraft, Minecraft isn't a very advanced game to begin with. What I want to see is a demo like that for a much more advanced game. I'm thinking about Gears 6 running at 4K 60FPS with all the ray tracing features in it. Unfortunately I don't believe these consoles can handle something like that.
We are still some way off from full path tracing at 4K, in all likelihood. Path-traced 1080p30 or 60 for AAA games would still be the promised land for this coming gen, IMHO.
 
MS actually didn't call the "over 6GB/s" figure a maximum, that's the quote:
When Cerny talked about the 22GB/s, he talked about the throughput of the SSD at a very specific edge case. On MS's side, they are talking about the "over 6GB/s" regarding the power of the decompression block.
If you look at their wording it isn't much different
Andrew Goossen said:
high-speed hardware decompression block that can deliver over 6GB/s
Mark Cerny said:
The unit itself is capable of outputting as much as 22GB/s
Both were specifically talking about the capabilities of the decompression block; the only difference is MS used a vague in-between 6GB/s & 7GB/s figure. If their hardware block was capable of outputting 7GB/s they'd no doubt use that figure when talking about the block's capabilities. It wasn't an average either.
The decompression rate also changes depending on the data.
It doesn't mean data will flow from the SSD at 6GB/s
No argument here, agree
It doesn't really matter because an edge case is, well, an edge case which sounds cool when you are impressing people with how amazing your SSD is but it's not really a figure we should use or care about.
Yeah, for sure, it's mostly a nitpick from me.
 
I think the XSX memory setup is going to be a real problem for efficiently scheduling RT workloads compared to the PS5, combined with inferior I/O for getting the acceleration structures into RAM and the logic-heavy work the WGPs will do for RT (going by Mark's words); then factor in a faster command processor that will also help the shaders and get through redundant work faster. Mark also implied hundreds of millions of rays had a modest impact on the GPU in the title he said he saw (GI, shadows and reflections with complex animation), with only full RT (path tracing) at billions being left on his slide. The XSX not managing Minecraft RT at a locked 1080p60 with very simple geometry gives an idea of how things might shape up on the XSX side.
MS actually has the advantage in I/O regarding the BVH, because the BVH is built in real time and updated every few frames, so it has nothing to do with the SSD; the BVH structure is pure GPU and memory work. That's actually part of the advantage - you keep the BVH in the 560GB/s 10GB pool and MS has the advantage. I actually don't see a single advantage for the PS5 in RT right now (at least as long as we don't know more specifics; I don't know what customization both did or what changes RDNA2 brought over RDNA1) except for the faster cache. Regarding computational load, RT is actually a very light computational load done billions of times per second; it's basically "death by a thousand cuts" for the GPU.

Regarding Minecraft, in case you haven't seen it on PC, it's the heaviest RT app around right now and it's a fully path-traced game, not just one effect. Just to compare, Metro Exodus does GI using a single bounce while Minecraft does 8 bounces, and on top of that a million other RT things, because the game is 100% ray traced. So yeah, Minecraft isn't exactly a looker, but the RT work it requires is heavier than any AAA game's implementation of a single RT effect.

It seems from your post that you expect the PS5 to be worlds apart from the XSX regarding RT, even though the XSX has almost every advantage in the book regarding RT (shader compute, cache size, memory bandwidth, and intersection engines). So you should ready yourself for disappointment: not only will the PS5 not have a huge advantage over the XSX regarding RT, but IMO, from what we know right now, the XSX probably has the advantage. And just to be clear, I'm talking about something like 15%, which will be insignificant anyway because IMO 3rd party games will probably run at a slightly lower resolution on PS5.

If you look at their wording it isn't much different


Both were specifically talking about the capabilities of the decompression block; the only difference is MS used a vague in-between 6GB/s & 7GB/s figure. If their hardware block was capable of outputting 7GB/s they'd no doubt use that figure when talking about the block's capabilities.
I just looked at the Cerny video again and you are right - his 22GB/s remark was regarding the final unit output. But still, that's an edge case for the decompression unit; it's not comparable to the 6GB/s MS figure. BC can, for instance, compress some textures at a 1:4 ratio, which would result in a 9.6GB/s output from the XSX SSD, yet MS didn't talk about that figure (which is probably higher, because BCPack improves upon BC). The big difference is that MS gave us the average compressed data rate (4.8GB/s) and the average decompression speed (over 6GB/s), while Sony gave us the average compressed data rate (8-9 GB/s) and an edge case for the decompressor (22GB/s). We don't have an edge case for the XSX decompressor, but does it really matter considering it's an edge case?
 
Unfortunately for Sony, this variable frequency feature isn't very marketable and not easy to understand, so of course trolls are trying to spin the following narratives:
  • the CPU and GPU can't reach peak frequencies at the same time;
  • the GPU (or CPU) can't reach the max frequency when its utilization is at 100%;
  • The GPU lowers the CPU frequency to steal power and reach higher frequencies.
All of which are false. People need to understand that Power utilization ≠ Frequency ≠ Percentage of GPU/CPU used.
All of what you said is true. But there is an additional one that is true and you didn't list, and that is;
  • The GPU AND CPU can't both reach the max frequency when the workload of both is at 100%

In reality that doesn't happen often though. But it does happen from time to time.
 
All of what you said is true. But there is an additional one that is true and you didn't list, and that is;
  • The GPU AND CPU can't both reach the max frequency when the workload of both is at 100%

In reality that doesn't happen often though. But it does happen from time to time.

I thought Cerny had a quote directly refuting this? He said something like "it's not the case that for one of the CPU or GPU to run high, the other one has to throttle down in balance. In theory they could both run maxed out".
 
^ Thing is, the world has been a captive audience for a month+ now and you would think it would be taken full advantage of!
A few things that don't sit well with me.

- Companies' worldwide ability to function/work
- The climbing death toll
- Uncertainty in the global market

The above are obviously due to Covid-19.

Yes, you have a much bigger audience now as lots of people are staying home, but would it be morally correct to expect people to be hyped for a games console considering the above issues?
 
I'll repost this, as there appears to be some confusion regarding the underlying technology behind the PS5's variable frequency & AMD's generic SmartShift:
PS5
  • The SoC has a fixed power budget (300W); the GPU & CPU each have a power budget within the SoC. Say the reference values are GPU 250W & CPU 50W
  • SmartShift only diverts power to the GPU if there are leftover CPU resources while the CPU is at 3.5GHz and the GPU is exceeding its power budget. If the CPU is only using 30W, 20W is diverted to the GPU -> 270W
  • There's a key difference in how SmartShift operates on PS5: it's limited to scenarios in which there are leftover CPU resources, so the CPU is never cut back just to feed the GPU (see the sketch below)
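Purely as an illustration of that budget logic (the wattages are the hypothetical reference values from the bullets above, not real figures):
Code:
def smartshift_gpu_power(cpu_draw_w, cpu_budget_w=50, gpu_budget_w=250):
    """Toy model: unused CPU budget is handed to the GPU; the CPU is never
    cut below what it is actually drawing."""
    leftover = max(cpu_budget_w - cpu_draw_w, 0)
    return gpu_budget_w + leftover

print(smartshift_gpu_power(30))   # CPU using 30W of its 50W -> GPU may draw up to 270W
print(smartshift_gpu_power(50))   # CPU using its full budget -> GPU stays at 250W
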
Let me piggyback on your post with an actual illustration of what SonGoku is talking about. Some people are under the impression that when a console or PC is running a game, 100% of the processing capability is used. This couldn't be further from the truth. For example, a Ryzen 3700X and RTX 2080S, which is the equivalent of what the next-gen consoles are competing against.

Battlefield V - Look at the CPU utilization: it hardly reaches 50% while GPU utilization is at the upper bound of 90%. This is just an AGGREGATE of CPU and GPU ACTIVITY, and if you go even lower, some parts of the GPU don't even see 60% utilization. This is what Sony is exploiting with their variable frequency strategy: they monitor each frame, look at the workload the CPU and GPU are doing, and dynamically scale the clock speed as well as the power balance based on what the GPU and CPU are actually doing, instead of using the old strategy of locking the clock lower. This means they can push the GPU harder than usual, because the CPU can be downclocked to maintain stable operation. No game or application ever uses 100% of the CPU and GPU.


Fortnite - The utilization is even lower. The CPU hardly touches 40% and the GPU is below 70%.

 
BC can, for instance, compress some textures at a 1:4 ratio, which will result in a 9.6GB/s
Ok, so you are saying it has a higher chance of getting closer to 6+GB/s than the PS5 has of getting to 22GB/s. That's reasonable, but would they go much further to the detriment of IQ? (BCPack is lossy, no?)
My only objection is they won't compress further than the decompression block can handle, ~6.5GB/s.

) and the average decompression speed (over 6GB/s)
I don't believe it's an average. If their hardware block was capable of outputting 7GB/s, or hell, even 9.8GB/s, they'd no doubt use that figure when talking about the block capabilities instead of over 6GB/s.
We don't have an edge case for the XSX decompressor but does it really matter considering it's an edge case?
No, I don't think so; I just brought it up because the compression rate you mentioned surpassed what I believe to be the decompression block limit.
 
These are not utilization numbers. These are "busy" numbers for both CPU and GPU.
Meaning it's not idle. Any task that GPU or CPU is doing is counted as busy, even waiting for memory.
Real utilization numbers are much much lower than these.
I noted that
This is just an aggregate and if you go even lower, some parts of the GPU don't even see 60% utilization.
 
A few things that don't sit well with me.

- Companies' worldwide ability to function/work
- The climbing death toll
- Uncertainty in the global market

The above are obviously due to Covid-19.

Yes, you have a much bigger audience now as lots of people are staying home, but would it be morally correct to expect people to be hyped for a games console considering the above issues?

Of course the Covid19 situation is serious but at the end of the day both Sony and Microsoft have said their plans for launch haven't changed yet.

Would you prefer they both come out and cancel all next-gen related activities until next year for example?
 
I thought Cerny had a quote directly refuting this? He said something like "it's not the case that for one of the CPU or GPU to run high, the other one has to throttle down in balance. In theory they could both run maxed out".
His whole premise is that the workload on the GPU and CPU varies constantly, and rarely are they both at max workload at the same time, so both can achieve the max clocks most of the time, and the power will be delivered in proportion to the required workload. In the rare instance that both are at 100% workload, one of them has to throttle down the clocks by a couple of percent to reduce power by double-digit percentages.

For the XSX, the clocks are constant, and the workload and power delivery vary: the higher the workload, the higher the power consumption. Cerny was talking about how, when designing a console, you have to 'guess' what the workload will be and design your cooling around that. That's the classic approach and the approach Microsoft has taken with the Xbox. Microsoft definitely has not designed the cooling of the XSX to handle both the CPU and GPU at 100% workload all the time, but an expectation of, for example, around 80% CPU workload and 90% GPU workload on average is quite reasonable. I guess that's why the XSX cooling is so beefy.

Cerny mentioned how they had trouble keeping things cool at a constant 3 GHz CPU clock and a constant 2 GHz GPU clock. But by using this tech, they can control the power consumption, creating less heat, and thus achieve higher clocks than if they went the same route as Microsoft. So technically, the PS5 design is more efficient in terms of power and in terms of getting the most out of the hardware.

People have been saying that the PS5 will downclock the one that is idle, but that is not how it works. Basically every piece of hardware out there, including your PC, consoles and phones, drops clocks when there is a low workload. The most important thing Cerny mentioned regarding this subject is that the way they are doing things is different from how things have traditionally been done in other hardware; they flipped things on their head with the PS5. The idea he described is to have high clocks at "low" workload, and drop the clocks when there is a high workload that would exceed the pre-defined power and cooling budget they set for themselves. That will rarely happen (for now).
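A toy sketch of the two approaches described above - the power model and every number in it are invented just to show the control flow; this is not Sony's or Microsoft's actual algorithm:
Code:
def fixed_clocks(workload, clock=1.825):
    # classic approach: the clock never moves, power (and heat) follow the workload
    power = workload * clock ** 2          # invented power model
    return clock, power

def fixed_power_budget(workload, max_clock=2.23, budget=9.0):
    # PS5-style approach: spend a fixed power budget and trim the clock
    # only when the current workload would push power past that budget
    clock = max_clock
    while clock > 0 and workload * clock ** 2 > budget:
        clock -= 0.01                      # shave a little frequency at a time
    return round(clock, 2), workload * clock ** 2

for w in (1.2, 1.6, 2.0):                  # light -> heavy per-frame workloads
    print(fixed_clocks(w), fixed_power_budget(w))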
 
This means that they can push the GPU harder than usual because the CPU can be downclocked to maintain stable operation
I'd just like to add that if the CPU is downclocked to prioritize the GPU, that isn't SmartShift but performance profiles at play.
SmartShift won't downclock the CPU, because it only hands over unused CPU power.

Also, I would advise against using CPU utilization metrics from current-gen games, since they were designed to run on Jaguar CPUs.
Any task that GPU or CPU is doing is counted as busy, even waiting for memory.
How is this busy metric translated to the percentages shown on the display? To put it simply, what's the difference between 50% and 98% busy?
 
Of course the Covid19 situation is serious but at the end of the day both Sony and Microsoft have said their plans for launch haven't changed yet.

Would you prefer they both come out and cancel all next-gen related activities until next year for example?
No I don't say they should cancel their launch plans.

The current worldwide situation has obviously had an impact on their messaging regarding next gen. You already heard yesterday from a reliable insider that announcements have been moved out of E3 week to later in the year and some have been moved up to before E3.
 
I've said this before many times.

There are other reasons for Sony's radio silence and slow drip-feeding of info on the PS5, and it has nothing to do with yet-to-be-released PS4 titles like The Last of Us 2 or Ghost of Tsushima.

Also, devs are very quiet regarding anything PS5. Sony are holding all their cards mighty close to their chest, and I suspect this is going to result in a massive blowout. The question is when this will happen, because at the moment all I want to see are the games. I'm so over the tech talk now. The last few pages have been nothing but an SSD wankfest, which has made me so physically ill that I've stayed away from this thread.

Agreed, on the recycled tech talk at least, but I'm still interested in the unknown. Hopefully next month we receive more info.
 
Ok, so you are saying it has a higher chance of getting closer to 6+GB/s than the PS5 has of getting to 22GB/s. That's reasonable, but would they go much further to the detriment of IQ? (BCPack is lossy, no?)
My only objection is they won't compress further than the decompression block can handle, ~6.5GB/s.


I don't believe it's an average. If their hardware block was capable of outputting 7GB/s, or hell, even 9.8GB/s, they'd no doubt use that figure when talking about the block capabilities instead of over 6GB/s.

No, I don't think so; I just brought it up because the compression rate you mentioned surpassed what I believe to be the decompression block limit.
Decompression rate is different from compression ratio, and I think we are confusing the two. There are three steps to the process (let's use the PS5 with 8GB/s):
1) The developer takes a 10MB texture and compresses it to 6.875MB (31.25% compression) using Kraken. This happens while the game is being developed, and that 6.875MB compressed texture is placed in the game install and will later sit on the player's SSD.
2) The game needs that texture, so it whizzes from the SSD at 5.5GB/s until it reaches the PS5's Kraken decompression block.
3) The decompression block receives the 6.875MB of texture data, decompresses it at some rate higher than 8GB/s, and writes it to the GDDR6 memory at 8GB/s.
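Plugging those three steps together (the 10MB texture, 31.25% compression and the 5.5GB/s raw read speed, assuming the Kraken block keeps up):
Code:
original_mb = 10.0
compressed_mb = original_mb * (1 - 0.3125)   # 6.875 MB sitting on the SSD
raw_read_gbps = 5.5                          # PS5 raw drive speed

# effective rate at which *uncompressed* data arrives in RAM
effective_gbps = raw_read_gbps * original_mb / compressed_mb
print(effective_gbps)                        # 8.0 GB/s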

So that's the problem when reading Cerny's and MS's descriptions of the SSD bandwidth. It's capped by three things: the raw bandwidth itself (2.4GB/s or 5.5GB/s), how small the compression method used by the developer made the data, and how fast the decompressor block can process that data.

Now we are getting into reading into the wording and how things are phrased by both parties, which I'm not a big fan of. Considering BC can hit a 1:4 ratio in rare cases, if 6GB/s+ is the highest speed the decompressor can hit (stage 3), it means the XSX is limited by its decompressor and, at the same time, the PS5 decompressor is 4x faster than the XSX decompressor, which sounds pretty ludicrous to me. If the XSX decompressor never goes over 6GB/s, I'm having a really hard time believing MS will ever get even close to a 4.8GB/s average. That's why, IMO, Cerny talked about the whole system throughput (steps 2 through 3) while MS talked about their decompressor's average ability (which should always be higher than the current throughput, or else it will bound it).

But as I've said, we are getting into wording and how-things-are-phrased territory, so I can actually think of another interpretation for the "over 6GB/s" comment. 4.8GB/s is the average throughput once we take into account both the Zlib decompressor and the BCPack decompressor. So the BCPack throughput is obviously higher than 4.8GB/s, because Zlib is lower than 50% average compression; in other words, BCPack is the higher figure, Zlib is the lower figure, and 4.8GB/s is the weighted average between them. So the BCPack compressed data bandwidth is "over 6GB/s", the Zlib compressed data is lower, and the weighted average is 4.8GB/s.

But I guess there could be multiple interpretations of what both Sony and MS said (I just brought up two in this post), so who knows?
 
No, I'm not saying they should cancel their launch plans.

The current worldwide situation has obviously had an impact on their messaging regarding next gen. You already heard yesterday from a reliable insider that announcements have been moved out of E3 week to later in the year and some have been moved up to before E3.

I don't disagree, but with Sony it seems like their plans haven't changed except for the Road to PS5 changing from a GDC presentation to a pre-recorded stream. They didn't even modify it to include any consumer focused elements (e.g. game trailers/demos). Everything else so far seems to be crumbs every few weeks/months over the last year.

I doubt we'll ever find out whether Sony had planned a reveal for March or April (if they ever did). I just wonder why they have kept things so close to their chest for so long (even allowing for the current COVID-19 situation).

Microsoft on the other hand seem to be executing a combination of what worked with One X and their usual pre-E3 May blowout. All they will need to change is how they present their E3 stuff. And the current situation doesn't appear to have altered their plans much at all.

I think we can all agree about wanting to see proper first-party next-gen only games now.
 
MS actually has the advantage in I/O regarding the BVH, because the BVH is built in real time and updated every few frames, so it has nothing to do with the SSD; building the BVH structure is pure GPU and memory work. That's actually part of the advantage: you keep the BVH in the 560GB/s 10GB pool, and MS has the advantage. I actually don't see a single advantage for the PS5 in RT right now (at least until we know more specifics; I don't know what customization either did or what changes RDNA2 brought over RDNA1) except for the faster caches. Regarding computational load, RT is actually a very light computation done billions of times per second; it's basically "death by a thousand cuts" for the GPU.

Regarding Minecraft, in case you haven't seen it on PC, it's the heaviest RT app around right now and it's a fully path traced game, not just one effect. Just to compare, Metro Exodus does GI using a single bounce while Minecraft does 8 bounces, and on top of that a million other RT things, because the game is 100% ray traced. So yeah, Minecraft isn't exactly a looker, but the RT work it requires is heavier than any AAA game's implementation of a single RT effect.

It seems from your post that you expect the PS5 to be worlds apart from the XSX regarding RT, even though the XSX has almost every advantage in the book here (shader compute, cache size, memory bandwidth, and intersection engines). So you should ready yourself for a disappointment: not only will the PS5 not have a huge advantage over the XSX in RT, but IMO, from what we know right now, the XSX probably has the advantage. And just to be clear, I'm talking about something like 15%, which will be insignificant anyway because IMO 3rd party games will probably just run at a slightly lower resolution on PS5.
...
Sorry if I gave off the wrong impression. I think Minecraft RT is an excellent use of path tracing to make the bland look sublime, and at 1080p that type of better pixels matters more to me than higher resolution. I was merely pointing out that the game missed its own performance target, which tells us something. I've said much earlier in this thread that full RT (in this context, for games) probably means around 10 rays per pixel, which would place the Minecraft demo below 1 Giga Ray/s, and that's less than the billions of rays/s that Mark Cerny's words implied the PS5 can do with full RT.
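For what it's worth, that "below 1 Giga Ray/s" figure is easy to reproduce from those assumptions; the frame rate here is my own guess, since the demo reportedly missed its 60fps target:

```python
def rays_per_second(width, height, rays_per_pixel, fps):
    # Total rays cast per second for a full-screen effect at a given frame rate.
    return width * height * rays_per_pixel * fps

# 1080p, ~10 rays per pixel, at an assumed ~45fps (i.e. short of the 60fps target)
print(rays_per_second(1920, 1080, 10, 45) / 1e9)  # ~0.93 Giga Rays/s
# At a locked 60fps it would be ~1.24 Giga Rays/s.
```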

As for the memory setup on the XsX, I've looked at it repeatedly, and from the information they've provided so far, coming from a data comms background, in real-world use it won't match the consistent performance of the PS5's unified setup. Semantically it isn't even a true HSA design IMHO. AFAIK acceleration structures use a combination of pre-calculated and run-time-built data, and for the real-time-built part the GPU cache speeds should play a role, assuming they are generated using GPGPU.

Again, sorry if I gave the impression with my "PS5 GPU has the edge on RT (IMHO)" comment that I meant the 'edge' would be big – I don't think that. I think that in spite of the 12TF vs 10TF talk and the blind assumption that the XsX has bigger numbers, so must be more powerful, the PS5 will be the more performant hardware overall. But I won't be disappointed either way, because the first party studios rarely fail to impress each generation, regardless of the techniques they use.
 
I'd just like to add that if the CPU is downclocked to prioritize the GPU, that isn't SmartShift but performance profiles at play.
SmartShift won't downclock the CPU, because it only hands over unused CPU power.

Also, I would advise against using CPU utilization metrics from current gen games, since they were designed to run on Jaguar CPUs.
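To illustrate what I mean about SmartShift only handing over unused power (as opposed to a profile that forcibly downclocks the CPU), here's a toy model; the wattages are invented and this is not how Sony actually exposes it:

```python
def smartshift_gpu_power(total_budget_w, cpu_draw_w, cpu_reserved_w, gpu_base_w):
    # The GPU only picks up whatever part of the CPU's reserved power the CPU
    # isn't actually drawing; power the CPU still needs is never taken away.
    unused_cpu = max(0.0, cpu_reserved_w - cpu_draw_w)
    return min(gpu_base_w + unused_cpu, total_budget_w)

# Invented numbers: 180W total budget, CPU reserved 50W but only drawing 35W, GPU base 130W
print(smartshift_gpu_power(180.0, 35.0, 50.0, 130.0))  # -> 145.0 (GPU gains the spare 15W)
```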
I know all that, mate. I was just using your comment as a springboard to further elaborate on what was talked about in the Road to PS5 video. Furthermore, I doubt CPU and GPU utilization will see a drastic jump from current gen. Sure, lots of clever stuff like VRS and mesh shaders will help utilize the GPU more effectively.
How is this "busy" metric translated to the percentages shown on screen? To put it simply, what's the difference between 50% and 98% busy?
I think what you are looking for is something like NVIDIA's SMI, which exposes a bunch of numbers and figures that these tools hook into to generate a percentage from some of them. The only way to see truly detailed CPU/GPU activity would be to use a profiler, like game developers do when optimizing games.
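As an example, on an NVIDIA PC you can pull the same "busy" percentages those overlay tools display straight from nvidia-smi; the little wrapper below is just a sketch:

```python
import subprocess

def gpu_utilization():
    # Ask nvidia-smi for GPU-core and memory-controller "busy" percentages.
    out = subprocess.check_output(
        ["nvidia-smi",
         "--query-gpu=utilization.gpu,utilization.memory",
         "--format=csv,noheader,nounits"],
        text=True,
    )
    first_gpu = out.strip().splitlines()[0]  # one line per GPU
    gpu_pct, mem_pct = (int(v) for v in first_gpu.split(","))
    return gpu_pct, mem_pct

print(gpu_utilization())  # e.g. (98, 54) -- "98% busy" still says nothing about wasted work
```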
 
Minecraft is throwing an insane number of rays per pixel. First, think of its GI, which does 8 bounces per pixel; that's 8x more rays than Metro Exodus's GI solution, which uses 1 bounce. Now add to that the reflections, which bounce 8 times on smooth surfaces and 2 times on rough ones per pixel; that's well over 2x BFV's reflections and, depending on the scene, can be well over 8x the rays of BFV. So we are talking about an order of magnitude more rays per pixel than Metro and BFV combined, and that's just GI and reflections; Minecraft RTX does EVERYTHING using ray tracing. That's the thing about Minecraft RTX: it uses more than an order of magnitude, even two orders of magnitude, more rays than the average PC RT effect. So yes, Minecraft's visuals are extremely simplistic, but that's exactly why they can do a fully path traced game that throws so many rays into the scene: because it's so simple looking. I know Minecraft doesn't sound like a GPU-heavy game to run, but believe it or not, Minecraft RTX is THE RT benchmark we have today, a monster.

BTW, if you look at Minecraft RTX performance on PC, it depends on the level and the number of chunks in your settings. A 2080 Ti, which is obviously more powerful than both consoles, can't keep a steady 60fps with 24 chunks at 1080p in the benchmarks I've seen, and even with 16 chunks it can't hit a 60fps average in most levels (especially if there's water involved). So I would say that, depending on the chunk setting on XSX, its Minecraft RTX performance falls somewhere between a 2080 SUPER and a 2080 Ti.
 
The first one is "over 6GB/s"; that's how fast the decompressor block can decompress data.
Nah.
The one thing you can take with plenty of certainty is that a custom ASIC, purpose-built for the I/O device it's paired with, will not perform 3x faster than said device can feed it data.
The quoted number is output, because that's all the end user cares about anyway; you're never going to "see" the input number. And we have at least 4 consoles' worth of precedent over the last 2 decades for similar decompressors and their specs: we always state speed in terms of data output.
These decompressors are built not to slow down the drive, so their rate of consuming data will closely match 2.4/5.5GB/s respectively; that's all they ever need.

XSX has 4.8GB/s typical bandwidth while the PS5 has 8-9GB/s typical bandwidth. Everything else is just noise.
Yes, although directly comparing those two numbers is basically noise as well. They haven't used the same methodology or input data set, so the relative difference could be, well... anywhere really.
 
I think the XsX memory setup is going to be a real problem for efficiently scheduling RT workloads compared to the PS5, combined with the inferior I/O for getting acceleration structures into RAM and the logic-heavy work the WGPs will do for RT (going by Mark's words); then factor in a faster command processor that will also help the shaders and get through redundant work faster. Mark also implied hundreds of millions of rays had a modest impact on the GPU in the title he said he saw (GI, shadows, and reflections with complex animation), with only full RT (path tracing) at billions being left on his slide. The XsX not managing Minecraft RT at a locked 1080p60 with very simple geometry gives an idea of how things might shape up on the XsX side.

Quite the amusing doomsday view. Until we see PS5 running that Minecraft demo, we really can't make assumptions off of that. I think there would be some advantages to casting more simultaneous rays, something that a wider approach leans into.
 
People don't think about SSDs changing game design.

Personally, I have not seen a game (on console) that can load the environment (in real time) like the example gif above.
Just imagine what you can do with the next Mass Effect (if the IP is not dead); it will be awesome to see.

The gif, in case someone doesn't know, is from Star Citizen, which is maybe the first AAA game optimized around an SSD, PC only.
 

Just imagine a game where you can be in a huge open-world town, and as you open a door and walk straight in, the game throws all of its rendering and memory prowess at the inside of the building.

Right now interiors are sorely lacking in detail. But games will now be able to give every open-world interior the same detail as something more confined and linear like The Order.

It's going to be incredible not seeing bland interiors.
 