
PlayStation 5 Pro Unboxed, 16.7 TFLOPs GPU Compute Confirmed

Lysandros

Member
That could also be an option.
The NPU in the Ryzen AI 300 series is an XDNA2 block. And it could be used on the Pro as well.
Interesting. How do we arrive at the "custom" part of the equation with this existing unit? Would it be customized/modified by Sony?
 

winjer

Gold Member
Interesting. How do we arrive at the "custom" part of the equation with this existing unit? Would it be customized/modified by Sony?

The custom part is always overplayed in console presentations.
Most of the tech these consoles use is just off the shelf AMD blocks.
Sometimes, it's tweaked to better suit what Sony or MS want, such as trimming the FP units on the Zen2 CPU.
 

Lysandros

Member
The custom part is always overplayed in console presentations.
Most of the tech these consoles use is just off the shelf AMD blocks.
Sometimes, it's tweaked to better suit what Sony or MS want, such as trimming the FP units on the Zen2 CPU.
I of course agree as far as the base architecture goes but don't you think there are some more significant customizations within the PS5 hardware besides the CPU?
 
The custom part is always overplayed in console presentations.
Most of the tech these consoles use is just off the shelf AMD blocks.
Sometimes, it's tweaked to better suit what Sony or MS want, such as trimming the FP units on the Zen2 CPU.
Disagree - Cerny went out of his way on more than one occasion to mention how their ML architecture was "custom"; even DF pointed it out.

Then again, most of the disagreements stem from our interpretation of statements made by people and ambiguous information. I guess we'll have to wait and see.
 

winjer

Gold Member
I of course agree as far as the base architecture goes but don't you think there are some more significant customizations within the PS5 hardware besides the CPU?

Of course there are tweaks to the GPU. But the underlying tech is still AMD's.
Just look at the PS5, the custom GPU is just RDNA2 minus some features.
And chances are that the custom GPU on the Pro is some sort of blocks from RDNA3 and RDNA4.

But it seems that some people here like to pretend that when Sony or MS say they are using a custom solution, that means some sort of solution made from the ground up.
It's not. It's just tweaks to already existing blocks.
 

AW_CL

Banned
Microsoft's mid-gen refresh vs Sony's mid-gen refresh
[comparison screenshots]
 

SonGoku

Member
If I'm not mistaken, part of the leak suggested there was an NPU block on the Pro. I can't say I care either way. Also, if the NPU block is small enough to be a non-issue, then I don't see why it can't be included on the Pro, because the theoretical TOPS of the Pro is higher than the 7900 XTX's, which only adds up with an NPU plus the GPU.
Look at Zen 5 NPUs and how much space they take for a mere 50 TOPS. What is the benefit of including an NPU when they can get more performance for less die space by just increasing the number of AI units?
NPUs aren't really meant for gaming either; their main benefit comes from specific apps where the NPU does most of the heavy lifting while the CPU/GPU remain at very low usage, increasing power efficiency dramatically. But in typical gaming workloads, where both the GPU and CPU are heavily used, it makes little sense to rely on an NPU from a power-efficiency perspective.

So I ask you: what is the benefit of the NPU? What can it provide to gaming that AI units can't do better?
That could also be an option.
The NPU in the Ryzen AI 300 series is an XDNA2 block. And it could be used on the Pro as well.

I really wish Cerny had made a more detailed presentation of the Pro, so we didn't have to be speculating so much about it.
But for what purpose would they include an NPU? What task could an NPU do in gaming better than AI units? They could just increase the number of AI units to achieve that hypothetical combined TOPS for less die space than it would take to include a separate NPU block.

Now, if NPUs were drastically better at gaming tasks I would understand the trade-off, but if not, it makes little sense to waste die space like that.
Sure it can. DLSS also has a GPU cost in ms despite tensor cores. Those dedicated components help but don't make the reconstruction free. The quote says "it frees up a lot of the GPU to render pure graphics", not all of it.
Does the singular "An actual custom silicon AI upscaler that performs the upscaling, anti-aliasing and frees up a lot of the GPU to render pure graphics" sound like "AI units within CUs" to you? It wouldn't be that difficult to word it that way if he were referring to them, don't you think? I am not saying the CUs themselves don't have ML capabilities via VOPD plus sparsity/lower precision, by the way, but neither of those features is "custom" to the PS5 Pro. I am only sticking to statements coming from actual Sony sources. They can design and integrate things like the hardware ID buffer, cache scrubbers, the Tempest Engine, and an entire I/O complex within the APU just fine regardless of die space implications, so why does a dedicated custom ML block become such an unfathomable sacrilege all of a sudden?
You are putting too much stock in semantics and taking the wording too literally.
For your first post: using DLSS does free up a lot of the GPU to render graphics despite incurring a cost of, say, 2ms; that still leaves most of the GPU, or "a lot of the GPU", to render pure graphics.
Also, we've already seen examples of Pro games using most of their budget to render at reconstructed 4K, evidenced by their use of PS5 performance settings. So the Pro is not exempt from this.
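To put rough numbers on that point (a minimal sketch in Python; the 2 ms reconstruction cost and the 60 fps target are just the hypothetical figures from this discussion, not measured values):

# Rough frame-budget arithmetic for a hypothetical ML upscaling pass.
# Assumptions (not measurements): 60 fps target, ~2 ms reconstruction cost.
TARGET_FPS = 60
FRAME_BUDGET_MS = 1000.0 / TARGET_FPS  # ~16.67 ms per frame
UPSCALE_COST_MS = 2.0                  # hypothetical reconstruction cost

remaining_ms = FRAME_BUDGET_MS - UPSCALE_COST_MS
share_left = remaining_ms / FRAME_BUDGET_MS
print(f"Frame budget: {FRAME_BUDGET_MS:.2f} ms")
print(f"Left for pure rendering: {remaining_ms:.2f} ms ({share_left:.0%})")
# -> ~14.67 ms, i.e. ~88% of the frame still available for rendering,
#    which matches "frees up a lot of the GPU", not "all of it".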

As for your second post, there can be any number of explanations, including broken English; these devs are not exactly English majors weighing the precise repercussions of their every word. They are there to provide a general idea of how the new hardware benefits their game.

That said, he could be referring to PSSR here as the "upscaler" that relies on custom silicon.

The bespoke elements you mentioned don't really apply to this comparison. For one, each of those components does a specific task in a more efficient way than could be achieved without it. Second, they don't take up that much die space; Tempest is merely one CU's worth of die area.

An NPU not only takes a ridiculous amount of die space for only 50 TOPS, but in gaming it has no advantage over simply increasing the number of AI units, which would achieve the same hypothetical combined TOPS a separate NPU block would provide while wasting less die area in the process.

I would even argue that having an NPU would make the Pro's performance inferior due to the impact on the die-space budget.
 

PaintTinJr

Member
The custom part is always overplayed in console presentations.
Most of the tech these consoles use is just off the shelf AMD blocks.
Sometimes, it's tweaked to better suit what Sony or MS want, such as trimming the FP units on the Zen2 CPU.
I think you aren't looking at the evidence hard enough to see the genesis of some AMD GPU blocks or strategies that were PlayStation customizations.

In the GPU space, new tech gets used within 12 to 18 months of prototyping to make it into consumer graphics card products, whereas PlayStation prototyping is closer to 36 months or more from a working APU prototype to a production consumer unit. That means R&D that arrives brand new at the same time as a consumer GPU had its genesis in PlayStation customization and wasn't technology available to other AMD customers on the same timeline.
 

winjer

Gold Member
I think you aren't looking at the evidence hard enough to see the genesis of some AMD GPU blocks or strategies that were PlayStation customizations.

In the GPU space, new tech gets used within 12 to 18 months of prototyping to make it into consumer graphics card products, whereas PlayStation prototyping is closer to 36 months or more from a working APU prototype to a production consumer unit. That means R&D that arrives brand new at the same time as a consumer GPU had its genesis in PlayStation customization and wasn't technology available to other AMD customers on the same timeline.

What tech was not available for PC GPUs?
For example, RDNA2 on PC was made to be compatible with DX12_2 spec.
The RDNA2 version on the PS5 has fewer capabilities than either the RDNA2 version on the PC or the Series S/X.
 

shamoomoo

Banned
Look at Zen 5 NPUs and how much space they take for a mere 50 TOPS. What is the benefit of including an NPU when they can get more performance for less die space by just increasing the number of AI units?
NPUs aren't really meant for gaming either; their main benefit comes from specific apps where the NPU does most of the heavy lifting while the CPU/GPU remain at very low usage, increasing power efficiency dramatically. But in typical gaming workloads, where both the GPU and CPU are heavily used, it makes little sense to rely on an NPU from a power-efficiency perspective.

So I ask you: what is the benefit of the NPU? What can it provide to gaming that AI units can't do better?

But for what purpose would they include an NPU? What task could an NPU do in gaming better than AI units? They could just increase the number of AI units to achieve that hypothetical combined TOPS for less die space than it would take to include a separate NPU block.

Now, if NPUs were drastically better at gaming tasks I would understand the trade-off, but if not, it makes little sense to waste die space like that.


You are putting too much stock in semantics and taking the wording too literally.
For your first post: using DLSS does free up a lot of the GPU to render graphics despite incurring a cost of, say, 2ms; that still leaves most of the GPU, or "a lot of the GPU", to render pure graphics.
Also, we've already seen examples of Pro games using most of their budget to render at reconstructed 4K, evidenced by their use of PS5 performance settings. So the Pro is not exempt from this.

As for your second post, there can be any number of explanations, including broken English; these devs are not exactly English majors weighing the precise repercussions of their every word. They are there to provide a general idea of how the new hardware benefits their game.

That said, he could be referring to PSSR here as the "upscaler" that relies on custom silicon.

The bespoke elements you mentioned don't really apply to this comparison. For one, each of those components does a specific task in a more efficient way than could be achieved without it. Second, they don't take up that much die space; Tempest is merely one CU's worth of die area.

An NPU not only takes a ridiculous amount of die space for only 50 TOPS, but in gaming it has no advantage over simply increasing the number of AI units, which would achieve the same hypothetical combined TOPS a separate NPU block would provide while wasting less die area in the process.

I would even argue that having an NPU would make the Pro's performance inferior due to the impact on the die-space budget.
I really haven't put too much thought into the whole NPU thing outside of leaked documents.

Let me preface this by saying I'm not a developer, nor do I know exactly how GPUs work, outside of them seeming like fancy calculators.


From what I understand, when doing calculations, those calculations use resources on the chip, and the inclusion of an NPU would be dedicated silicon for PSSR while the developer is using most of the GPU for the game. Now, I'm making wild assumptions. I'm basing these ideas off what has unfolded from the start of this generation with the Series X vs the PS5: none of Microsoft's first-party titles used the bigger GPU for novel stuff like the muscle tech in Spider-Man or the image quality in GOW Ragnarok.

I'm not saying MGS can't implement such features; it's just weird that Microsoft talked about AI elements implemented in the Series X and not much has come from it.
 

PaintTinJr

Member
What tech was not available for PC GPUs?
For example, RDNA2 on PC was made to be compatible with DX12_2 spec.
The RDNA2 version on the PS5 has fewer capabilities than either the RDNA2 version on the PC or the Series S/X.
Does it? Or is the feature set just prioritising new features for gaming?

In real terms, the design of the PS5 APU is at least 7.5 years old, probably 8 due to COVID, and yet RDNA1 is only 5 years old and RDNA2 only 3 years old, meaning the ability to do things like BVH lookups and texturing simultaneously has its genesis in PlayStation work that predates RDNA1 and RDNA2.

Then look at the doubling of fillrate by the ROPs from RDNA1 to RDNA2; that mirrors the way the PS5 is set up compared to the XSX, which has the older RDNA1 ROP-to-CU balance.

Then look at the Ragnarok technical paper on ML AI, look at the instructions being used and the efficiency, and ask yourself whether that looks more like RDNA3 ISA efficiency.

I believe the Series silicon was finalised much later than the PS5's, like the same gap between the PS4 Pro and X1X, giving Xbox another 24 months of AMD R&D maturity, and yet you can see that the Series X, despite being RDNA2, is inferior in things like running Nanite - kitbashing - and has no examples of ML AI in games despite the claim that it had ML AI hardware.

We know from recent Sony interview comments that the PS5 Pro was in development before the PS5 launched, and from GAF insider comments that finished consoles have been sitting in boxes for the last 1-2 years, meaning the silicon was finalised at least 3.5 years ago, if not longer, and it is running ML AI for PSSR. Yet FSR4 with ML AI is only now being announced for RDNA4, which isn't out, roughly 22 months after RDNA3, again suggesting that WMMA is actually a result of the partnership and not just PlayStation picking blocks of AMD IP.
 

PaintTinJr

Member
Look at Zen 5 NPUs and how much space they take for a mere 50 TOPS. What is the benefit of including an NPU when they can get more performance for less die space by just increasing the number of AI units?
NPUs aren't really meant for gaming either; their main benefit comes from specific apps where the NPU does most of the heavy lifting while the CPU/GPU remain at very low usage, increasing power efficiency dramatically. But in typical gaming workloads, where both the GPU and CPU are heavily used, it makes little sense to rely on an NPU from a power-efficiency perspective.

So I ask you: what is the benefit of the NPU? What can it provide to gaming that AI units can't do better?

But for what purpose would they include an NPU? What task could an NPU do in gaming better than AI units? They could just increase the number of AI units to achieve that hypothetical combined TOPS for less die space than it would take to include a separate NPU block.

Now, if NPUs were drastically better at gaming tasks I would understand the trade-off, but if not, it makes little sense to waste die space like that.


You are putting too much stock in semantics and taking the wording too literally.
For your first post: using DLSS does free up a lot of the GPU to render graphics despite incurring a cost of, say, 2ms; that still leaves most of the GPU, or "a lot of the GPU", to render pure graphics.
Also, we've already seen examples of Pro games using most of their budget to render at reconstructed 4K, evidenced by their use of PS5 performance settings. So the Pro is not exempt from this.

As for your second post, there can be any number of explanations, including broken English; these devs are not exactly English majors weighing the precise repercussions of their every word. They are there to provide a general idea of how the new hardware benefits their game.

That said, he could be referring to PSSR here as the "upscaler" that relies on custom silicon.

The bespoke elements you mentioned don't really apply to this comparison. For one, each of those components does a specific task in a more efficient way than could be achieved without it. Second, they don't take up that much die space; Tempest is merely one CU's worth of die area.

An NPU not only takes a ridiculous amount of die space for only 50 TOPS, but in gaming it has no advantage over simply increasing the number of AI units, which would achieve the same hypothetical combined TOPS a separate NPU block would provide while wasting less die area in the process.

I would even argue that having an NPU would make the Pro's performance inferior due to the impact on the die-space budget.
The simple answer is that ASICs have their uses, and some of the advantages are probably lower latency, greater power efficiency, simplicity for the end programmer, and deterministic performance, to name just a few.

At the end of rendering each frame, PSSR needs to do its thing, and it is primarily a one-way data flow. So having it reload into general GPU compute every frame, or having it permanently occupy some of the CUs, might not be the best use of GPU bandwidth and cache bandwidth, versus having an NPU doing the heuristics work with motion vectors and previous frames, waiting on the native frame to enhance for output over HDMI.

Much like digital cameras can have ASICs for lowest-latency features like histogram changes, where the ASIC does the spectral analysis in minimal clock cycles compared to generalized GPU compute, it might even be advantageous to have the core PSSR work done on an NPU, with the higher-quality output for lower-frame-rate PSSR done as additional work on the CUs.
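To make the overlap argument concrete, here is a minimal Python sketch (all timings are invented placeholders, not PS5 Pro measurements) of why a reconstruction pass on a separate unit can be pipelined with the next frame's rendering instead of eating into the CUs' budget:

# Illustrative only: serialize the upscale on the same CUs vs. overlap it on a
# dedicated unit (NPU/ASIC) while the CUs start rendering the next frame.
render_ms = 14.0   # hypothetical time the CUs spend rendering one frame
upscale_ms = 2.0   # hypothetical cost of the reconstruction pass

serial_frame_ms = render_ms + upscale_ms         # upscale runs on the CUs afterwards
pipelined_frame_ms = max(render_ms, upscale_ms)  # steady state limited by the slower stage

print(f"Serial on CUs: {serial_frame_ms:.1f} ms/frame -> {1000 / serial_frame_ms:.0f} fps")
print(f"Pipelined:     {pipelined_frame_ms:.1f} ms/frame -> {1000 / pipelined_frame_ms:.0f} fps")
# The costs are the extra frame of latency in flight and the die area/bandwidth
# of the dedicated unit, which is exactly the counter-argument above.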
 

Embearded

Member
If PSSR uses 2ms of GPU time, it can't be a separate AI block, can it? Although this information about the 2ms comes from a rumor, I don't remember who from.
I believe "2ms of the pipeline" is a better way to put it.
In PCs where CPUs and GPUs can be discrete components and not necessarily APUs in the same chip, both units share the same deadline to prepare a frame.
 

winjer

Gold Member
Does it? Or is the feature set just prioritising new features for gaming?

In real terms, the design of the PS5 APU is at least 7.5 years old, probably 8 due to COVID, and yet RDNA1 is only 5 years old and RDNA2 only 3 years old, meaning the ability to do things like BVH lookups and texturing simultaneously has its genesis in PlayStation work that predates RDNA1 and RDNA2.

Then look at the doubling of fillrate by the ROPs from RDNA1 to RDNA2; that mirrors the way the PS5 is set up compared to the XSX, which has the older RDNA1 ROP-to-CU balance.

Then look at the Ragnarok technical paper on ML AI, look at the instructions being used and the efficiency, and ask yourself whether that looks more like RDNA3 ISA efficiency.

I believe the Series silicon was finalised much later than the PS5's, like the same gap between the PS4 Pro and X1X, giving Xbox another 24 months of AMD R&D maturity, and yet you can see that the Series X, despite being RDNA2, is inferior in things like running Nanite - kitbashing - and has no examples of ML AI in games despite the claim that it had ML AI hardware.

We know from recent Sony interview comments that the PS5 Pro was in development before the PS5 launched, and from GAF insider comments that finished consoles have been sitting in boxes for the last 1-2 years, meaning the silicon was finalised at least 3.5 years ago, if not longer, and it is running ML AI for PSSR. Yet FSR4 with ML AI is only now being announced for RDNA4, which isn't out, roughly 22 months after RDNA3, again suggesting that WMMA is actually a result of the partnership and not just PlayStation picking blocks of AMD IP.

Yes, it does. And there were proposals by both AMD and Nvidia for the new standard that would become DirectX 12_2, aka DirectX Ultimate.
AMD's proposal was RDNA1 and Nvidia's was Turing. But because Nvidia's proposal was so much more advanced, it was chosen as the standard.
Just consider that AMD's proposal with RDNA1 was little more than primitive shaders, while Nvidia's proposal with Turing included DirectX raytracing, Variable Rate Shading, Mesh Shaders and Sampler Feedback.
But ML acceleration capabilities were not part of the DX12_2 standard. And that is why AMD got away with just having DP4A, just like Nvidia had done in Pascal several years prior.
RDNA2 was AMD making a GPU that fit the DX12_2 standard. And because Microsoft was deep in the process of standardizing DX12_2, they got the full RDNA2 feature set.
Meanwhile the PS5 uses a mixture of blocks from RDNA1 and RDNA2.
Nanite was never part of any API standard. It's just rasterization done in compute, instead of using the fixed-function quad rasterizers in the GPU. And its performance is not a matter of any feature set, but of clock speeds affecting caches and shader execution time.

You pretend that when Sony starts development on a new console, it automatically means they have already developed all the tech at that point in time. That is not how it works.
And for some reason, you think that AMD has no development time.
Just consider that AMD's first attempt at a primitive shader setup was with Vega, released in 2017. And Vega was based off the MI25 from 2016, and AMD was developing that feature years before that.
So the primitive shaders that Sony is using on the PS5 were made by AMD years before any of these consoles released.

And WMMA is not an answer to Sony. It's an answer to Nvidia's Tensor Cores.
Remember that Nvidia already had products in the market with Tensor Cores and ML capabilities, since 2018.
In case you have not noticed yet, it's not AMD or Sony that are setting the pace for new graphical features. It's Nvidia.
Mesh Shaders, Variable Rate Shading, Sampler Feedback, Tensor operations, Ray-tracing, AI upscaling, Frame Generation, Matrix sparsity operations, Ray-tracing operations re-ordering, Tensor Flow, and so much more.
 

PaintTinJr

Member
Yes, it does. And there were proposals by both AMD and Nvidia for the new standard that would become DirectX 12_2, aka DirectX Ultimate.
You are weirdly comparing a 4-year-old software API that wasn't even stable by the time the consoles launched versus an 8-year-old APU hardware design.
AMD's proposal was RDNA1 and Nvidia's was Turing. But because Nvidia's proposal was so much more advanced, it was chosen as the standard.
Just consider that AMD's proposal with RDNA1 was little more than primitive shaders, while Nvidia's proposal with Turing included DirectX raytracing, Variable Rate Shading, Mesh Shaders and Sampler Feedback.
But ML acceleration capabilities were not part of the DX12_2 standard. And that is why AMD got away with just having DP4A, just like Nvidia had done in Pascal several years prior.
RDNA2 was AMD making a GPU that fit the DX12_2 standard. And because Microsoft was deep in the process of standardizing DX12_2, they got the full RDNA2 feature set.
Meanwhile the PS5 uses a mixture of blocks from RDNA1 and RDNA2.
Nanite was never part of any API standard. It's just rasterization done in compute, instead of using the fixed-function quad rasterizers in the GPU. And its performance is not a matter of any feature set, but of clock speeds affecting caches and shader execution time.
None of that means anything; Nvidia effectively controls DirectX as the primary author that rewrote what is modern-day DirectX for Microsoft with the OG Xbox. Primitive shaders are actually more versatile than mesh shaders, which are just a software wrapper over that hardware feature. Same with RT: the RT in the PS5 is more versatile than the RT in DirectX, which has needed multiple revisions for efficiency because the software API as it was didn't traverse efficiently.
The software wrapper providing Variable Rate Shading in DirectX was also way behind the software solution in CoD done on the PS5 geometry engine, which was again more versatile than a software wrapper.

PS5 uses a mix of features that transcend all of RDNA1-3 as shown by the ML AI solution in Ragnarok and the ability of the PS5 GPU to do RT and texturing simultaneously without blocking.

As for Nanite, your whole strawman of DirectX doesn't change the reality that the PS5 geometry engine is more versatile and runs nanite more efficiently than either RDNA2 or Turing despite it being twice the hardware age of either.

You pretend that when Sony starts development on a new console, it automatically means they have already developed all the tech at that point in time. That is not how it works.
No, I'm saying the PS5 prototype was ready 8 years ago; that's separate from the start of the R&D, which will have begun at the finalization of the PS4 Pro hardware that went into boxes for sale on shelves.
And for some reason, you think that AMD has no development time.
Just consider that AMD's first attempt at a primitive shader setup was with Vega, released in 2017. And Vega was based off the MI25 from 2016, and AMD was developing that feature years before that.
So the primitive shaders that Sony is using on the PS5 were made by AMD years before any of these consoles released.
Primitive shaders were already done a decade earlier on the PS2 with SotC, and done again on the SPUs.
And WMMA is not an answer to Sony. It's an answer to Nvidia's Tensor Cores.
Remember that Nvidia already had products in the market with Tensor Cores and ML capabilities, since 2018.
The tensor cores weren't even out when the PS5 APU prototype was finished.
 

clarky

Gold Member
Sony make great products but they really need to up their game on the packaging, it's fucking cheap as.

I know most people won't give a shit but I do like some nice premium packaging. I ripped the cheap-ass sleeve on the regular 5 myself.
 

You can check my comment history, where I said that it is impossible for the PS5 Pro to beat the RTX 4070 or give equal performance, with Sony itself claiming that the PS5 Pro is "up to 45% faster than PS5", whereas the RTX 4070 is about 80% faster than the PS5.

I was also mocked by some people saying that 16.7 TFLOPS would be converted into 2x just because it is console hardware. The fact is that even the PS5 Pro would not be able to fully utilize its raw performance.

Today's graphics engines are much more flexible and less optimized.

Glad that I was right and DF was wrong.
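For what it's worth, the ratios quoted above can be sanity-checked directly (a quick Python sketch; the 10.28 TFLOPS base PS5 figure is the commonly cited spec and an assumption here, and the 45%/80% claims are taken at face value from this post):

# Sanity check of the relative-performance claims quoted above.
ps5_tflops = 10.28  # commonly cited base PS5 figure (assumption, not from this thread)
pro_tflops = 16.7   # figure from the thread title

pro_vs_ps5_compute = pro_tflops / ps5_tflops          # ~1.62x raw compute
pro_vs_ps5_claimed = 1.45                             # Sony's "up to 45% faster"
rtx4070_vs_ps5 = 1.80                                 # the "~80% faster" claim above
rtx4070_vs_pro = rtx4070_vs_ps5 / pro_vs_ps5_claimed  # ~1.24x

print(f"Pro vs PS5, raw TFLOPS: {pro_vs_ps5_compute:.2f}x")
print(f"Pro vs PS5, claimed:    {pro_vs_ps5_claimed:.2f}x")
print(f"4070 vs Pro (implied):  {rtx4070_vs_pro:.2f}x")
# Under these numbers a 4070 would still be roughly 20-25% ahead of the Pro.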
 

winjer

Gold Member
You are weirdly comparing a 4-year-old software API that wasn't even stable by the time the consoles launched versus an 8-year-old APU hardware design.

Are you trying to pretend that the PS5 SoC was already designed 8 years ago? That is complete nonsense.
And yes, it's DX12_2 that is setting the standard for modern GPUs, be they from AMD, Intel or Nvidia.

None of that means anything; Nvidia effectively controls DirectX as the primary author that rewrote what is modern-day DirectX for Microsoft with the OG Xbox.

Neither Nvidia nor AMD controls the DirectX standard.
It's controlled by Microsoft, and they always consult the main graphics vendors when defining the next version.
AMD/ATI, Nvidia, Intel and other companies have all contributed to defining DirectX features across several iterations.

Primitive shaders are actually more versatile than mesh shaders, which are just a software wrapper over that hardware feature. Same with RT: the RT in the PS5 is more versatile than the RT in DirectX, which has needed multiple revisions for efficiency because the software API as it was didn't traverse efficiently.
The software wrapper providing Variable Rate Shading in DirectX was also way behind the software solution in CoD done on the PS5 geometry engine, which was again more versatile than a software wrapper.

Mesh shaders replace more of the traditional geometry pipeline than primitive shaders, and give developers more control over how it works. This is why they were chosen as the standard for both DirectX and Vulkan.
Yes, the PS5's RT API is more efficient than the DXR API, but that does not change that RT was started by Nvidia and became part of the DirectX 12_2 standard.
It was not Sony, nor AMD, that started implementing ray-tracing in APIs, games or hardware.

The VRS standard implemented by Nvidia in Turing calls for a hardware solution, something Nvidia has had since Turing and AMD since RDNA2.
The PS5 does not have hardware VRS. Now, you can say all you want about how good the software VRS is in CoD, but that is not the standard used by all current GPUs, be they from AMD, Intel or Nvidia.
And once again, it came from Nvidia, not from AMD and not from Sony. And it's a feature missing on the PS5, but it's likely to be on the Pro.

PS5 uses a mix of features that transcend all of RDNA1-3 as shown by the ML AI solution in Ragnarok and the ability of the PS5 GPU to do RT and texturing simultaneously without blocking.

The AI solution in Ragnarok uses normal FP16 calculations, without any ML acceleration. It just runs as shader code.
It's good that Sony managed to get something like that from a software-based solution. But it's nothing comparable to DP4A, much less to WMMA or Tensor Cores.
The Pro is the first Sony console to have dedicated hardware to accelerate tensor operations. And this came as a response to Nvidia.
The reality is that Nvidia was the first to have an AI upscaler, and when everyone else noticed how good it was, that is when the rush to catch up started.
So now we have similar solutions from Intel, Sony, Qualcomm, AMD, Apple, etc.
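For readers unfamiliar with the distinction being drawn here: DP4A-style acceleration is a packed 8-bit dot product accumulated into a 32-bit integer, as opposed to running a network in plain FP16 shader math. A minimal Python emulation of that instruction's semantics (illustrative only; this is not Sony's or AMD's implementation, and real hardware does it in a single instruction):

def dp4a(a: list[int], b: list[int], acc: int) -> int:
    """acc += a[0]*b[0] + ... + a[3]*b[3], with int8 inputs and an int32 accumulator."""
    assert len(a) == len(b) == 4
    assert all(-128 <= x <= 127 for x in a + b), "inputs must fit in int8"
    return acc + sum(x * y for x, y in zip(a, b))

# One 4-wide slice of an int8 dot product, the building block of int8 inference.
acc = dp4a([12, -7, 33, 5], [4, 9, -2, 127], acc=0)
print(acc)  # -> 554

# A plain "shader code" FP16 path would do the same multiply-adds as ordinary
# half-precision floating-point math, without the packed-int8 throughput gain.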

As for Nanite, your whole strawman of DirectX doesn't change the reality that the PS5 geometry engine is more versatile and runs nanite more efficiently than either RDNA2 or Turing despite it being twice the hardware age of either.

Nanite supports both primitive and mesh shaders, and the default rendering method uses mesh shaders, as it delivers the best performance.
Only when a GPU does not support mesh shaders, such as AMD's RX 5000 series or the PS5, does Nanite fall back to primitive shaders.

No, I'm saying the PS5 prototype was ready 8 years ago; that's separate from the start of the R&D, which will have begun at the finalisation of the PS4 Pro

Primitive shaders were already done a decade earlier on the PS2 with SotC, and done again on the SPUs

Yes, older consoles used a geometry pipeline that resembles modern primitive shaders.
But modern primitive and mesh shaders are a form of compute for geometry.
The PS3 didn't even have support for programmable shaders.

The tensor cores weren't even out when the PS5 APU prototype was finished.

Nvidia started development of Tensor Cores right after they snatched the developers that were working at Intel, for the Larrabee project.
We are talking about 2009-2010 period, for Nvidia to start working on this feature. And it paid off, as they are now the market leader in AI, by a gigantic margin.

The PS5 APU was probably ready in early 2020. And the SoC for the Series S/X, a bit later.
Nvidia's first Tensor Core was released in 2017, with Volta.
 
You can check my comment history, where I said that it is impossible for the PS5 Pro to beat the RTX 4070 or give equal performance, with Sony itself claiming that the PS5 Pro is "up to 45% faster than PS5", whereas the RTX 4070 is about 80% faster than the PS5.

I was also mocked by some people saying that 16.7 TFLOPS would be converted into 2x just because it is console hardware. The fact is that even the PS5 Pro would not be able to fully utilize its raw performance.

Today's graphics engines are much more flexible and less optimized.

Glad that I was right and DF was wrong.
All of that, based on a tweet about a game that wasn't even optimized for the Pro, with the whole upgrade being automatic, lmao. But sure. You're the man and DF sucks.
 

shamoomoo

Banned
Are you trying to pretend that the PS5 SoC was already designed 8 years ago? That is complete nonsense.
And yes, it's DX12_2 that is setting the standard for modern GPUs, be they from AMD, Intel or Nvidia.



Neither Nvidia nor AMD controls the DirectX standard.
It's controlled by Microsoft, and they always consult the main graphics vendors when defining the next version.
AMD/ATI, Nvidia, Intel and other companies have all contributed to defining DirectX features across several iterations.



Mesh shaders replace more of the traditional geometry pipeline than primitive shaders, and give developers more control over how it works. This is why they were chosen as the standard for both DirectX and Vulkan.
Yes, the PS5's RT API is more efficient than the DXR API, but that does not change that RT was started by Nvidia and became part of the DirectX 12_2 standard.
It was not Sony, nor AMD, that started implementing ray-tracing in APIs, games or hardware.

The VRS standard implemented by Nvidia in Turing calls for a hardware solution, something Nvidia has had since Turing and AMD since RDNA2.
The PS5 does not have hardware VRS. Now, you can say all you want about how good the software VRS is in CoD, but that is not the standard used by all current GPUs, be they from AMD, Intel or Nvidia.
And once again, it came from Nvidia, not from AMD and not from Sony. And it's a feature missing on the PS5, but it's likely to be on the Pro.



The AI solution in Ragnarok uses normal FP16 calculations, without any ML acceleration. It just runs as shader code.
It's good that Sony managed to get something like that from a software-based solution. But it's nothing comparable to DP4A, much less to WMMA or Tensor Cores.
The Pro is the first Sony console to have dedicated hardware to accelerate tensor operations. And this came as a response to Nvidia.
The reality is that Nvidia was the first to have an AI upscaler, and when everyone else noticed how good it was, that is when the rush to catch up started.
So now we have similar solutions from Intel, Sony, Qualcomm, AMD, Apple, etc.



Nanite supports both primitive and mesh shaders, and the default rendering method uses mesh shaders, as it delivers the best performance.
Only when a GPU does not support mesh shaders, such as AMD's RX 5000 series or the PS5, does Nanite fall back to primitive shaders.



Yes, older consoles used a geometry pipeline that resembles modern primitive shaders.
But modern primitive and mesh shaders are a form of compute for geometry.
The PS3 didn't even have support for programmable shaders.



Nvidia started development of Tensor Cores right after they snatched the developers that were working at Intel, for the Larrabee project.
We are talking about 2009-2010 period, for Nvidia to start working on this feature. And it paid off, as they are now the market leader in AI, by a gigantic margin.

The PS5 APU was probably ready in early 2020. And the SoC for the Series S/X, a bit later.
Nvidia's first Tensor Core was released in 2017, with Volta.
The original pitch of DLSS is different from how it turned out; both Sony and Nvidia tried/have novel approaches to anti-aliasing.

Nvidia has to sell GPUs, and it makes no sense to pay for parts that aren't being used. RT was the "new hotness" for Turing, with DLSS being a surprise hit because the initial premise didn't pan out and any serious RT was years away, so Nvidia had to justify their new GPU prices with something tangible, and image quality seemed the easiest way to achieve that because Nvidia had already tried various AA methods on prior GPUs.

I personally don't see what DLSS has to do with what Sony is doing now, because most of these upscalers seem to function the same. I bring this up because Guerrilla Games has a new AA/upscaler because of a bigger GPU, so I'm inclined to believe that if Sony or Microsoft had more power, a portion of that power may have gone to IQ.
 

HeisenbergFX4

Gold Member
You can check my comment history, where I said that it is impossible for the PS5 Pro to beat the RTX 4070 or give equal performance, with Sony itself claiming that the PS5 Pro is "up to 45% faster than PS5", whereas the RTX 4070 is about 80% faster than the PS5.

I was also mocked by some people saying that 16.7 TFLOPS would be converted into 2x just because it is console hardware. The fact is that even the PS5 Pro would not be able to fully utilize its raw performance.

Today's graphics engines are much more flexible and less optimized.

Glad that I was right and DF was wrong.
I was one of those who have said for a very long time that the Pro was about equal to a 4070 in real-world performance (not brute strength).

I was only repeating what I had been told by someone working on Black Ops 6: the enhanced version on the Pro was very comparable to that of a 4070 PC.
 
I was one of those who have said for a very long time that the Pro was about equal to a 4070 in real-world performance (not brute strength).

I was only repeating what I had been told by someone working on Black Ops 6: the enhanced version on the Pro was very comparable to that of a 4070 PC.
For Call of Duty it's a very bad benchmark reference, because the RX 6700 XT beats the RTX 4070 in all Call of Duty games, including Black Ops 6, by 10%.
 

SonGoku

Member
The simple answer is that ASICs have their uses, and some of the advantages are probably lower latency, greater power efficiency, simplicity for the end programmer, and deterministic performance, to name just a few.

At the end of rendering each frame, PSSR needs to do its thing, and it is primarily a one-way data flow. So having it reload into general GPU compute every frame, or having it permanently occupy some of the CUs, might not be the best use of GPU bandwidth and cache bandwidth, versus having an NPU doing the heuristics work with motion vectors and previous frames, waiting on the native frame to enhance for output over HDMI.

Much like digital cameras can have ASICs for lowest-latency features like histogram changes, where the ASIC does the spectral analysis in minimal clock cycles compared to generalized GPU compute, it might even be advantageous to have the core PSSR work done on an NPU, with the higher-quality output for lower-frame-rate PSSR done as additional work on the CUs.
But how about doing it the Nvidia way, which is much more efficient for gaming, by having dedicated AI/tensor cores integrated into the GPU? Much like RT cores, they serve the purpose of an ASIC but in a much more area-efficient way.

NPUs just take way too much die space per TOP. Unless 1 NPU TOP is worth like 10 GPU TOPS, I don't see any point in including one in a chip made for gaming.
 

Gaiff

SBI’s Resident Gaslighter
For Call of Duty it's a very bad benchmark reference, because the RX 6700 XT beats the RTX 4070 in all Call of Duty games, including Black Ops 6, by 10%.
Pretty sure this is not true.
[1920x1080 relative performance chart]


NVIDIA does run poorly in this game, but not to the point where a 4070 gets beaten by a 6700. It still beats the 6700 XT by 21%, so it would still beat the regular one by 30% or more. Far from the usual 60%+, but still a sizable lead.

But looking at this chart, if the PS5 Pro performs like a 7700 XT, it'd be in line with a 4070, as HeisenbergFX4 said. Dunno how it will be for other games, but if COD was the example his people on the inside gave him, then he was correct, at least for that one.
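Chaining the chart's relative numbers makes the reasoning explicit (a quick Python sketch; the 21% lead is read off the chart above, while the ~10% gap between the 6700 XT and the plain 6700 is an assumed typical figure, not from this thread):

# Chain the relative-performance figures discussed above.
rtx4070_over_6700xt = 1.21  # from the 1080p chart
rx6700xt_over_6700 = 1.10   # assumed typical gap between the two cards

rtx4070_over_6700 = rtx4070_over_6700xt * rx6700xt_over_6700
print(f"4070 vs 6700: ~{(rtx4070_over_6700 - 1) * 100:.0f}% faster")
# -> ~33%, i.e. "30% or more": well short of the usual 60%+ lead, but nowhere
#    near being beaten by a 6700.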
 