[Digital Foundry] PS5 uncovered

It's amazing how better specs printed on paper, which leave no room for interpretation, still result in "disappointed Xbox fans"... because a bunch of marketers and nobodies on the internet praise a rival product. The Xbox Series X has better specs, end of. The "special sauce" insinuations are just cringy at this point. We'll see whether the price reflects the difference.

It was implied the CPU and GPU can't be at max clock at the same time. Yeah, that's a disappointment for them.
 
Dude, Oberon from the GitHub leak at 9.2 TF was the B0 stepping. PS5 is now at E0. It has changed, get over it. PS5 is 10.3 TF, and the variable clock is for when the game code doesn't need it, so the fan stays quiet and the system stays cool.

Again, Cerny said:
"There's enough power that both CPU and GPU can run at their limits of 3.5GHz and 2.23GHz, it isn't the case that the developer has to choose to run one of them slower."
Just think about it for a minute. Let's say this is true, and with the cursor of the AMD SmartShift shared power budget exactly in the middle (or somewhere else), the CPU runs at 3.5 GHz and the GPU cores at 2.23 GHz each.
What happens when the cursor shifts? Pushing it in one direction or the other would cause one chip to overclock and run higher than its max frequency, which it clearly can't do since it's capped. I know that's what Cerny seems to be saying, but it's self-contradicting. If there's a cursor position on that axis that allows both chips to run at max frequency within a power budget that is cooling-controlled, then there's no need to ever shift from that position. You shift because you want to get one side to the level of performance needed for a given task, and if you can do that without affecting the other, then why shift? To save electricity? Then why not allow the console to just draw what it needs at any point, no more, no less (aka the XSX design)?
When something requires this much explanation, it's generally because the mileage clock has been tinkered with (car salesman analogy).
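To make the budget argument concrete, here's a toy model of a shared power budget with hard clock caps. Every number and the cubic power-vs-clock curve are made-up assumptions for illustration, not Sony's actual algorithm; both chips start at their caps and the GPU only backs off when the combined draw would bust the budget.

```python
# Toy model of a shared CPU/GPU power budget with hard clock caps.
# All constants are invented for illustration; this is not Sony's algorithm.

CPU_MAX_GHZ = 3.5
GPU_MAX_GHZ = 2.23
POWER_BUDGET_W = 200.0          # hypothetical total SoC budget

def power_draw(clock_ghz, max_ghz, max_power_w, activity):
    """Rough model: power scales ~cubically with clock and linearly with
    how busy the workload keeps the silicon (activity in 0..1)."""
    return max_power_w * activity * (clock_ghz / max_ghz) ** 3

def pick_clocks(cpu_activity, gpu_activity,
                cpu_max_power_w=60.0, gpu_max_power_w=180.0):
    cpu, gpu = CPU_MAX_GHZ, GPU_MAX_GHZ
    # Start at the caps; only back off (here, the GPU) if the combined
    # draw exceeds the budget -- a simplification of the real behaviour.
    while power_draw(cpu, CPU_MAX_GHZ, cpu_max_power_w, cpu_activity) + \
          power_draw(gpu, GPU_MAX_GHZ, gpu_max_power_w, gpu_activity) > POWER_BUDGET_W:
        gpu -= 0.01   # shave the GPU clock in small steps
    return cpu, gpu

# Typical workload: both chips sit at their caps, nothing to "shift".
print(pick_clocks(cpu_activity=0.5, gpu_activity=0.7))   # -> (3.5, 2.23)
# Worst-case workload: the budget forces the GPU below 2.23 GHz.
print(pick_clocks(cpu_activity=1.0, gpu_activity=1.0))   # -> (3.5, ~2.05)
```

In this toy model, if the worst-case combined draw already fits the budget, the clocks never move, which is basically the crux of the argument above.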
 
As expected, more CUs means better performance, scaling with teraflops.
That means the Xbox Series X on paper is at minimum 18% faster than the PS5 at max boost; in reality, with CU scaling and variable PS5 clocks, the XSX will be around 25-30% faster on average during graphically demanding workloads.
[image: OjLZVb5.jpg]
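For what it's worth, the 18% gap falls straight out of the public CU counts and clocks; the 25-30% figure above is the poster's own extrapolation, not something this arithmetic shows. A quick sanity check:

```python
# FP32 throughput = CUs * 64 shaders per CU * 2 ops per clock (FMA) * clock.
def teraflops(cus, clock_ghz):
    return cus * 64 * 2 * clock_ghz / 1000.0

xsx = teraflops(52, 1.825)   # announced Series X specs
ps5 = teraflops(36, 2.23)    # announced PS5 specs at max boost clock
print(f"XSX: {xsx:.2f} TF, PS5: {ps5:.2f} TF, gap: {(xsx / ps5 - 1) * 100:.0f}%")
# -> XSX: 12.15 TF, PS5: 10.28 TF, gap: 18%
```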

There's no getting around a massive difference in shader counts.

However, we don't know how RDNA 2 scales with frequency vs shaders. Tbh, we don't even know how well RDNA 1 scales with CU counts, but in a lot of games an overclocked 5700 gets uncomfortably close to the 5700 XT (which is why AMD capped its frequency well below the XT's frequency cap). The same shit happened with Vega 56 vs Vega 64.

Ideally, you'd want to see linear scaling of performance with both CU count and frequency, but that simply will not be the case.
 
You either have not read the article (which the video does a serious disservice to, because it is packed with detail), or you are showing a serious lack of reading comprehension. Either way, please stop.

The profiles are there for debugging


It is simply there to allow developers to isolate the GPU or CPU and see their activity without one affecting the other. That is how profiling and optimization work. You look at GPU and CPU activity to see what is taking forever to process and causing an increase in frame time.
And that contradicts what I said... How?
 
Not at all. It is like saying a supercharged Camry will have more power and acceleration than one that isn't. Here are some supercharged shitbangers:


Stop with this bullshit; the car analogy does not work for compute power. It was used before, at the Xbox One launch, and well, that turned out just "great", right?
 
Just think about it for a minute. Let's say this is true, and with the cursor of the AMD SmartShift shared power budget exactly in the middle (or somewhere else), the CPU runs at 3.5 GHz and the GPU cores at 2.23 GHz each.
What happens when the cursor shifts? Pushing it in one direction or the other would cause one chip to overclock and run higher than its max frequency, which it clearly can't do since it's capped. I know that's what Cerny seems to be saying, but it's self-contradicting. If there's a cursor position on that axis that allows both chips to run at max frequency within a power budget that is cooling-controlled, then there's no need to ever shift from that position. You shift because you want to get one side to the level of performance needed for a given task, and if you can do that without affecting the other, then why shift? To save electricity? Then why not allow the console to just draw what it needs at any point, no more, no less (aka the XSX design)?
When something requires this much explanation, it's generally because the mileage clock has been tinkered with (car salesman analogy).
Because then the temperature becomes unknown and the fan noise and speed get high. Dude, he explained all of this and why he wants the PS5 to stay quiet at all times. So when the code doesn't need the GPU, the GPU drops a few % in frequency (and that's minor).
 
Except it does, because that's how any silicon chip works. Obviously they could do some massive cooler, sure.

Except it doesn't.

No, it's not thermally bound in the traditional sense, it's bound by the electrical draw. The system won't throttle based on heat.

 
Except it doesn't.



I don't know who that Matt is, but this would be a really dangerous path to go down. That power controller on the SoC seems like it throttles based on thermals; well, it could be done by power, because the more heat there is on the silicon, the more power it draws. That's basic physics; thermal sensors work this way.
 
Has there been confirmation that they won't support it? Or are people just assuming they won't because Cerny hasn't mentioned it?

If the architecture is capable of doing it, then all they'd have to do is support it in their APIs like Microsoft did with DirectX 12. If Sony really won't support it, then that would either mean that their programmers are too dumb to do it, or they're refusing to do it just for the hell of it. Neither of those seem particularly likely to me.

VRS is natively supported in DX12, Vulkan, and OpenGL, and with XBSX and PS5 sharing the same architecture, there's no reason why the process won't be implemented in PS5 games. It's the same tactic MS used in 2013 with DX12 and tiled resources. PS5 will support VRS, but you will never hear them call it VRS.

The 2018 MS VRS patent even references a patent by Cerny:

MS Patent: https://patents.google.com/patent/US20180047203A1/en
Cerny Patent: https://patents.google.com/patent/US10438312B2
 
I don't know who that Matt is, but this would be a really dangerous path to go down. That power controller on the SoC seems like it throttles based on thermals; well, it could be done by power, because the more heat there is on the silicon, the more power it draws. That's basic physics; thermal sensors work this way.

Matt was active here, IIRC. He's working at some big third-party company. Also, he guessed that the XSX would be more powerful in the end. Well, he was right on that.
But Cerny is a system architect. I'm pretty sure he knew what he was doing designing the PS5, in addition to Sony's patented cooling solution. Whatever it will be in the end, as Cerny stated at GDC, we will see it soon in a teardown.
 
A supercharged car is going to have more HP/kW and is thus going to be faster with more power. There are a lot of things which impact the performance of a car, like aerodynamics, etc., which do not exist in the compute world.

But Cerny is a system architect. I'm pretty sure he knew what he was doing designing the PS5, in addition to Sony's patented cooling solution. Whatever it will be in the end, as Cerny stated at GDC, we will see it soon in a teardown.
Sure, I am just saying that in this case there is going to be a need for a beefy cooler :)
 
A supercharged car is going to have more HP/kW and is thus going to be faster with more power. There are a lot of things which impact the performance of a car, like aerodynamics, etc., which do not exist in the compute world.


Sure, I am just saying that in this case there is going to be a need for a beefy cooler :)

Just edited the post about who Matt is
 
None of this shit makes any sense to me and most people. Way too fucking technical. Just show me the games. It's time.
 
OK where is the new info? So much promise without delivery, reminds me of a Phil Spencer press briefing......

From where I'm standing, Leadbetter had more questions than answers, I'm not sure he even had an interview with Mark Cerny......He spent most of the video asking questions on the boost clock, something many of us had already figured out within minutes of watching Cerny's GDC presentation...….

Instead, he spent most of his time comparing the RDNA 1 architecture to the RDNA 2 architecture in the PS5, where the latter is on an improved node, has transitioned fully to pure gaming, and has improved efficiency and performance per watt by over 50% over RDNA 1......Why would you compare CU and clock differences on RDNA 1 arch? It gives no indication of the perf and efficiency increase....

Leadbetter went on forever trying to impress on us how a very taxing game will reduce clocks; he goes on and on about how he needs to see the games before making a judgement, he has seen no games on PS5 and needs to, but what he saw on Series X was just a video of Hellblade at 60% of 4K at 24fps......At least we've seen Godfall and Quantum Error......Oh I forgot, the first game seen on Series X is Orphan of the Machine from Dynamic Voltage Games, so that's one.....

Yet he barely touched sound and basically said nothing on the SSD, which is what everyone is talking about, so what exactly did he ask Cerny then? Where is the talk of the geometry engine and cache scrubbers, how about getting more info on that? Richard was more interested in telling us that he is not sure if PS5 will have VRS, and the whole "is it fully RDNA 2.0" shebang, a skip-rope game that we've seen before on these forums, as guys like Dynamite Cop always ask in such threads. Uhmmm, Sony never outright said it, so it's not there, so they are casting doubt in the name of "I want to know, spell it out for me".....Same with their skepticism on hardware raytracing before, and the claim that PS5 would not be more than 8-9TF and would be on RDNA 1, etc.....

We know that Xbox has BC baked in, so if you cared to know more about PS5 BC, you are interviewing Cerny, why didn't you ask him instead of telling us what we already know? And we know Shadow Fall is still impressive; it's still the most technically and graphically impressive FPS released this whole gen....I went into this video thinking there was something, but Richard seems to have done more pointless comparisons, had more questions and tried to cast more doubt than the very clear presentation that Cerny already gave......The next time you get a breakdown or further details on PS5 is probably the next time we see Cerny or when there is another Wired article.....Should have known this....
 
Have you dumb cunts who are parroting "Several developers speaking to Digital Foundry have stated that their current PS5 work sees them throttling back the CPU in order to ensure a sustained 2.23GHz clock on the graphics core" missed the above bit in bold?

Obviously not, more cherry-picking to push something which has been explained to death many times.

It's a serious reading comprehension issue. Looks like shutting down the schools for the pandemic is having an effect on some of the posters here.
 
Because what you said was BS, based on the points people are writing. Sorry, it was obvious you didn't read the article.
You cannot beat physics or reality. And there is nothing to indicate that it's for debugging or isolation in the article. It is simply stated that developers like to have static clocks rather than variable ones. There is nothing to indicate what I said was wrong.
 
DF redemption! Last week they were accused of being Microsoft shills now everything is all well again I suppose. I guess it depends if you're in favor of their analysis or not.
 
The PS4 allows power consumption to vary and controls the temperature using the fans. The PS5 holds the power level constant, and allows the temperature to vary without changing the fan speed (in theory). When the workload would push the power consumption over the limit, the system reduces the clock speeds to compensate.
But in this case, this is something that developers have to pay attention to, isn't it? That's a huge task if I understand it correctly. Just an example: let's say they have to test a certain scene where everything is blowing up, there are lots of characters, etc. They achieve their target fps with a certain number of characters, but if they added more, the framerate would be worse. And they have to test everything like this. They always have to keep the "budget" in mind, which is quite difficult in a dynamic open-world game. They can't let the system just take care of this problem, because that could mean a certain number of characters is too much for the system, it downclocks itself and the framerate tanks. I hope it's clear what I mean; I'm not a developer, so it is easily possible that I'm talking nonsense. I just want to understand this whole thing.
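A back-of-the-envelope version of the budgeting you're describing, against a 16.67 ms (60 fps) frame budget, with completely made-up per-character costs just to show the shape of the math:

```python
# Hypothetical frame-time budget check for a crowded scene.
# All costs are invented; real numbers come from profiling the actual game.
FRAME_BUDGET_MS = 1000.0 / 60.0       # ~16.67 ms for 60 fps

BASE_SCENE_MS = 11.0                  # world, lighting, post-processing
PER_CHARACTER_MS = 0.35               # animation + draw cost per character

def max_characters():
    headroom = FRAME_BUDGET_MS - BASE_SCENE_MS
    return int(headroom // PER_CHARACTER_MS)

print(max_characters())               # -> 16 characters before the frame blows its budget
```

The wrinkle with variable clocks is that BASE_SCENE_MS and PER_CHARACTER_MS themselves move if the clocks drop, which is exactly the uncertainty you're asking about.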
 
Because then the temperature becomes unknown and the fan noise and speed get high. Dude, he explained all of this and why he wants the PS5 to stay quiet at all times. So when the code doesn't need the GPU, the GPU drops a few % in frequency (and that's minor).
I seriously doubt that this unpredictable performance approach is just a way of keeping our consoles quiet. Sony have just sold 110 million of some of the loudest consoles ever made. No developer would request this approach in exchange for a quieter device. Developers would have said "It's not my job to worry about the cooling, it's the manufacturer's". Microsoft heard this and put in a cooling solution that may or may not be up to par, but they answered the call and took on the challenge. Sony has handled this by compromising the performance available to developers (DF says some are throttling the CPU because it's unpredictable) and by misleading the consumer with a 10.3 TFLOPS number that is not the sustained performance.

Sony cares about the experience that developers have on their machine. It's what gave the PS4 that insurmountable start. Hence the widespread theory that they heard about what MS achieved and quickly used whatever they could to be able to claim a double digit Tflop number.
 
Timestamped quote by DF....

"More than one developer has told us that they are running the CPU throttled back, allowing for excess power to pour into the GPU to ensure a consistently locked 2.23 GHz."



You're missing the point.
Devkits only have profiles the devs must choose, hence the quote.
Retail units are not bound to those profiles since they have SmartShift, and devkits don't.
 
Made up what?

He didn't confirm it in the presentation or in the DF interview with him.

The absence of VRS is odd.

Man, Sony needs to start giving some money to these PS fanboys; they're working day and night for their beloved plastic.

Yet here you are on a Sony thread, for a console you will not own, spreading FUD. Talk about fanboys working 24/7
 
DF redemption! Last week they were accused of being Microsoft shills now everything is all well again I suppose. I guess it depends if you're in favor of their analysis or not.
This is definitely excellent work by Richard. It's disingenuous to say all of DF is bad; we all know which one of them has been downplaying the PS5 since the first Wired article.

That said, this doesn't clear up anything and raises two questions for every answer Cerny gives. Richard's dev sources seem to be saying the complete opposite of what Cerny is claiming, and it's really hard to figure out who to trust: Richard and his sources, or Cerny. I think the video is confusing on purpose. Maybe because Richard couldn't get a straight answer, or maybe Cerny is being confusing on purpose.

At the end of the day, he needs to show the games, and we need to see comparisons. Specs are hard data and should not have resulted in this much FUD, but here we are. I think it's time for Cerny to show games.
 
A quote from the article:

"Several developers speaking to Digital Foundry have stated that their current PS5 work sees them throttling back the CPU in order to ensure a sustained 2.23GHz clock on the graphics core. It makes perfect sense as most game engines right now are architected with the low performance Jaguar in mind - even a doubling of throughput (ie 60fps vs 30fps) would hardly tax PS5's Zen 2 cores. However, this doesn't sound like a boost solution, but rather performance profiles similar to what we've seen on Nintendo Switch. "Regarding locked profiles, we support those on our dev kits, it can be helpful not to have variable clocks when optimising. Released PS5 games always get boosted frequencies so that they can take advantage of the additional power," explains Cerny."

throttling back the CPU to ENSURE GPU sustained clock.
I have to say that reading comprehension really shows the level of stupidity of some folks here.
 
You're missing the point.
Devkits only have profiles the devs must choose, hence the quote.
Retail units are not bound to those profiles since they have SmartShift, and devkits don't.
No. YOU are missing the point.

Ugh. Let me spell this out since people are apparently incapable of understanding.

FACT 1: Several developers speaking to Digital Foundry have stated that their current PS5 work sees them throttling back the CPU in order to ensure a sustained 2.23GHz clock on the graphics core.
Fact 2: The max clock of the GPU is 2.23 GHz.

If you are developing your game on a profile that ALREADY has the GPU running at 2.23 GHz constantly, you can talk about boost all you want, the GPU will not be outputting more performance. The CPU still can. That is what I said, but somehow people are telling me I'm wrong. To say that I am wrong, you are saying that the GPU can clock beyond 2.23 GHz, which it will not.

The question is how much performance the additional CPU boost will give. Most likely = not much.
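One hedged way to picture why the extra CPU headroom may not buy much: if CPU and GPU work on a frame largely in parallel, the frame takes roughly as long as the slower of the two. That's a simplification that ignores sync and overlap details, and the numbers below are invented:

```python
# Simplified: CPU and GPU work largely in parallel, so the frame takes
# roughly as long as the slower of the two. Numbers are illustrative only.
def frame_ms(cpu_ms, gpu_ms):
    return max(cpu_ms, gpu_ms)

# GPU-bound frame: the GPU is already pegged at 2.23 GHz.
print(frame_ms(cpu_ms=9.0, gpu_ms=15.5))            # -> 15.5 ms

# Boost the CPU 10%: CPU work finishes sooner, the frame does not.
print(frame_ms(cpu_ms=9.0 / 1.10, gpu_ms=15.5))     # -> still 15.5 ms
```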
 
Cerny probably had to deal with the $400 top budget imposed by Sony, and that's why he tried to squeeze the most out of a weaker console with exotic stuff like this variable boost clock.
I don't see any reason not to use a 12 TF GPU with a fixed clock, if not for a money problem.
And that's fine, I have no problem with that. Budget constrains performance; that has always been the case in all feats of engineering. These consoles could be absolute beasts if they could sell them at USD 800, but at those prices they won't get many sales, especially with the huge downturn that world economies are set to face with this pandemic.

But for anyone to sit there and try to act as though, somehow, this thing is more powerful than the Xbox Series X is just ridiculous. It's not more powerful on paper, and so far, developers seem to be suggesting that it's not more powerful in practice either. I guess time will surely tell, but if I had to bet on it, I'd bet that we'll see consistently better performance on the Series X next gen.

Most of my posts in these topics have mostly been me trolling Sony fans, because it's funny. But I'm really not that invested in either. Yes, I game mostly on an Xbox, but Phil isn't my cousin, and when an Xbox sells, I don't get a percentage of the money. These businesses are businesses first, and outside of the fun and games of mindless console wars, none of these companies owe us anything, nor are they loyal to us. They're loyal to their pockets, bottom line and top line, period. It just so happens that a lot of the time, pleasing customers also satisfies those other areas.

That said; SUCK IT PONIES! I hope your PS5's overheat and burn down the houses you've built your miserable little lives in! *insert evil laugh here*
 
Several developers speaking to Digital Foundry have stated that their current PS5 work sees them throttling back the CPU in order to ensure a sustained 2.23GHz clock on the graphics core.

"Regarding locked profiles, we support those on our dev kits, it can be helpful not to have variable clocks when optimising. Released PS5 games always get boosted frequencies so that they can take advantage of the additional power," explains Cerny.

Cerny said that "running a GPU at 2GHz was looking like an unreachable target with the old fixed frequency strategy" and also "similarly running the CPU at 3GHz was causing headaches with the old strategy".

So, if there's a profile where the GPU is always at its max clock (2.23 GHz), then the CPU needs to run below 3 GHz.

Of course, I'm assuming that this profile is just like the one on the Switch (which Richard used as a comparison, saying it's similar), where the clock is fixed. If it is instead a profile fixing how much power the CPU and the GPU get, then it can (and probably will) be a different scenario.
 
And that contradicts what I said... How?
You kidding me mate? This is why people need to read the article, Rich just muddied the discussion for no reason with the video.
What you quoted does in no way change what was said about developers throttling back the CPU to ensure they always reach 2.23 GHz. Why? Because when the developers use the profiles, they will automatically be programming to load the CPU less and the GPU more. This means that the CPU might get a slight bump in performance compared to the profile when running the retail games on the retail console, but since developers were already maxing out the GPU, the GPU performance will stay the same. The additional performance from the slight CPU bump will not make much of a difference in practice. The main difference is that one is manually done by the developers to get the results they want, and the other is that the system does it automatically with SmartShift, with a slight boost to only CPU performance in this case, IF there is any leeway.
There is no manual throttling, it is simply there for profiling and optimization purposes. They are not throttling, they are restricting the GPU or CPU to isolate them to better understand the activity each frame. They do the same for both.

If a developer is targeting 60fps performance, the frame time has to be at 16.67ms; if they go above that, then the frame rate drops. In order to see the cause, they can lock the frequency of the GPU or CPU independently to see what is causing the spike in frame time. This essentially takes variable frequency and SmartShift out of the equation as possible causes of the performance issues. They essentially profile the game with variable frequency and with locked frequency to see any issues.

They are not programming the game to MAX the GPU or CPU as you put it. It is just a profiling tool for optimization.
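As a rough sketch of the workflow being described (hypothetical numbers; the real tooling is Sony's own profiler, not this): capture the same scene with variable clocks and with a locked profile, and see whether the same unit blows the 16.67 ms budget in both runs.

```python
# Hypothetical per-frame timings (ms) captured from two profiling runs
# of the same scene: one with variable clocks, one with a locked profile.
FRAME_BUDGET_MS = 1000.0 / 60.0   # 16.67 ms target

variable_run = {"cpu": 12.1, "gpu": 17.4}   # invented numbers
locked_run   = {"cpu": 12.0, "gpu": 17.3}   # invented numbers

def diagnose(run):
    offenders = [name for name, ms in run.items() if ms > FRAME_BUDGET_MS]
    return offenders or ["within budget"]

# If the same unit blows the budget in both runs, the clocking strategy
# isn't the culprit -- the workload on that unit is.
print(diagnose(variable_run), diagnose(locked_run))   # -> ['gpu'] ['gpu']
```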
 
A supercharged car is going to have more HP/kW and is thus going to be faster with more power. There are a lot of things which impact the performance of a car, like aerodynamics, etc., which do not exist in the compute world.


Sure, I am just saying that in this case there is going to be a need for a beefy cooler :)

I was looking at a quote and explaining what it was communicating. You can design a cheap, low-powered system that is, in fact, supercharged, whether it be a PC or a car.
 
Has anyone seen this shit floating around on Twitter? It is posted by a WindowsCentral guy, so... a salt mine's worth of salt. Also, I seriously hope it is not true. No one wants to see the console delayed. Also, does anyone know who the fuck the guy saying all that is?

 
No. YOU are missing the point.

Ugh. Let me spell this out since people are apparently incapable of understanding.

FACT 1: Several developers speaking to Digital Foundry have stated that their current PS5 work sees them throttling back the CPU in order to ensure a sustained 2.23GHz clock on the graphics core.
Fact 2: The max clock of the GPU is 2.23 GHz.

If you are developing your game on a profile that ALREADY has the GPU running at 2.23 GHz constantly, you can talk about boost all you want, the GPU will not be outputting more performance. The CPU still can. That is what I said, but somehow people are telling me I'm wrong. To say that I am wrong, you are saying that the GPU can clock beyond 2.23 GHz, which it will not.

The question is how much performance the additional CPU boost will give. Most likely = not much.
Yes, because the devkits don't have the SmartShift tech, so they choose between static set profiles.
Retail units have SmartShift, which is more dynamic and can change on the fly when needed.
You won't need a full-speed CPU and a full-speed GPU at the same time. Not for some years anyway.
Cerny talked about this in his presentation (".. when that game finally comes along...").
As he also said, the GPU could be clocked even higher, but they capped it.
 
There is no manual throttling, it is simply there for profiling and optimization purposes. They are not throttling, they are restricting the GPU or CPU to isolate them to better understand the activity each frame. They do the same for both.
My god... Tell me you're joking. Choosing a profile is done manually, is it not? SmartShift is done automatically, is it not?
A developer choosing a profile to max GPU automatically throttles the CPU, does it not?
 
Has anyone seen this shit floating around on Twitter? It is posted by a WindowsCentral guy, so... a salt mine's worth of salt. Also, I seriously hope it is not true. No one wants to see the console delayed. Also, does anyone know who the fuck the guy saying all that is?



It was posted in the Next Gen thread. It was spotted by the Windows Central guy on the same day Digital Foundry released a video about the PS5. Maybe this gives you an idea:
 
Cerny:

"There's enough power that both CPU and GPU can run at their limits of 3.5GHz and 2.23GHz, it isn't the case that the developer has to choose to run one of them slower."


Can we put the 9.2 TF BS to bed now?
Where are those posts that said developers had to choose? Lol, I remember reading some.
 
Just watched the video and this thing is fucking insane. The PS5 will have enough power budget to run both the CPU and GPU at peak most of the time, but still have enough left over to potentially shift more to the GPU and CPU depending on the task? No wonder developers are creaming their pants talking about PS5, this thing is a monster. 😳😳😳
 
Just watched the video and this thing is fucking insane. The PS5 will have enough power budget to run both the CPU and GPU at peak most of the time, but still have enough left over to potentially shift more to the GPU and CPU depending on the task? No wonder developers are creaming their pants talking about PS5, this thing is a monster. 😳😳😳
PR talk works well on you, doesn't it.

Yes, because the devkits don't have the SmartShift tech, so they choose between static set profiles.
Retail units have SmartShift, which is more dynamic and can change on the fly when needed.
You won't need a full-speed CPU and a full-speed GPU at the same time. Not for some years anyway.
Cerny talked about this in his presentation (".. when that game finally comes along...").
As he also said, the GPU could be clocked even higher, but they capped it.
All this is true.
 
I was looking at a quote and explaining what it was communicating. You can design a cheap, low-powered system that is, in fact, supercharged, whether it be a PC or a car.
Supercharged with what, exactly? It could have fewer bottlenecks, sure, but supercharging does not really work; it's like that thing called "blast processing".
 
My god... Tell me you're joking. Choosing a profile is done manually, is it not? SmartShift is done automatically, is it not?
A developer choosing a profile to max GPU automatically throttles the CPU, does it not?
SmartShift has nothing to do with frequency; it has everything to do with adjusting the power budget. They have a profiler built into the system that looks at GPU activity and adjusts frequency. SmartShift just redirects power.

No, a developer selects a profile that raises the frequency to the max for either the CPU or GPU to monitor what is going on each frame. They are not programming for a specific performance profile using those profiles, which is my sole contention with your post.
 
It was posted in the Next Gen thread. It was spotted by the Windows Central guy on the same day Digital Foundry released a video about the PS5. Maybe this gives you an idea:


Ah, good find. I didn't see Jason reply. And the guy has made his account private now. I am sure people will read into that as well, haha. It is interesting that Windows Central "claim" they have been hearing the same rumblings before this but have not said anything. It all certainly smells of console war bullshit, but even pretending it is true, I seriously hope the PS5 isn't getting delayed for any reason, let alone this one. And IF they are worried about a changed form factor looking as if they are copying Microsoft, I seriously do not give a shit. I think the Xbox Series X is the coolest new look for a console in a very, very long time. If I have two boxes that look similar, that would be fine by me. But again, I am not assuming any of this is even true.
 
Ah, good find. I didn't see Jason reply. And the guy has made his account private now. I am sure people will read into that as well, haha. It is interesting that Windows Central "claim" they have been hearing the same rumblings before this but have not said anything.

He not only set his account to private, but also deleted the tweets Jason spotted.
 
You kidding me mate? This is why people need to read the article, Rich just muddied the discussion for no reason with the video.

There is no manual throttling, it is simply there for profiling and optimization purposes. They are not throttling, they are restricting the GPU or CPU to isolate them to better understand the activity each frame. They do the same for both.

If a developer is targeting 60fps performance, the frame time has to be at 16.67ms; if they go above that, then the frame rate drops. In order to see the cause, they can lock the frequency of the GPU or CPU independently to see what is causing the spike in frame time. This essentially takes variable frequency and SmartShift out of the equation as possible causes of the performance issues. They essentially profile the game with variable frequency and with locked frequency to see any issues.

They are not programming the game to MAX the GPU or CPU as you put it. It is just a profiling tool for optimization.
Game developer to Digital Foundry: "I'm having to THROTTLE the CPU to stabilise the GPU."
GAF console warrior: "They're not throttling, they're isolating".

Sony couldn't pay some of you enough for all the spin and free damage control.
 
Supercharged with what, exactly? It could have fewer bottlenecks, sure, but supercharging does not really work; it's like that thing called "blast processing".

since you asked:

What Does 'Supercharged' Mean, Anyway?
The PlayStation 4's architecture looks very familiar, at first blush -- and it is. But Cerny maintains that his team's work on it extends it far beyond its basic capabilities.

For example, this is his take on its GPU: "It's ATI Radeon. Getting into specific numbers probably doesn't help clarify the situation much, except we took their most current technology, and performed a large number of modifications to it."

To understand the PS4, you have to take what you know about Cerny's vision for it (easy to use, but powerful in the long term) and marry that to what the company has chosen for its architecture (familiar, but cleverly modified.) That's what he means by "supercharged."

"The 'supercharged' part, a lot of that comes from the use of the single unified pool of high-speed memory," said Cerny. The PS4 packs 8GB of GDDR5 RAM that's easily and fully addressable by both the CPU and GPU.

If you look at a PC, said Cerny, "if it had 8 gigabytes of memory on it, the CPU or GPU could only share about 1 percent of that memory on any given frame. That's simply a limit imposed by the speed of the PCIe. So, yes, there is substantial benefit to having a unified architecture on PS4, and it's a very straightforward benefit that you get even on your first day of coding with the system. The growth in the system in later years will come more from having the enhanced PC GPU. And I guess that conversation gets into everything we did to enhance it."

The CPU and GPU are on a "very large single custom chip" created by AMD for Sony. "The eight Jaguar cores, the GPU and a large number of other units are all on the same die," said Cerny. The memory is not on the chip, however. Via a 256-bit bus, it communicates with the shared pool of ram at 176 GB per second.

"One thing we could have done is drop it down to 128-bit bus, which would drop the bandwidth to 88 gigabytes per second, and then have eDRAM on chip to bring the performance back up again," said Cerny. While that solution initially looked appealing to the team due to its ease of manufacturability, it was abandoned thanks to the complexity it would add for developers. "We did not want to create some kind of puzzle that the development community would have to solve in order to create their games. And so we stayed true to the philosophy of unified memory."

In fact, said Cerny, when he toured development studios asking what they wanted from the PlayStation 4, the "largest piece of feedback that we got is they wanted unified memory."

The three "major modifications" Sony did to the architecture to support this vision are as follows, in Cerny's words:

  • "First, we added another bus to the GPU that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches. As a result, if the data that's being passed back and forth between CPU and GPU is small, you don't have issues with synchronization between them anymore. And by small, I just mean small in next-gen terms. We can pass almost 20 gigabytes a second down that bus. That's not very small in today's terms -- it's larger than the PCIe on most PCs!
  • "Next, to support the case where you want to use the GPU L2 cache simultaneously for both graphics processing and asynchronous compute, we have added a bit in the tags of the cache lines, we call it the 'volatile' bit. You can then selectively mark all accesses by compute as 'volatile,' and when it's time for compute to read from system memory, it can invalidate, selectively, the lines it uses in the L2. When it comes time to write back the results, it can write back selectively the lines that it uses. This innovation allows compute to use the GPU L2 cache and perform the required operations without significantly impacting the graphics operations going on at the same time -- in other words, it radically reduces the overhead of running compute and graphics together on the GPU."
  • Thirdly, said Cerny, "The original AMD GCN architecture allowed for one source of graphics commands, and two sources of compute commands. For PS4, we've worked with AMD to increase the limit to 64 sources of compute commands -- the idea is if you have some asynchronous compute you want to perform, you put commands in one of these 64 queues, and then there are multiple levels of arbitration in the hardware to determine what runs, how it runs, and when it runs, alongside the graphics that's in the system."


Another thing the PlayStation 4 team did to increase the flexibility of the console is to put many of its basic functions on dedicated units on the board -- that way, you don't have to allocate resources to handling these things.

"The reason we use dedicated units is it means the overhead as far as games are concerned is very low," said Cerny. "It also establishes a baseline that we can use in our user experience."

"For example, by having the hardware dedicated unit for audio, that means we can support audio chat without the games needing to dedicate any significant resources to them. The same thing for compression and decompression of video." The audio unit also handles decompression of "a very large number" of MP3 streams for in-game audio, Cerny added.

At the New York City unveiling of the system, Cerny talked about PlayGo, the system by which the console will download digital titles even as they're being played.

"The concept is you download just a portion of the overall data and start your play session, and you continue your play session as the rest downloads in the background," he explained to Gamasutra.

However, PlayGo "is two separate linked systems," Cerny said. The other is to do with the Blu-ray drive -- to help with the fact that it is, essentially, a bit slow for next-gen games.

"So, what we do as the game accesses the Blu-ray disc, is we take any data that was accessed and we put it on the hard drive. And if then if there is idle time, we go ahead and copy the remaining data to the hard drive. And what that means is after an hour or two, the game is on the hard drive, and you have access, you have dramatically quicker loading... And you have the ability to do some truly high-speed streaming."

To further help the Blu-ray along, the system also has a unit to support zlib decompression -- so developers can confidently compress all of their game data and know the system will decode it on the fly. "As a minimum, our vision is that our games are zlib compressed on media," said Cerny.

There's also another custom chip to put the system in a low-power mode for background downloads. "To make it a more green hardware, which is very important for us, we have the ability to turn off the main power in the system and just have power to that secondary custom chip, system memory, and I/O -- hard drive, Ethernet. So that allows background downloads to happen in a very low power scenario. We also have the ability to shut off everything except power to the RAMs, which is how we leave your game session suspended."

Sounds Good, But... Bottlenecks?
One thing Cerny was not at all shy about discussing are the system's bottlenecks -- because, in his view, he and his engineers have done a great job of devising ways to work around them.

"With graphics, the first bottleneck you're likely to run into is memory bandwidth. Given that 10 or more textures per object will be standard in this generation, it's very easy to run into that bottleneck," he said. "Quite a few phases of rendering become memory bound, and beyond shifting to lower bit-per-texel textures, there's not a whole lot you can do. Our strategy has been simply to make sure that we were using GDDR5 for the system memory and therefore have a lot of bandwidth."

That's one down. "If you're not bottlenecked by memory, it's very possible -- if you have dense meshes in your objects -- to be bottlenecked on vertices. And you can try to ask your artists to use larger triangles, but as a practical matter, it's difficult to achieve that. It's quite common to be displaying graphics where much of what you see on the screen is triangles that are just a single pixel in size. In which case, yes, vertex bottlenecks can be large."

"There are a broad variety of techniques we've come up with to reduce the vertex bottlenecks, in some cases they are enhancements to the hardware," said Cerny. "The most interesting of those is that you can use compute as a frontend for your graphics."

This technique, he said, is "a mix of hardware, firmware inside of the GPU, and compiler technology. What happens is you take your vertex shader, and you compile it twice, once as a compute shader, once as a vertex shader. The compute shader does a triangle sieve -- it just does the position computations from the original vertex shader and sees if the triangle is backfaced, or the like. And it's generating, on the fly, a reduced set of triangles for the vertex shader to use. This compute shader and the vertex shader are very, very tightly linked inside of the hardware."

It's also not a hard solution to implement, Cerny suggested. "From a graphics programmer perspective, using this technique means setting some compiler flags and using a different mode of the graphics API. So this is the kind of thing where you can try it in an afternoon and see if it happens to bump up your performance."

These processes are "so tightly linked," said Cerny, that all that's required is "just a ring buffer for indices... it's the Goldilocks size. It's small enough to fit the cache, it's large enough that it won't stall out based on discrepancies between the speed of processing of the compute shaders and the vertex shaders."

He has also promised Gamasutra that the company is working on a version of its performance analysis tool, Razor, optimized for the PlayStation 4, as well as example code to be distributed to developers. Cerny would also like to distribute real-world code: "If somebody has written something interesting and is willing to post the source for it, to make it available to the other PlayStation developers, then that has the highest value."

https://www.gamasutra.com/view/feature/191007/inside_the_playstation_4_with_mark_.php?print=1
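Side note on the bandwidth figures in that piece: they are just bus width times the per-pin GDDR5 data rate, which for the PS4 works out to 5.5 Gbps (implied by 176 GB/s over a 256-bit bus). A quick check:

```python
# Memory bandwidth = (bus width in bits / 8) * per-pin data rate.
# 5.5 Gbps per pin is implied by the 176 GB/s figure over a 256-bit bus.
def bandwidth_gbs(bus_bits, data_rate_gbps):
    return bus_bits / 8 * data_rate_gbps

print(bandwidth_gbs(256, 5.5))   # -> 176.0 GB/s (PS4 as shipped)
print(bandwidth_gbs(128, 5.5))   # ->  88.0 GB/s (the rejected eDRAM option)
```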
 
SmartShift has nothing to do with frequency; it has everything to do with adjusting the power budget. They have a profiler built into the system that looks at GPU activity and adjusts frequency. SmartShift just redirects power.

No, a developer selects a profile that raises the frequency to the max for either the CPU or GPU to monitor what is going on each frame. They are not programming for a specific performance profile using those profiles, which is my sole contention with your post.

Are you not feeding more power to boost a frequency? I mean, we may be arguing semantics here, but if SmartShift is changing frequencies, it regulates power on some level, I would assume?
 