
Exploring The Complications Of Series X's Memory Configuration & Performance

How much are YOU willing to say recent Xbox multiplat perf is affected by the memory setup?

  • Very

    Votes: 18 9.8%
  • Mostly

    Votes: 32 17.5%
  • So/so

    Votes: 40 21.9%
  • Not really

    Votes: 41 22.4%
  • None

    Votes: 52 28.4%

  • Total voters
    183
  • Poll closed .
I didn't say anything about requirements; you're being defensive while creating conflict in your own post.

If you were referring to the thread outside of the OP, you wouldn't accuse me of thinking people need an engineering degree.

I brought up the strategy in the OP, and you keep proving it works. Oh well.
You are implying the OP has a strategy and an ill-willed agenda.

Can you prove any of that? Then please do. I would gladly listen and ignore him in the future.
If you can't prove any of your claims, then please leave this thread and stop accusing people of wrongful doings.
 
Last edited:

DenchDeckard

Moderated wildly
Popularization is informative, and you are arguing against the OP's motives instead of addressing what was said. If something is factually wrong or inexact, we would welcome your posts a lot more than just accusations.

If he does not know what he is saying, then either you do or you do not. If you do, please make a counter-argument. If you do not, then please at least take one of his arguments and explain why you think he said it in bad faith.
If you have a better idea about why there is a PS5 advantage in some third-party games like Hogwarts Legacy, Atomic Heart... please say so.


Part of what makes consoles great is seeing them "punching above their weight." The PS3, with games like TLOU, and the PS4 surprised us, and they continue to do so with games like Horizon Forbidden West and GOW Ragnarok. There's nothing wrong with talking about what allows those games to look so good graphically and what is in the way of Xbox doing the same.

Horizon looks good, but God of War Ragnarok does not look that great to be honest. The character models and animation on the companion characters are not that good at all. They don't look as good as the Gears of War Hivebusters characters IMO, or Plague Tale's, etc.

Horizon looks great though.
 
Don't worry MS, Connectix has your back!


 
Horizon looks good, but God of War Ragnarok does not look that great to be honest. The character models and animation on the companion characters are not that good at all. They don't look as good as the Gears of War Hivebusters characters IMO, or Plague Tale's, etc.

Horizon looks great though.
I do not agree with that statement at all.
I liked the look of GOW characters, but I can understand that some do not. At least he understood what I was trying to say.
 
I do not agree with that statement at all.
I think what's great about Ragnarok is the framerate it can reach on PS5: 80-90fps in the performance mode, while Horizon is usually 60-70fps. Also the resolution is higher (without CBR) in Ragnarok (still in the performance mode).

Overall I'd say both games are impressive in their own ways on PS5.
 

Kataploom

Gold Member
There is no subject matter to discuss.

There's no analysis.

As I said, the OP doesn't know what he's saying but acting like he does hoping people like you believe there's knowledge and technical detail in the post, where there isn't.

You are talking about "hard" data that doesn't exist in the way you think, because you took the bait.
Go tell us what's wrong so the discussion can get better. I'm enjoying the thread despite not participating, because I am not that deep into computer graphics programming... I'd like to read your take on the subject, with details and not just "because I know that I know" lol
 
Last edited:

Allandor

Member
I stopped reading after the OP's avatar disappeared from the screen.
What? It is fine on my 32k screen. Just one page :messenger_winking_tongue:

(Just kidding before someone thinks I have such a screen.... internet ...)

To make the answer short: no, in most cases it shouldn't make things more complicated or slower, as not that much data is processed in every frame.
Also, there is plenty of data in memory that just isn't read that extensively, so it really doesn't make such a difference.

But it would have made things easier if they had used a bigger interface for all the memory, because then the box would have had an even higher memory bandwidth. They really saved on the wrong end here in my opinion, but made the best out of the interface they have.
 

MarkMe2525

Gold Member
I wonder how this so-called SFS would interact with ray tracing too
I could be wrong, but because SFS deals with the textures and not the actual geometry, it would not interact with the RT structure directly. Indirectly, a game using SFS should have more available RAM and memory bandwidth to store and move the RT information.

This is my layman's understanding.
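To put rough numbers on the "more available RAM" idea, here's a back-of-envelope Python sketch. The 64 KB tile size and the 15% sampled-tiles figure are purely illustrative assumptions on my part, not measurements from any real game or from the SFS spec:

```python
# Back-of-envelope sketch: how much RAM a sampler-feedback-style streamer might
# save by keeping only sampled 64 KB tiles resident instead of a full mip chain.

TILE_BYTES = 64 * 1024          # hypothetical tile size used by tiled resources

def full_mip_chain_bytes(width, height, bytes_per_texel=1):
    """Total size of a texture plus all of its mips (BC7 is ~1 byte per texel)."""
    total = 0
    while width >= 1 and height >= 1:
        total += width * height * bytes_per_texel
        if width == 1 and height == 1:
            break
        width, height = max(1, width // 2), max(1, height // 2)
    return total

# Assumed scenario: a 4K x 4K BC7 texture where feedback says only ~15% of its
# tiles were actually sampled (the 15% figure is illustrative, not measured).
full = full_mip_chain_bytes(4096, 4096)
tiles = full // TILE_BYTES
resident = int(tiles * 0.15) * TILE_BYTES

print(f"full chain: {full / 2**20:.1f} MiB, resident: {resident / 2**20:.1f} MiB")
```

Whatever texture data doesn't need to stay resident is capacity (and bandwidth) that could instead hold things like the RT acceleration structures.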
 
Is this one of those users who makes long posts pretending they have content to discuss and that they are technical, but when someone who knows about game tech, development, or coding gets involved, they flee into the woods?

A lot of this poster's speculation is based on nothing, and they are coming to conclusions that only manifest from the OP having no idea how anything works while also hoping nobody else does, so they repeat console-war talking points you can find on YouTube from guys who still think the Windows registry is where you extend your warranty.

Welp, I'm here so, if you have points to address or feel I'm wrong on something, by all means share it. I don't run away from being proven wrong, so if someone brings something up I didn't consider or makes me look at a thing differently, no problem.

But I mean right now you're making some wild claims yourself and haven't done a single thing to show how what I mentioned is based on "nothing", or that I hope no one else knows what I'm bringing up. I mean there are a lot of meme posts about me writing a novel, but most people ITT generally know what I'm talking about.

In the meantime tho I got some other posts to catch up on reading so, have fun!
 
Your paranoia is funnier, I haven't even mentioned Primitive Shaders.
Mesh Shaders have a shorter, more granular pipeline, so they are better for devs and are now the industry standard. That probably won't make any huge difference to performance, but it's there, and this troll thread is about Xbox hardware, so it's a valid point. Same as when some clowns claimed hardware-assisted Tier 2 VRS was just a DX12U term until Doom Eternal came along.
Full RDNA2 is one thing, but Series consoles go beyond even that: larger groupings than RDNA2 for Mesh Shaders, SFS filters, and core adjustments for ML.
Bespoke forward looking hardware as I said.

An AMD engineer came out in a recent interview; there's not much difference between Primitive and Mesh Shaders in practice. It's just down to Mesh Shaders being a software implementation. Both PS5 and Series have Primitive Shader units in them.

Here's the thread with the article; here are some quotes from the article itself (translated to English)

If so, how is AMD's Primitive Shader currently handled? Sony Interactive Entertainment (hereinafter SIE) has promoted the adoption of the Primitive Shader in the PlayStation 5's GPU (related article), while the Mesh Shader is promoted in the Xbox Series X|S (XSX) (related link).

 What's interesting is that both PS5 and XSX are equipped with almost the same generation AMD GPUs, but the new geometry pipeline uses different technologies as standard.

Certainly, the Mesh Shader was adopted as standard in DirectX 12. However, the new geometry pipeline concept originally started from the idea of tidying up the complicated geometry pipeline and making it easier for game developers to use. In other words, it can be said that both AMD and NVIDIA had the same goal as the starting point of the idea. To be frank, the Primitive Shader and the Mesh Shader have many similarities in terms of functionality, although there are differences in implementation.

So did AMD abandon the Primitive Shader? In terms of hardware, the Primitive Shader still exists, and Mesh Shader functionality is realized with the Primitive Shader; you can picture it as corresponding to the Mesh Shader in that way.

Basically, Sony & Microsoft are using the same Primitive Shader Units for the same end results in both systems. But Sony is using an access implementation that is different from Microsoft's, which is using it as a Mesh Shader. The functionality is practically the same; it's just the implementation that's different.

That's directly from AMD engineer Mr. Wang, btw. Also, no, Series consoles don't have larger grouping than RDNA2 for Mesh Shader groups; they just have (had?) larger group sizes than what Nvidia supported at the time. But with further CUDA revisions since 2020, chances are Nvidia supports a similar group size (256) to Microsoft's; I'm just assuming on that point though.

The worst-case scenario I am talking about is the physical limit of what the CPU can actually consume from main RAM (if accessing the slowest pool, obviously). It can't use more of that bandwidth because its bus is limited, so it can't take 33% of the main bandwidth even if a dev were foolish enough to program the CPU to do so.

I don't remember the numbers, just the average bandwidth lost if all CPU bandwidth was used on the main ram: about 40GB/s

Okay, I think I get what you're saying then. I thought the CPUs in the new consoles would be a bit more bandwidth-hungry than that, but they are based on Zen 2 mobile-line chips (albeit customized), so I guess I shouldn't be surprised.

It's not really saying too much though in terms of scenarios where maybe the 10 GB capacity for fast data access isn't enough, because any scenario where the GPU needs data in the slower pool will still drag the GPU's peak bandwidth down. But now I'm more confident that even in those instances total effective system bandwidth would still be a portion higher than PS5's.
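Just to visualize the contention point with a naive subtraction sketch (the ~40 GB/s CPU figure is the one quoted above; real GDDR6 contention costs more than straight subtraction, so treat these leftovers as optimistic):

```python
# Naive shared-bus model: subtract a fixed CPU draw from the quoted peaks.
# Real contention (bank conflicts, read/write turnaround) costs extra, so these
# numbers are best-case, not measurements.
CPU_DRAW_GBPS = 40  # average CPU consumption figure quoted earlier in the thread

for name, peak_gbps in [("XSX 10 GB pool", 560), ("XSX 6 GB pool", 336), ("PS5", 448)]:
    left = peak_gbps - CPU_DRAW_GBPS
    print(f"{name}: ~{left} GB/s left for the GPU if the CPU pulls ~{CPU_DRAW_GBPS} GB/s")
```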

Although, as I saw someone else saying, the part of the pipeline that would possibly be most affected by bandwidth constriction is probably the ROPs.
 
Last edited:

Panajev2001a

GAF's Pleasant Genius
An AMD engineer came out in a recent interview; there's not much difference between Primitive and Mesh Shaders in practice. It's just down to Mesh Shaders being a software implementation. Both PS5 and Series have Primitive Shader units in them.

Here's the thread with the article; here are some quotes from the article itself (translated to English)





Basically, Sony & Microsoft are using the same Primitive Shader Units for the same end results in both systems. But Sony is using an access implementation that is different from Microsoft's, which is using it as a Mesh Shader. The functionality is practically the same; it's just the implementation that's different.

That's directly from AMD engineer Mr. Wang, btw. Also, no, Series consoles don't have larger grouping than RDNA2 for Mesh Shader groups; they just have (had?) larger group sizes than what Nvidia supported at the time. But with further CUDA revisions since 2020, chances are Nvidia supports a similar group size (256) to Microsoft's; I'm just assuming on that point though.
Thanks for posting the excerpt, I trusted the link to the thread would have been fine but eh 🤷‍♂️ heh.
 
I can’t comment much on this stuff myself, as a layman. But with my limited understanding of memory usage, programming concepts, and absolutely no knowledge of the video game industry - I’d say it seems like Microsoft over engineered this solution. In a perfect world it seems like XSX would have the advantage, but sometimes when you optimize something there comes a price to pay on overhead to manage the solution which eats into the gains in the best case, and can have unintended consequences.


I used to do a lot of cool little tricks in my programs, but in the real world it’s usually best to go with simple and readable, without having to rely on a black box to manage the minutia. Especially when consistent runtime performance is desired, in the case of video games. I think it comes down to having hardware and the OS designed by a software company (Microsoft) vs designed by a video game company (Sony). And not just a video game company, one with talent that learned their lesson from the PS3’s CELL mistake.
 
In addition to scrubbers, I think we should also factor in that within the PS5 GPU each CU has access to ~45% more L1 cache at higher bandwidth per shader array. This should reduce RAM accesses by nature, at least for compute, while also increasing compute efficiency/per-CU saturation.

Edit: Thinking about the cache hierarchy again, there is of course the L2 cache along the way before RAM, which offers (a more moderate) ~15% additional capacity per CU on PS5 and should also play a role in the matter of costlier RAM accesses.

Yep, and while Series X's GPU has 1 MB of additional L2$, I'm not totally sure how much of a benefit that provides, as it would likely fluctuate game to game, and with more demanding titles an extra 1 MB of L2$ simply may not be enough (particularly when it runs at a lower clock compared to PS5's 4 MB of GPU L2$).
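For anyone wondering where the ~45% and ~15% figures come from, the per-CU arithmetic works out roughly like this (assuming the commonly cited RDNA2 layout of 128 KB of graphics L1 per shader array and four shader arrays per GPU; those layout numbers are my assumptions, not anything officially broken down by Sony or MS):

```python
# Per-CU cache math: PS5 = 36 CUs / 4 MB L2, Series X = 52 CUs / 5 MB L2,
# with the graphics L1 assumed to be 128 KB shared per shader array (4 arrays).

def per_cu(cache_kb, cus):
    return cache_kb / cus

ps5_l1 = per_cu(128, 36 / 4)    # L1 is shared by the CUs in one shader array
xsx_l1 = per_cu(128, 52 / 4)
ps5_l2 = per_cu(4 * 1024, 36)
xsx_l2 = per_cu(5 * 1024, 52)

print(f"L1 per CU: PS5 {ps5_l1:.1f} KB vs XSX {xsx_l1:.1f} KB (+{ps5_l1 / xsx_l1 - 1:.0%})")
print(f"L2 per CU: PS5 {ps5_l2:.1f} KB vs XSX {xsx_l2:.1f} KB (+{ps5_l2 / xsx_l2 - 1:.0%})")
```

That lands on roughly +44% for L1 and +16% for L2 per CU, in line with the figures quoted above.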

Thanks for posting the excerpt, I trusted the link to the thread would have been fine but eh 🤷‍♂️ heh.

No prob; the original article's machine translated anyway I think. That might cause some issues for people on phone so just posted those parts in the quotes.

You read all of the OP's post and have a problem with people clogging the thread with whining?


Yeah, I mean on one hand I can understand his point, but on the other hand I'm not personally bothered by any of those posts. They're funny. I know it's become a meme at this point; I've broken character limits one too many times anyway.

Rule of thumb, if there's a gif or a pic in the reply, chances are it's a meme.

On one hand, the Series X would probably gain in development effectiveness by just putting a 2GB GDDR6 chip on every channel, giving it the full 560GB/s across all memory.
On the other hand, during the hardware design they might have concluded that the console doesn't really need 560GB/s to start with, and they didn't predict the scale of the memory contention issues they ended up having.


If anything, the PS5 seems to be pretty well balanced and it "only" uses 448GB/s. There's a common (mis)conception that memory bandwidth should be adjusted to compute throughput in GPUs, so a Series X with 18% higher shader throughput than the PS5 should also get higher memory bandwidth. However, IIRC shader processors aren't the most memory bandwidth intensive components in a GPU, those would be the ROPs which are usually hardwired to the memory controllers in discrete GPUs. There's the notable case of the PS4 Pro being a "monster" in theoretical pixel fillrate with its 64 ROPs but official documentation being clear about the chip not being able to reach anywhere close to its limit because of a memory bandwidth bottleneck.

The PS5 and the Series X have the same pixel rasterizer (ROP) throughput per clock but the PS5 has higher clocks, so the PS5's design might actually be more bandwidth-demanding than the Series X.

So it could be that the reason the Series X uses a 320-bit memory controller has little to do with running videogames.
The PS5 has one purpose alone, which is to run videogames. The Series X serves two purposes: to run videogames and to accelerate compute workloads in Azure servers. The Series X chip was co-designed by the Azure Silicon Architecture team and that's actually the team that originally presented the solution at HotChips 2020. The 320bit memory controller could be there to let the SoC access a total of 20GB (or even 40GB if they use clamshell) of system memory in Azure server implementations.
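For reference, the bus math behind those bandwidth and capacity figures is simple to sanity-check (assuming the publicly quoted 14 Gbps GDDR6 and 32-bit channels):

```python
# Peak bandwidth = bus width (bits) x data rate per pin / 8 (bits -> bytes).
def peak_bandwidth_gbps(bus_bits, gbps_per_pin=14):
    return bus_bits * gbps_per_pin / 8

print(peak_bandwidth_gbps(320))  # 560.0 -> all ten XSX chips (fast 10 GB region)
print(peak_bandwidth_gbps(192))  # 336.0 -> only the six 2 GB chips (slow 6 GB region)
print(peak_bandwidth_gbps(256))  # 448.0 -> PS5's 256-bit bus
# Capacity with 2 GB on every 32-bit channel: 10 x 2 GB = 20 GB,
# or 40 GB in clamshell (two chips per channel), as speculated above.
```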



Microsoft's dual-use design was obviously going to bring some setbacks and the most obvious one is the fact that they needed to produce a 20% larger chip on the same process, to run videogames with about the same target IQ.
As for the memory pools with uneven bandwidths and the memory contention issues that they brought, it might have been something Microsoft didn't see coming and perhaps they should have used only 8 channels / 256bit on the gaming implementation of the Series X SoC.
Or perhaps someone did see it coming, but the technical marketing teams wanted to have stuff to gloat about and developers were going to have to adapt to the uneven memory pools regardless, for the Series S.

You know, I would be very interested to know just when Microsoft fully rolled into development of the Xbox Series platforms. Something keeps telling me that most of their development (at least for the iterations we now have) didn't fully kick off until late 2017. There are some things, like the reduction in funding (which would have also impacted R&D) to Xbox during Myerson's run, that could have also hampered progress of the hardware development. But that is something a bit different, moving into a separate topic.

Were the members of the Azure team who worked on the new Xboxes also accompanied by members of the Surface team? Because I think I've heard in the past that some of the Surface team also assisted in the development (maybe more in terms of the aesthetic and ergonomics of the boxes) of Series X and S. But as far as Series X's dual-purpose design nature, I think that's something which has been at least slightly known since 2020. I remember MLiD mentioning that in his console analysis videos from then before the consoles launched. That's actually the first place I saw it being brought up.
 

BeardGawd

Banned
Why are you only looking at bandwidth per second?

At 30fps, the GPU-optimized memory on Series X allows for 18.7 GB per frame. The PS5's memory allows for 14.9 GB per frame.

Which allows for higher bandwidth when it is needed and less bandwidth when it's not (CPU tasks).
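For anyone following along, those per-frame figures are just the per-second numbers divided by the framerate:

```python
# Per-frame bandwidth budget = peak bandwidth / framerate (no contention modeled).
for name, gbps in [("XSX 10 GB pool", 560), ("PS5", 448)]:
    for fps in (30, 60):
        print(f"{name}: {gbps / fps:.1f} GB per {1000 / fps:.1f} ms frame at {fps} fps")
```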
 

Thebonehead

Gold Member
Alright, GTFO with your condescending tone.

If you are gonna discredit what someone has said, then it's on you to state otherwise and make your point. Then you can be reasoned with and actual insight will be shared.

All you have been doing is spouting a bunch of nonsense with nothing to actually back it up, while at the same time taking digs at users in an attempt to make them look stupid.

Say something meaningful, or if you can't be bothered, then just shut up.
 

BlackTron

Gold Member
My attention span didn't last longer than a few paragraphs for this topic, but don't forget that Xbox is running Microsoft software under the hood. Isn't that enough to require an extra bit of power to achieve parity with a weaker system by itself?
 
Sony learnt its lesson from the PS3. When Mark Cerny took over, he pushed for the PS4 console to be as easy to program as possible.
In fact, during his Road to PS5 talk he led with this priority at the beginning of the talk.
Time to triangle: reducing the time required for developers to get to grips with exploiting the hardware.
He put up the following time metrics for each console.
PS1 : 1-2 months
PS2 : 3-6 months
PS3 : 6-12 months
PS4 : 1-2 months

With the PS5 they brought that down even further, to less than 1 month.
They did this by keeping the development environment, libraries and tools exactly the same.
The PS4 Pro started this design philosophy for Sony. It used a butterfly GPU design, which helped keep BC simple and the development environment the same.
With the Xbox One X we saw a completely different mid-gen refresh approach.
Compared to the Xbox One, the XOX had more RAM, a different type of RAM, and a GPU that didn't need to be a butterfly design to keep BC.
This is because MS has a different API design that allows for flexibility. A game developed for Xbox needs to work on all types of PC setups. It needs to be widespread. This has the benefit of not keeping Xbox console development hinged on certain hardware configurations to keep BC.
Remember, every strength is also a weakness.
This gives Xbox far better BC abilities than PlayStation, but it also means that games on DX don't get optimised for every possible PC setup, and the Xbox is just another one of those possible PC configurations.

This gen saw the following.
The PS5 kept the same GPU CU count as the PS4 Pro.
The PS5 kept the same API as the PS4 Pro.
The PS5 kept the same development environment, tools and libraries as the PS4.
During the PS4 generation developers got the absolute most out of what that console was capable of, meaning they jumped straight into the PS5 with similar abilities to get the most out of the hardware.

On the other hand, the Xbox Series had a new API. It has a totally different hardware setup from the XO. It had all-new tools and libraries for developers to use.

This direction from Sony gave it an absolute kick-start into the new generation.

It has nothing to do with RAM setups, nothing to do with TFLOPs or Cache Scrubbers. It purely comes down to a better development environment.
 

Lysandros

Member
Although, as I saw someone else saying, the part of the pipeline that would possibly be most affected by bandwidth constriction is probably the ROPs.
As per the RDNA white paper, the GPU L1 cache directly services requests from the ROPs within the shader array. Since that block has 22% higher bandwidth on PS5 and has to feed four fewer (~30% fewer) CUs per shader array, I would say that PS5 should have enough additional headroom to feed its beefier/faster ROPs in fill-rate-intensive situations.
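Quick numbers on the fill-rate side, assuming the usual 64-ROP count for both GPUs and the public boost clocks (the ROP count is my assumption, not something either vendor has broken down in detail):

```python
# Theoretical pixel fill rate = ROPs x clock (GHz) -> Gpixels/s.
ROPS = 64
ps5_fill = ROPS * 2.23    # PS5 at up to 2.23 GHz
xsx_fill = ROPS * 1.825   # Series X at 1.825 GHz

print(f"PS5 ~{ps5_fill:.0f} Gpix/s vs XSX ~{xsx_fill:.0f} Gpix/s "
      f"({ps5_fill / xsx_fill - 1:.0%} more pixels per second to keep fed)")
```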
 
Last edited:
Why are you only looking at bandwidth per second?

At 30fps, the GPU-optimized memory on Series X allows for 18.7 GB per frame. The PS5's memory allows for 14.9 GB per frame.

Which allows for higher bandwidth when it is needed and less bandwidth when it's not (CPU tasks).

My main point in this thread was to look into examples where the 10 GB of GPU-optimized memory may not be enough for Series X, so the GPU has to access data from the other 6 GB pool. In those scenarios, the effective GPU bandwidth is reduced in proportion to the time it spends accessing the 6 GB pool. So if there's a scenario where, for example, it needs to access a specific block of data that's only in the 6 GB pool for 30 or so frames (of a 30 or so FPS game), then effective GPU bandwidth is reduced to a peak of 336 GB/s.
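To make that concrete, here's a simple time-weighted sketch of what "effective bandwidth" could look like depending on how much of its memory time the GPU spends in the slower pool (the fractions are purely illustrative, not measured from any game):

```python
# Time-weighted blend of the two XSX pool bandwidths.
def blended_bw(frac_time_in_slow_pool, fast=560.0, slow=336.0):
    return frac_time_in_slow_pool * slow + (1.0 - frac_time_in_slow_pool) * fast

for frac in (0.0, 0.1, 0.25, 0.5, 1.0):
    print(f"{frac:.0%} of memory time in the 6 GB pool -> ~{blended_bw(frac):.0f} GB/s effective")
```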

That could be offset by swapping out 1 GB worth of data from the 10 GB pool to the slower pool and swapping in the GPU data that's been allocated to the 6 GB pool, but that will either result in 1 GB of data being permanently erased from RAM, or you'd need a 1 GB reserved scratchpad buffer in the 6 GB pool to hold the transferred data of the two pools without deleting anything, which could be a point of contention since only 3.5 GB of the 6 GB pool is freely available for game applications.

But, TBF, it's been noted by Physiognomonics that my CPU usage example in the OP was too high, and Hoddi made some good points about SFS's role in reducing texture access sizes. Even Riky had a decent point about SFS there, but Panajev2001a seems to have a very strong understanding of it and related techniques from what I've read, and he's done a good job showing that maybe some of these SFS benefits aren't the "big solution" others make them out to be, when equivalent features have been usable on prior AMD GPUs and devs have implemented software solutions since even the PS2 era.

So I'm still basically left asking: if more data-demanding games become more prolific later in the generation, and 10 GB isn't necessarily enough for the GPU and GPU-bound tasks fall upon parts that are more bandwidth-demanding anyway (such as the ROPs), how will reserving space for extra data in the other 3.5 GB pool (since 2.5 GB is reserved for the OS) impact total bandwidth access of Series X's GPU compared to PS5? And will the worst-case scenarios still provide enough of a delta to ensure the GPU can do what it needs while maintaining performance parity with PS5?

I think that becomes even more important considering other small differences between the designs of the two systems which can start to add up, like the PS5's cache scrubbers alleviating dependency on RAM access, and situations where other parts of the system need access to RAM (like SSD I/O transfers), where PS5 has the advantage of not taxing the CPU for those the way Series X does, stuff like that.

And again, I'm not saying any of this memory stuff is the reason for some of the performance results in multiplats we've been seeing of late. But if these types of results continue to occur in the future, or increase in frequency in favor of PS5, and it becomes easier to pin some portion of that on aspects of the memory, maybe what's being discussed ITT can help give some perspective for ascertaining causes when those scenarios pop up (IF they pop up).
 

Mr.Phoenix

Member
Why are you only looking at bandwidth per second?

At 30fps, the GPU-optimized memory on Series X allows for 18.7 GB per frame. The PS5's memory allows for 14.9 GB per frame.

Which allows for higher bandwidth when it is needed and less bandwidth when it's not (CPU tasks).
He did it that way to just make this concept easier to grasp.

But since you want to be anal about it.

In any given frame, it's not just the GPU requiring the bandwidth. You know that, right? So you can whittle everything he said down to 16ms for 60fps games and 33ms for 30fps games.

During deferred rendering, there is a ton of stuff the CPU is doing for every frame, and even for subsequent frames, with data updated from the current frame and user input. So it's not like during that one frame you just have the CPU idle and the GPU on load. So there is no such thing as "when it's needed and when it's not."

You will discover, however, that when we start looking at the time like this in ms, even a 1ms or 2ms overhead/bottleneck from anything that has to do with memory management, or your system as a whole, that isn't shared by the other platform becomes a very, very big issue.
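To put that 1-2 ms in perspective against the frame budget (pure arithmetic, no claims about any specific game):

```python
# Share of the frame budget eaten by a fixed memory-management stall.
for fps in (30, 60):
    budget_ms = 1000 / fps
    for stall_ms in (1, 2):
        print(f"{fps} fps: a {stall_ms} ms stall costs {stall_ms / budget_ms:.0%} "
              f"of the {budget_ms:.1f} ms budget")
```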

Hope that helps.
 
Last edited:
I can’t comment much on this stuff myself, as a layman. But with my limited understanding of memory usage, programming concepts, and absolutely no knowledge of the video game industry - I’d say it seems like Microsoft over engineered this solution. In a perfect world it seems like XSX would have the advantage, but sometimes when you optimize something there comes a price to pay on overhead to manage the solution which eats into the gains in the best case, and can have unintended consequences.


I used to do a lot of cool little tricks in my programs, but in the real world it’s usually best to go with simple and readable, without having to rely on a black box to manage the minutia. Especially when consistent runtime performance is desired, in the case of video games. I think it comes down to having hardware and the OS designed by a software company (Microsoft) vs designed by a video game company (Sony). And not just a video game company, one with talent that learned their lesson from the PS3’s CELL mistake.

You and a few other posters have brought up Sony learning lessons from Cell and I think that's accurate. I mean Cell really banged Sony up badly; amazing tech but they didn't do enough to have a dev environment suitable for 3P developers early enough.

Let alone they overestimated Cell's performance for GPU work and had to rush in the RSX (wonder what a dual-Cell PS3 could've been like though; one for CPU and the other completely dedicated for graphics).

Sony learnt its lesson from the PS3. When Mark Cerny took over, he pushed for the PS4 console to be as easy to program as possible.
In fact, during his Road to PS5 talk he led with this priority at the beginning of the talk.
Time to triangle: reducing the time required for developers to get to grips with exploiting the hardware.
He put up the following time metrics for each console.
PS1 : 1-2 months
PS2 : 3-6 months
PS3 : 6-12 months
PS4 : 1-2 months

With the PS5 they brought that down even further, to less than 1 month.
They did this by keeping the development environment, libraries and tools exactly the same.
The PS4 Pro started this design philosophy for Sony. It used a butterfly GPU design, which helped keep BC simple and the development environment the same.
With the Xbox One X we saw a completely different mid-gen refresh approach.
Compared to the Xbox One, the XOX had more RAM, a different type of RAM, and a GPU that didn't need to be a butterfly design to keep BC.
This is because MS has a different API design that allows for flexibility. A game developed for Xbox needs to work on all types of PC setups. It needs to be widespread. This has the benefit of not keeping Xbox console development hinged on certain hardware configurations to keep BC.
Remember, every strength is also a weakness.
This gives Xbox far better BC abilities than PlayStation, but it also means that games on DX don't get optimised for every possible PC setup, and the Xbox is just another one of those possible PC configurations.

This gen saw the following.
The PS5 kept the same GPU CU count as the PS4 Pro.
The PS5 kept the same API as the PS4 Pro.
The PS5 kept the same development environment, tools and libraries as the PS4.
During the PS4 generation developers got the absolute most out of what that console was capable of, meaning they jumped straight into the PS5 with similar abilities to get the most out of the hardware.

On the other hand, the Xbox Series had a new API. It has a totally different hardware setup from the XO. It had all-new tools and libraries for developers to use.

This direction from Sony gave it an absolute kick-start into the new generation.

It has nothing to do with RAM setups, nothing to do with TFLOPs or Cache Scrubbers. It purely comes down to a better development environment.

Well said, and ultimately, you're right in this assessment. I do think things like complications with quirks in the memory setup could play a role as time goes on, but ultimately they would never be the most critical. After all, there are systems with way more complex and hamstrung memory setups that ended up seeing success in the past both commercially and with developers.

It can be said that the changes Microsoft made with their SDK and APIs had to be done, after what occurred with the XBO. It can even be argued that their timing wasn't any different than Sony's when Sony had to change everything from PS3 to PS4. Where Microsoft messed up, IMO, was not having 1P software in development soon enough to ensure their own studios could get things running off to a strong start.

PS4 was very different from PS3 in almost all aspects but we still saw some really solid 1P games right from the first year or two like Killzone:ShadowFall and InFamous:Second Son. That PS4 had more obvious advantages over XBO (such as almost 3x the RAM bandwidth, 50% more TF performance, more pixel fillrate performance and a less resource-hungry OS plus a better-integrated hypervisor) helped, but Sony did seem to make sure their 1P teams were taking the steps needed to at least get technical bearing on PS4 ASAP even as they were also finishing off the PS3 strongly.

Microsoft really messed up in their priorities near end of XBO. They turned to services and BC programs instead of making sure the 1P they had at the time were getting acclimated with GDK as early as possible, because MS themselves were running late getting GDK up to speed. They didn't have their 1P give strong enough final sendoffs for the XBO; maybe a few more games at least up to the level of Gears 5 would've prepared more of their 1P for the technical and design ramp-up of 9th-gen.

MS were too slow investing in a proper dev environment in time for the new console launches, so they're forever playing catch-up; they may have complicated the dev environment even further with Series S acting as a constricting element, and they didn't foster their 1P in time for the new system launches.

As per the RDNA white paper, the GPU L1 cache directly services requests from the ROPs within the shader array. Since that block has 22% higher bandwidth on PS5 and has to feed four fewer (~30% fewer) CUs per shader array, I would say that PS5 should have enough additional headroom to feed its beefier/faster ROPs in fill-rate-intensive situations.

Oh, so you were referring to the caches? I thought you were also referring to the RAM, or maybe that was for illustrating another point. When you think about it, 22% higher L1$ bandwidth for the ROPs on PS5, when those caches already have fewer CUs to feed than on Series X, sounds like it could create some scary gulfs in scenarios where cache bandwidth for the ROPs is the determining factor for performance in a given frame.
 

Lysandros

Member
It has nothing to do with RAM setups, nothing to do with TFLOPs or Cache Scrubbers. It purely comes down to a better development environment.
This seems to be a pretty bold (almost insane) claim. It has 'everything' to do with hardware. I hope you are aware that by this logic you are saying PS5 would run exactly the same without the Cache Scrubbers, and/or with, let's say, 8TF of compute and/or 10 GB of RAM, right?... Why not a 2 GHz CPU while at it? A piece of software's speed/performance is absolutely influenced, or even dictated, by the confines and specifics of the hardware on which it's running. Software isn't an independent entity which wanders around like a lost soul. It can always be more optimized, meaning better tailored to the specific 'hardware' (again) on which it is running, regardless of the API.
 

Mr.Phoenix

Member
This seems to be a pretty bold (almost insane) claim. It has 'everything' to do with hardware. I hope you are aware that by this logic you are saying PS5 would run exactly the same without the Cache Scrubbers, and/or with, let's say, 8TF of compute and/or 10 GB of RAM, right?... Why not a 2 GHz CPU while at it? A piece of software's speed/performance is absolutely influenced, or even dictated, by the confines and specifics of the hardware on which it's running. Software isn't an independent entity which wanders around like a lost soul. It can always be more optimized, meaning better tailored to the specific 'hardware' (again) on which it is running, regardless of the API.
Couldn't have said it better myself. And this is something I fear people tend to overlook.

It's like that saying, `treating the infection and not the wound that caused the infection`.

Yes, we can say it all boils down to one having a better development environment than the other, but the only reason it has a better development environment (and better isn't even the term here, it's familiar) is that Sony clearly went out of their way to make the PS5 as close in design to the PS4 as possible. And they added things that would make stuff that was a bottleneck on the PS4 non-existent, or that allow new things to be built on top of what was already there with the PS4.

That is a hardware design, and the software that runs in that environment runs within the confines of it. The XS approach means there is a harder or different/new learning curve all over again, or time spent learning new ways to do old tricks. So the net result is that it ends up taking longer to get the same results you would get on the PS5.
 

onQ123

Member
For now the engines favor fast speeds over MS's setup, but the Series consoles have a forward-looking design. As we still get cross-generation games, none are properly built with more parallel execution in mind. Until engines are adjusted we'll keep seeing this.
I'm willing to bet there will be current gen only games that will choke Xbox Series X out before PS5.
 

Fafalada

Fafracer forever
It's an interesting topic OP - I'm going off on a tangent below (not so much to your original post) but bear with me.

I think it's worth touching on the commonly maligned 'oh but memory juggling is hard' narrative as well (irrespective of system's total performance).
Ie. specifically:
It is also not "split" in a way wherein the 10 GB and 6 GB are treated as separate virtual memory addresses.
Yes it's unified address space - but it's worth pointing out that specifically - variable-performance memory access, inside a unified address space has been common practice for decades in console space.
Eg. even the much complained about PS3 memory setup was accessible to CPU/GPU as unified address space. Or if you want even one better - PSP had a unified address range for entirety of all of its physical memory packages (covering 2 separate pools of eDram, external Dram, and 2 pools of SRam on the CPU).

On the other side - some systems also map different address ranges to the same physical memory addresses but with different access patterns. Eg. the famed Onion / Garlic buses on PS4, or the more primitive variant of the same thing on PS2 via Cached/Uncached/Uncached-accelerated memory mapping. And PS2 was the first console to adopt HUMA monicker (before GPU industry anyway) though the H stood for hybrid there :p

Anyway the point here - if it's not obvious - is that memory pool juggling has been par for the course for... over 2 decades now, across most consoles, and while some setups were easier than others, no system really escaped unscathed. Even X360 memory access patterns were potentially more complex than XSX as you had to juggle the balance of eDram and Dram access, and even the original XBox - for all its convenience of fully unified Ram - was paired with memory too slow to keep up with everything its GPU could do, so it required a balancing act to not bandwidth-starve the GPU (or rest of the system) that PS2/GC didn't have to deal with, for instance.


But to tie this back to the main topic of the thread - raw bandwidth numbers are indeed, only a part of the equation of any given memory subsystem. One of my favorite examples of how misleading looking at just bandwidth can be, was actually the PSP, which on-paper, had memory performance competing with the GameCube, but real-world (due to how the bus was designed) overall system didn't come anywhere close to that.
 

Mr.Phoenix

Member
It's an interesting topic OP - I'm going off on a tangent below (not so much to your original post) but bear with me.

I think it's worth touching on the commonly maligned 'oh but memory juggling is hard' narrative as well (irrespective of system's total performance).
Ie. specifically:

Yes it's unified address space - but it's worth pointing out that specifically - variable-performance memory access, inside a unified address space has been common practice for decades in console space.
Eg. even the much complained about PS3 memory setup was accessible to CPU/GPU as unified address space. Or if you want even one better - PSP had a unified address range for entirety of all of its physical memory packages (covering 2 separate pools of eDram, external Dram, and 2 pools of SRam on the CPU).

On the other side - some systems also map different address ranges to the same physical memory addresses but with different access patterns. Eg. the famed Onion / Garlic buses on PS4, or the more primitive variant of the same thing on PS2 via Cached/Uncached/Uncached-accelerated memory mapping. And PS2 was the first console to adopt HUMA monicker (before GPU industry anyway) though the H stood for hybrid there :p

Anyway the point here - if it's not obvious - is that memory pool juggling has been par for the course for... over 2 decades now, across most consoles, and while some setups were easier than others, no system really escaped unscathed. Even X360 memory access patterns were potentially more complex than XSX as you had to juggle the balance of eDram and Dram access, and even the original XBox - for all its convenience of fully unified Ram - was paired with memory too slow to keep up with everything its GPU could do, so it required a balancing act to not bandwidth-starve the GPU (or rest of the system) that PS2/GC didn't have to deal with, for instance.


But to tie this back to the main topic of the thread - raw bandwidth numbers are indeed, only a part of the equation of any given memory subsystem. One of my favorite examples of how misleading looking at just bandwidth can be, was actually the PSP, which on-paper, had memory performance competing with the GameCube, but real-world (due to how the bus was designed) overall system didn't come anywhere close to that.
While memory juggling has always been a thing, because it's something every platform has in some form or another, that doesn't negate its issues. And the more of it you have to do on any given platform, the more complex it is for you.

e.g. say on the PS5, for any task, you take a pathway like this: (1) SSD > (2) RAM > (3) CPU/GPU > (4) cache > (5) core. That is straightforward for devs. At any given time, they know exactly what they will be getting as far as bandwidth at each stage in that chain, regardless of what task they are doing.

Now take the XSX: SSD > RAM > CPU/GPU > cache > core. While it may seem identical to the PS5, step (2) here is more complicated. And depending on how much RAM is needed by a specific component in step (3), adjustments or accommodations have to be specifically made in step (2), or else the performance in steps (4) and (5) may suffer.

Now this is not to say that the XSX will always underperform, but that more work and care/thought need to go into it to get the most out of it. And unless it is carefully optimized with a fine-tooth comb, accounting for every possible instance where such an issue could present itself, you will see performance hits.

This issue is further exacerbated by how devs optimize their engines. Rather than seek out every possible scenario where something may happen (and you can't blame them, there could be thousands of them), they work in ranges of best fit. E.g. "OK, rather than assume we have 560GB/s peak bandwidth, let's say we have only 480GB/s; this way, we account for any time our engine would otherwise have been bandwidth starved." The problem with ranges like that is that it takes overall performance down, while also never really accounting for the worst-case scenario.
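Using the example figures above (planning for 480 of a 560 GB/s peak, which is just an illustration, not a real engine setting), the per-frame cost of that safety margin looks like this:

```python
# What a derated bandwidth budget costs per frame.
peak, derated = 560, 480   # GB/s; 480 is the example figure from the post above
for fps in (30, 60):
    print(f"{fps} fps: {peak / fps:.2f} GB/frame at peak vs {derated / fps:.2f} GB/frame derated "
          f"(~{(peak - derated) / fps:.2f} GB of headroom left unused)")
```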
 

ChiefDada

Gold Member
Sony learnt its lesson from the PS3. When Mark Cerny took over, he pushed for the PS4 console to be as easy to program as possible.
In fact, during his Road to PS5 talk he led with this priority at the beginning of the talk.
Time to triangle: reducing the time required for developers to get to grips with exploiting the hardware.
He put up the following time metrics for each console.
PS1 : 1-2 months
PS2 : 3-6 months
PS3 : 6-12 months
PS4 : 1-2 months

With the PS5 they brought that down even further, to less than 1 month.
They did this by keeping the development environment, libraries and tools exactly the same.
The PS4 Pro started this design philosophy for Sony. It used a butterfly GPU design, which helped keep BC simple and the development environment the same.
With the Xbox One X we saw a completely different mid-gen refresh approach.
Compared to the Xbox One, the XOX had more RAM, a different type of RAM, and a GPU that didn't need to be a butterfly design to keep BC.
This is because MS has a different API design that allows for flexibility. A game developed for Xbox needs to work on all types of PC setups. It needs to be widespread. This has the benefit of not keeping Xbox console development hinged on certain hardware configurations to keep BC.
Remember, every strength is also a weakness.
This gives Xbox far better BC abilities than PlayStation, but it also means that games on DX don't get optimised for every possible PC setup, and the Xbox is just another one of those possible PC configurations.

This gen saw the following.
The PS5 kept the same GPU CU count as the PS4 Pro.
The PS5 kept the same API as the PS4 Pro.
The PS5 kept the same development environment, tools and libraries as the PS4.
During the PS4 generation developers got the absolute most out of what that console was capable of, meaning they jumped straight into the PS5 with similar abilities to get the most out of the hardware.

On the other hand, the Xbox Series had a new API. It has a totally different hardware setup from the XO. It had all-new tools and libraries for developers to use.

This direction from Sony gave it an absolute kick-start into the new generation.

It has nothing to do with RAM setups, nothing to do with TFLOPs or Cache Scrubbers. It purely comes down to a better development environment.

Fantastic points all around, but you went left towards the end. We can't ignore instances of developers explaining how/why hardware differences also play a part in performance differentials.

From DF interview with Touryist developer, Shin-En:

Typically, in the PC space, to get a faster GPU, manufacturers produce 'wider' designs that run at the same clocks as less capable parts - or even slower. Xbox Series X follows the same pattern. Its GPU runs at a slower clock, but should be more capable overall as it has many more compute units. Shin'en tells us that in the case of its engine, the increase to clock frequencies and the difference in memory set-up makes the difference. Beyond this, rather than just porting the PS4 version to PS5, Shin'en rewrote the engine to take advantage of PS5's low-level graphics APIs.

I am sure there are many other developers who can point to hardware differences resulting in better performance on one platform vs another, but because of the toxic environment among the gaming public and the need for developers to maintain as much of an appearance of objectivity as possible for business relationships, it is unfortunately rare to get candid discussions on this from them. But when they are willing to do so, we should always incorporate their expert knowledge into our assessments and discussions.
 
I think in the future what will really make the difference between those consoles is the cache scrubbers combined with the custom I/O (those decompression units equivalent to 9 Zen 2 cores) available on PS5. The other advantages of the PS5 (clocks, caches, truly unified main RAM) will make much less of a difference.

I think it's fair to use the Spider-Man games and Ratchet & Clank as the very likely benchmark for those technologies. Nothing on XSX looks even remotely similar to what they are pushing in those games (open world, so heavy I/O, RT reflections, at ~1080-1440p stable 60fps). And even on PC you need something like twice the CPU and GPU power to reach a similar graphical level with the same performance stability above 60fps (based on the Insomniac ports).

I also think those technologies are very probably not used in any of those multiplats, only on select first-party games on PS5, very likely games done by Insomniac and maybe Bluepoint.

More about how cache scrubbers work:
 
Last edited:

Lysandros

Member
I think it would be adequate to leave the stage to the actual chief software engineer of PS5 for a moment:

[Screenshot: quote from the PS5's chief software engineer]


I think this pretty much confirms how big of a role the often overlooked Cache Scrubbers (will) play this generation in the context of respective console performance.
 
Last edited:
This seems to be a pretty bold (almost insane) claim. It has 'everything' to do with hardware. I hope you are aware that by this logic you are saying PS5 would run exactly the same without the Cache Scrubbers, and/or with, let's say, 8TF of compute and/or 10 GB of RAM, right?... Why not a 2 GHz CPU while at it? A piece of software's speed/performance is absolutely influenced, or even dictated, by the confines and specifics of the hardware on which it's running. Software isn't an independent entity which wanders around like a lost soul. It can always be more optimized, meaning better tailored to the specific 'hardware' (again) on which it is running, regardless of the API.
They have the same basic hardware. We aren't talking about a PS3 vs X360. We aren't talking about the PS4 using GDDR5 and the XO using DDR3 and an ESRAM setup.
We aren't talking about PS5 using an Nvidia GPU and Xbox an AMD one. We aren't talking about PS5 using an Intel CPU and Xbox an AMD one.

They both have the same amount and the same type of RAM.
They both have a Zen 2 CPU with the exact same core count and a minor speed difference.
They both have an AMD GPU with a minimal amount of difference between them.
They both have an SSD.
These two consoles are closer in specs than in any other generation, ever.

One has Primitive Shaders, the other has Mesh Shaders. As per that AMD interview I put up, they both do the same thing.

From a hardware point of view, if you want to get technical, the XSX is a more advanced system. It has a higher clock speed, a more powerful GPU, and faster RAM bandwidth. However, a better development environment will trump those hardware advantages every day of the week.
How many times have we seen a game released with terrible issues, only to get a patch that fixes a lot of them? And that's on the same hardware. This should show you that performance depends massively on the actual developers, more so than on the actual hardware.
 
You and a few other posters have brought up Sony learning lessons from Cell and I think that's accurate. I mean Cell really banged Sony up badly; amazing tech but they didn't do enough to have a dev environment suitable for 3P developers early enough.

Let alone they overestimated Cell's performance for GPU work and had to rush in the RSX (wonder what a dual-Cell PS3 could've been like though; one for CPU and the other completely dedicated for graphics).



Well said, and ultimately, you're right in this assessment. I do think things like complications with quirks in the memory setup could play a role as time goes on, but ultimately they would never be the most critical. After all, there are systems with way more complex and hamstrung memory setups that ended up seeing success in the past both commercially and with developers.

It can be said that the changes Microsoft made with their SDK and APIs had to be done, after what occurred with the XBO. It can even be argued that their timing wasn't any different than Sony's when Sony had to change everything from PS3 to PS4. Where Microsoft messed up, IMO, was not having 1P software in development soon enough to ensure their own studios could get things running off to a strong start.

PS4 was very different from PS3 in almost all aspects but we still saw some really solid 1P games right from the first year or two like Killzone:ShadowFall and InFamous:Second Son. That PS4 had more obvious advantages over XBO (such as almost 3x the RAM bandwidth, 50% more TF performance, more pixel fillrate performance and a less resource-hungry OS plus a better-integrated hypervisor) helped, but Sony did seem to make sure their 1P teams were taking the steps needed to at least get technical bearing on PS4 ASAP even as they were also finishing off the PS3 strongly.

Microsoft really messed up in their priorities near end of XBO. They turned to services and BC programs instead of making sure the 1P they had at the time were getting acclimated with GDK as early as possible, because MS themselves were running late getting GDK up to speed. They didn't have their 1P give strong enough final sendoffs for the XBO; maybe a few more games at least up to the level of Gears 5 would've prepared more of their 1P for the technical and design ramp-up of 9th-gen.

MS were too slow investing in a proper dev environment in time for the new console launches, so they're forever playing catch-up; they may have complicated the dev environment even further with Series S acting as a constricting element, and they didn't foster their 1P in time for the new system launches.



Oh, so you were referring to the caches? I thought you were also referring to the RAM, or maybe that was for illustrating another point. When you think about it, 22% higher L1$ bandwidth for the ROPs on PS5, when those caches already have fewer CUs to feed than on Series X, sounds like it could create some scary gulfs in scenarios where cache bandwidth for the ROPs is the determining factor for performance in a given frame.
Microsoft made a bet.
They incorporated PC and Xbox development into one dev kit, the GDK. Prior to that they had an Xbox-specific one called the XDK.
They bet that by giving devs the ability to develop Xbox and PC games side by side in the one dev kit, it would allow devs to exploit the new DX12U features quicker. On top of that, the majority of games do their initial development on PC and then port across to the consoles. Having the GDK would, by default, hopefully mean that the XSX and PC would be the lead development platforms for all games.

One of the big issues for Xbox is that from a development point of view it's just another PC configuration. Because a PC game needs to work across so many different types of GPUs, CPUs and memory setups, the games never get fully optimised for each one. Historically that's why consoles would always get more performance out of them than a similarly specced PC build. Devs just can't sit there and optimise for every possible rig setup. It's impossible. That's why "close enough is good enough."
When a dev works on a PS5, he has to use their API. It's a totally different environment and they can dig down deeper into it.
And if the PS5 is the lead platform? Forget about it.
This is why we have seen that games performing better on the PS5 than on the XSX were also pretty poorly optimised on PC.

Got to give Sony their props.
The work they put into their development environment has paid off big time.
 

BlackTron

Gold Member
I think it would be adequate to leave the stage to the actual chief software engineer of PS5 for a moment:

I think this pretty much confirms how big of a role the often overlooked Cache Scrubbers (will) play this generation in the context of respective console performance.

I think this is a victory for the efficiency of a dedicated game console that I thought we were (disappointingly) losing as hardware gets more samey and shares architecture. But PS5 is punching above its weight with intelligent engineering.

Coincidentally the Series X is channeling the same energy as the first Xbox, brute forcing it.
 
As far as we know, there's not a single engine being used by Sony FP studios that is as advanced or as good-looking as UE5. And, cynically, I don't think we'll see any developed to fully take advantage of PS5's architecture, with the later PC ports being kept in mind.
I'd wager a guess that FF16 will be a taste of things to come. The game is PS5 exclusive and as far as I know they haven't begun working on a PC/Xbox port. The game looks absolutely stunning and really makes you wonder whether what the producers said about the game needing a $2K-equivalent PC build to run has some merit behind it.

[Screenshots: FFXVI landscapes]
 

onQ123

Member
I think this is a victory for the efficiency of a dedicated game console that I thought we were (disappointingly) losing as hardware gets more samey and shares architecture. But PS5 is punching above its weight with intelligent engineering.

Coincidentally the Series X is channeling the same energy as the first Xbox, brute forcing it.
Actually, PS5 is the more brute-force one with its higher clocks, while it's the Xbox Series X that shines when devs optimize for the hardware.
 

bender

What time is it?
Sony learnt its lesson from the PS3. When Mark Cerny took over, he pushed for the PS4 console to be as easy to program as possible.
In fact, during his Road to PS5 talk he led with this priority at the beginning of the talk.
Time to triangle: reducing the time required for developers to get to grips with exploiting the hardware.
He put up the following time metrics for each console.
PS1 : 1-2 months
PS2 : 3-6 months
PS3 : 6-12 months
PS4 : 1-2 months

With the PS5 they brought that down even further, to less than 1 month.
They did this by keeping the development environment, libraries and tools exactly the same.
The PS4 Pro started this design philosophy for Sony. It used a butterfly GPU design, which helped keep BC simple and the development environment the same.
With the Xbox One X we saw a completely different mid-gen refresh approach.
Compared to the Xbox One, the XOX had more RAM, a different type of RAM, and a GPU that didn't need to be a butterfly design to keep BC.
This is because MS has a different API design that allows for flexibility. A game developed for Xbox needs to work on all types of PC setups. It needs to be widespread. This has the benefit of not keeping Xbox console development hinged on certain hardware configurations to keep BC.
Remember, every strength is also a weakness.
This gives Xbox far better BC abilities than PlayStation, but it also means that games on DX don't get optimised for every possible PC setup, and the Xbox is just another one of those possible PC configurations.

This gen saw the following.
The PS5 kept the same GPU CU count as the PS4 Pro.
The PS5 kept the same API as the PS4 Pro.
The PS5 kept the same development environment, tools and libraries as the PS4.
During the PS4 generation developers got the absolute most out of what that console was capable of, meaning they jumped straight into the PS5 with similar abilities to get the most out of the hardware.

On the other hand, the Xbox Series had a new API. It has a totally different hardware setup from the XO. It had all-new tools and libraries for developers to use.

This direction from Sony gave it an absolute kick-start into the new generation.

It has nothing to do with RAM setups, nothing to do with TFLOPs or Cache Scrubbers. It purely comes down to a better development environment.

Good post but it has everything to do with Cache Scrubbers.
 

ChiefDada

Gold Member
I'd wager a guess that FF16 will be a taste of things to come. The game is PS5 exclusive and as far as I know they haven't begun working on a PC/Xbox port. The game looks absolutely stunning and really makes you wonder whether what the producers said about the game needing a $2K-equivalent PC build to run has some merit behind it.

[Screenshots: FFXVI landscapes]

 
There is no subject matter to discuss.

There's no analysis.

As I said, the OP doesn't know what he's saying but acting like he does hoping people like you believe there's knowledge and technical detail in the post, where there isn't.

You are talking about "hard" data that doesn't exist in the way you think, because you took the bait.
Share your insight instead of being a critic.
 