
Exploring The Complications Of Series X's Memory Configuration & Performance

How much are YOU willing to say recent Xbox multiplat perf is affected by the memory setup?

  • Very: 18 votes (9.8%)
  • Mostly: 32 votes (17.5%)
  • So/so: 40 votes (21.9%)
  • Not really: 41 votes (22.4%)
  • None: 52 votes (28.4%)
  • Total voters: 183
  • Poll closed.
Status
Not open for further replies.


So, I'm not going to get too deep into the performance results between Series X and PS5 over the past few game releases, or necessarily claim that this thread explains "why" the performance deltas have been popping up outside of Microsoft's favor. In most cases, I think we can take perspectives like NXGamer's, which focus on API differences, and Digital Foundry's insights (specifically from Alex) on how MS's platform-agnostic API approach in the GDK might prevent specific optimizations from being applied, because it can be a crapshoot for devs to figure out which option is best for their specific game (this was referenced in relation to Atomic Heart).

However, I do think it's worth talking a bit about Series X's memory situation, because I think it plays a part in some of the performance issues that pop up. As we know, Series X has 16 GB of GDDR6 memory, but "split" into a 10 GB pool running at 560 GB/s and a 6 GB pool running at 336 GB/s. The 10 GB pool is referred to as "GPU-optimized", while the 6 GB pool is partially reserved for the OS, with the remaining 3.5 GB of that block being used for CPU and audio data.
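
As a quick sanity check on where those two bandwidth figures come from, here's a back-of-the-envelope sketch in Python. It assumes the widely reported chip layout (ten 14 Gbps GDDR6 chips on a 320-bit bus: six 2 GB chips plus four 1 GB chips); treat it as an illustration, not anything out of the GDK.

```python
# Illustrative only: derive the two pool bandwidths from the reported chip layout.
PIN_SPEED_GBPS = 14   # Gbps per pin for the GDDR6 chips
BITS_PER_CHIP = 32    # each chip sits on its own 32-bit channel

def pool_bandwidth(chips: int) -> float:
    """Peak bandwidth (GB/s) of an access striped across `chips` channels."""
    return chips * BITS_PER_CHIP * PIN_SPEED_GBPS / 8

# The first 10 GB is striped across all ten chips; the upper 6 GB only exists
# on the six 2 GB chips, so accesses to it can only be striped six-wide.
print(pool_bandwidth(10))  # 560.0 GB/s -> the "GPU-optimized" 10 GB pool
print(pool_bandwidth(6))   # 336.0 GB/s -> the standard 6 GB pool
```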

Some clarification: the Series X memory is not physically "split". It's not one type of memory for the GPU and a completely different type for the CPU, as was the case for systems like the PS3, or virtually all pre-7th-gen consoles (outside of exceptions like the N64, which had its own issues with poor memory latency). It is also not "split" in the sense that the 10 GB and 6 GB are treated as separate virtual address spaces. AFAIK, game applications see the two as one virtual address space, though I suspect the kernel helps out there, since the two pools run at different bandwidths and therefore can't be considered fully hUMA (Heterogeneous Unified Memory Architecture) in the same way the PS5's memory setup is.

But now come the more interesting parts. The thing about Series X's advertised memory bandwidths is that they are only achieved if THAT part of the total memory pool is the only one being accessed for the duration of a second. If, for a given slice of frame time, a game has to access data for the CPU or audio, then that's a portion of the second during which the 10 GB pool is NOT being accessed, but rather the slower 336 GB/s portion of memory. Systems like the PS5 have to deal with this kind of bus contention as well, since all the components share the same unified memory. However, contention becomes more of an issue for Series X because of its memory configuration: since data for the GPU needs to sit within a specific physical range (otherwise the GPU can't leverage the peak 560 GB/s bandwidth), it creates a complication in memory access that a fully hUMA design (where the bandwidth is uniform across the entire memory pool) isn't afflicted with.

This alone means that, since very few processes in an actual game are 100% GPU-bound for a given run of consecutive frames over the course of a second, it's extremely rare that Series X ever sustains the full 560 GB/s bandwidth of the 10 GB pool in practice. For example, say that over a given second the GPU is accessing memory 66% of the time, the CPU 33% of the time, and audio 1% of the time. In that case, effective bandwidth would theoretically be around (560 × 2/3) + (336 × 1/3) ≈ 485 GB/s. That isn't taking into account any kernel or OS-side management of the memory; I can't profess to know much there, but since (outside of a situation I'll describe in a bit) the Series X doesn't need to copy data from the 6 GB pool to the 10 GB pool (or vice versa), you aren't going to run into the type of scenario you see on PC. So whatever kernel/OS overhead there is for managing this setup should be minimal, and for this example effective bandwidth would sit around the 485 GB/s mark.
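
Here's that blending written out as a tiny Python sketch. The 2/3 vs 1/3 split is an illustrative assumption, not a measured figure.

```python
FAST_POOL_GBPS = 560.0  # 10 GB "GPU-optimized" pool
SLOW_POOL_GBPS = 336.0  # 6 GB standard pool

def effective_bandwidth(fast_fraction: float) -> float:
    """Time-weighted average bandwidth over one second, where the bus spends
    `fast_fraction` of the time serving the 10 GB pool and the rest serving
    the 6 GB pool (CPU, audio, OS traffic)."""
    return fast_fraction * FAST_POOL_GBPS + (1.0 - fast_fraction) * SLOW_POOL_GBPS

print(round(effective_bandwidth(2 / 3)))  # ~485 GB/s for the example above
```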

As you can see, though, that mixed result is notably less than the 560 GB/s of the 10 GB GPU-optimized pool; in fact it isn't much larger than PS5's 448 GB/s. Add to that the fact that, on PS5, features like the cache scrubbers reduce how often the GPU needs to hit main memory at all, and that there is wholly dedicated hardware for enforcing cache coherency (IIRC the Series X has to lean on the CPU for at least a good portion of this, which nullifies much of its minuscule MHz advantage CPU-side anyway), and it gets easy to picture how these two systems perform relatively on par in spite of some larger paper-spec advantages for Series X.

What happens, though, when there's a need for more than 10 GB of graphics data? This is the kind of scenario where Series X's memory setup really shows its quirks, IMO. There are basically two options, both of which require big compromises. Devs can either reserve a portion of the available 3.5 GB in the 6 GB pool for overflow graphics data, copying it over (and overwriting) a portion of what's in the 10 GB pool (leaving less room for non-graphics data), or they can pull that data from the SSD, which is much slower and has higher latency, and would still require a read and copy into memory, eating up a ton of cycles. Neither of these is optimal, but they become more likely the moment a game needs more than 10 GB of graphics data. Data can be compressed in memory and then decompressed by the GPU when it's accessed, which buys some room, but that isn't exclusive to Series X; in equivalent games across it and PS5, the latter still has the advantage of fully unified memory with uniform bandwidth across the entire pool.
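
To get a rough feel for why the SSD fallback is so much worse than even the slow pool, here's a bandwidth-only sketch using Microsoft's published throughput figures (2.4 GB/s raw, roughly 4.8 GB/s with hardware decompression). It ignores access latency, which only makes the SSD look better than it really is.

```python
SSD_RAW_GBPS        = 2.4    # published raw SSD throughput
SSD_COMPRESSED_GBPS = 4.8    # published typical figure with decompression
SLOW_POOL_GBPS      = 336.0  # the 6 GB memory pool

def ms_to_move(gigabytes: float, gbps: float) -> float:
    """Milliseconds to move `gigabytes` of data at `gbps` GB/s (bandwidth only)."""
    return gigabytes / gbps * 1000

print(round(ms_to_move(1.0, SSD_RAW_GBPS)))         # ~417 ms per GB from SSD (raw)
print(round(ms_to_move(1.0, SSD_COMPRESSED_GBPS)))  # ~208 ms per GB from SSD (compressed)
print(round(ms_to_move(1.0, SLOW_POOL_GBPS), 1))    # ~3.0 ms per GB from the slow pool
```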

Taking the earlier example and applying it here, in a situation where the GPU needs to access 11 GB of graphics data instead of 10, it has to place 1 GB of it in the 6 GB pool, which reduces the capacity for CPU and audio data to 2.5 GB. Take the same breakdown as before, where over a given second the GPU accesses memory 66% of the time, the CPU 33% and audio 1%, but now the GPU spends 25% of its own access time in the slow pool fetching that extra 1 GB. The fast pool is then only being served for just under half the second (0.66 × 0.75 = 0.495), while the slow pool is served for the rest (the GPU's spillover plus all the CPU and audio traffic), so effective bandwidth works out to roughly (0.495 × 560) + (0.505 × 336) ≈ 447 GB/s.
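
Extending the earlier sketch to the spillover case (all fractions are still illustrative assumptions):

```python
FAST_POOL_GBPS = 560.0
SLOW_POOL_GBPS = 336.0

def effective_bandwidth_with_spill(gpu_share: float, gpu_slow_share: float) -> float:
    """gpu_share: fraction of the second the GPU owns the bus.
    gpu_slow_share: fraction of the GPU's own access time spent in the 6 GB pool.
    All non-GPU traffic (CPU, audio, OS) is assumed to hit the slow pool."""
    fast_time = gpu_share * (1.0 - gpu_slow_share)
    slow_time = 1.0 - fast_time
    return fast_time * FAST_POOL_GBPS + slow_time * SLOW_POOL_GBPS

print(round(effective_bandwidth_with_spill(0.66, 0.25)))  # ~447 GB/s
```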

Either way, as you can see, that's a scenario where, due to needing more than 10 GB of graphics data, the total contention on the bus pulls overall effective bandwidth down considerably: that ~447 GB/s is essentially right at PS5's flat 448 GB/s. And again, PS5 has the benefit of cache scrubbers, which can help (to a decent degree, though it varies wildly) reduce the GPU's need to access RAM at all, and its beefier offloaded hardware for enforcing cache coherency helps it come closer to maximizing total system bandwidth (talking RAM and caches here) as well. For any amount of GPU-bound data beyond the 10 GB threshold, if that data sits in the 6 GB pool, the GPU is bottlenecked by that pool's maximum bandwidth, and the bottleneck compounds the longer the GPU needs that particular chunk of data outside the 10 GB portion.

As cross-gen games fade out and games become more CPU-heavy in terms of the data that needs processing, this might present a small problem for Series X relative to PS5 as the generation goes on. But it's the increasing likelihood of more data being needed for the GPU which presents the bigger problem for Series X going forward. Technically speaking, Series X DOES have a compute advantage over PS5, but the problem its GPU could face isn't really tied to processing, it's tied to data capacity. ANY GPU-bound data that has to sit outside the 10 GB pool will drag down total system bandwidth in proportion to how often the GPU needs that additional information. Whether that's graphics data or AI or logic or physics (for the GPU to process via GPGPU), it doesn't change the complication.

There's only so much you can compress graphics data for the GPU to decompress when accessing it from memory; neither MS nor Sony can offer compression on that front beyond what AMD's RDNA 2 hardware allows, and that benefit is shared by both PS5 and Series X, so entertaining it as a solution to the capacity problem (or rather, the capacity configuration problem) on Series X isn't going to work. There isn't really anything Microsoft could do to fix this outside of giving the Series X a uniform 20 GB of RAM. Even if that 20 GB were kneecapped to the same bandwidth as the current Series X, all of the problems I present in this post would go away. But they would still need games to be developed with the 16 GB setup in mind, effectively nullifying that approach as a solution.

Microsoft's other option would be to go with faster RAM modules while keeping the same 16 GB arrangement they currently have. The problems would persist in principle, but their actual impact on bandwidth would be masked by the sheer raw bandwidth increase. Again, Microsoft would still need games to be programmed with the current setup in mind, but this approach would be cheaper than increasing capacity to 20 GB, and while system components would access memory the same way they do now, the bandwidth uptick would benefit total system performance on the memory side. Just as an example, the scenario I gave earlier resulting in ~447 GB/s would automatically rise to roughly 511 GB/s if MS fitted Series X with 16 Gbps chips (for a peak of 640 GB/s on the 10 GB pool and 384 GB/s on the 6 GB pool, versus the current 560/336).
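
Reusing the spillover sketch with hypothetical 16 Gbps chips (again, purely illustrative):

```python
def effective_bandwidth_16gbps(gpu_share: float, gpu_slow_share: float) -> float:
    """Same model as before, but with the pools scaled up for 16 Gbps chips:
    320-bit * 16 Gbps = 640 GB/s fast pool, 192-bit * 16 Gbps = 384 GB/s slow pool."""
    fast_time = gpu_share * (1.0 - gpu_slow_share)
    slow_time = 1.0 - fast_time
    return fast_time * 640.0 + slow_time * 384.0

print(round(effective_bandwidth_16gbps(0.66, 0.25)))  # ~511 GB/s
```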

Again, I am NOT saying that the current performance issues with games like Atomic Heart, Hogwarts, Wild Hearts etc. on Series X are due to the memory setup. What I'm doing here is illustrating the issues which either already arise, or could arise with more multiplat releases going forward, as a result of how the Series X's memory setup is designed and functions. These concerns become even more pronounced when games need more than 10 GB of GPU-bound data (and for long stretches of access time, cycle-wise), a situation which will inevitably impact Series X in a way it won't impact PS5, due to Series X's memory setup.

So hopefully this serves as an informative post for those wanting an explanation for any future multiplat results where Series X lands lower than expected (especially if lower than PS5), and where the RAM setup can be identified as a possible culprit (either on its own or together with other things like API overhead, tools efficiency, etc.). This isn't meant to instigate silly console wars; that said, seeing people like Colteastwood continue to propagate FUD about both consoles online is part of the reason I wanted to write this up. I did hint at other design differences between the two systems which benefit PS5 in particular, such as the cache scrubbers and the more robust enforcement of cache coherency, but I kept the focus on something both systems actually share: they both use GDDR6, at the same capacity, with the same per-chip bandwidth. SSD I/O could also factor into performance differences as the generation goes on, but that is a whole other topic (and in the case of most multiplats, at least for things like load times, the two systems have been very close).

Anyway, if there are other tech insights on the PS5 and Series systems you all would want to share to add on top of this, whether to explain what you feel could create probable performance advantages/disadvantages for PlayStation OR Xbox, feel free to share them. Just try to be as accurate as possible and, please, no FUD. There's enough of that on Twitter from influencers 😂
 

Ezekiel_

Banned
If a game only uses 10 GB, then I'm assuming there isn't any issue. But if it's trying to use more, then yeah that's going to be a headache. And I'm sure it's even more of a headache for the weaker S.

PS4 and PS5 have a unified pool of RAM for this reason. And PS5 made strides in removing bottlenecks for devs, not adding new ones.
 
Last edited:

K' Dash

Member
To really give you an answer I would have to check the tools, APIs and an extensive amount of documentation; if you're not a software developer, I recommend you just go make a bait console war thread.

You're expecting people to provide an answer to a bunch of gibberish and hypotheticals… here, in a place full of 12-year-olds and man-children.

Lock this shit.
 

feynoob

Banned
Why the hell would you go in a thread in a forum and complain you don't want to read the op. What a weird thing.
Because most of us are dumb and can't handle long walls of text.
We need short paragraphs that can summarize the entire shit.

This stuff is for college literature, not for gamers.
 

nowhat

Gold Member
The non-unified memory architecture may have something to do with it, but you gotta remember - it's also the tools.

While the non-physical tools will show up in threads like this, I've received leaked footage of the tools really arriving this time, for realsies:

01-Hammers.gif
 
Last edited:

March Climber

Gold Member
Why the hell would you go in a thread in a forum and complain you don't want to read the op. What a weird thing.
I would agree with this if it were 2003. 2023 posters don’t have that kind of patience, which is why the best option for them is text to speech.
 

LordOfChaos

Member


More seriously though: we just can't know out here beyond speculation; these are complicated interactions between APIs, operating systems, hardware configurations, developer effort and lead platforms. I think the best cases clearly show that you can work around it. I'm not sure it's provable that the worst cases are due to it, rather than to a less mature API/dev stack or not being the lead platform.
 
I think the bottlenecks Mark Cerny worked to fix, handling data with less overhead on the CPU and RAM, will make a big difference in the second half of the gen. I think these systems will be very similar in performance, but the bottlenecks will start to show later on.
 
Last edited:
 

Riky

$MSFT
"In terms of how the memory is allocated, games get a total of 13.5GB in total, which encompasses all 10GB of GPU optimal memory and 3.5GB of standard memory. This leaves 2.5GB of GDDR6 memory from the slower pool for the operating system and the front-end shell. From Microsoft's perspective, it is still a unified memory system, even if performance can vary. "In conversations with developers, it's typically easy for games to more than fill up their standard memory quota with CPU, audio data, stack data, and executable data, script data, and developers like such a trade-off when it gives them more potential bandwidth,"

You won't need more than 10 GB for the GPU; first-party games will leverage Sampler Feedback Streaming, which will have a huge impact.



Therefore Series X has a much higher peak bandwidth and, according to DF, more RAM available to games. As games get more compute-heavy this gen, the wider architecture will prove its worth, as in the PC GPU space where top-range cards are wider rather than narrower and faster-clocked.
We can see with Forza Motorsport that they are already pushing ahead with 4K 60fps and RT on track. It's just the start.
 

DenchDeckard

Moderated wildly
10 GB of video memory at the speed the Series X has is probably more than enough for a 1440p image upscaled to 4K at 120fps... even native 4K without ray tracing...

...but I don't really know shit so I look forward to the armchair developers who clearly have a bias telling me how shit it is.
 
Last edited:

Hobbygaming

has been asked to post in 'Grounded' mode.
I haven't trusted MS with hardware specs since last gen, when they claimed the Xbox One had more bandwidth than the PS4.

They said the Xbox One's ESRAM and DDR3 bandwidth could be combined to get 204 GB/s, except you can't sustain that speed every cycle with the small amount of ESRAM they put in the console.
 

dottme

Member
Can developers decide or control where the data goes? Because if you can't control it, and you're getting random differences in latency impacting your engine, I can imagine a lot of developers will give up the additional memory to make testing the engine easier.
It's already complicated enough that you don't need a more complicated memory setup to manage. I wonder if only MS games might actually end up well optimized for XSX, especially if the sales numbers are behind the PS5.
 

SlimySnake

Flashless at the Golden Globes
Digital Foundry actually spoke with devs about this performance advantage and John said devs themselves were baffled by it. Some devs think it's the DirectX overhead, but there was no consensus.

Alex speculated that everyone is still using old DXR APIs. He also speculated that because the PS5 uses new RT APIs, devs had to do MORE work to get things up and running and ended up optimizing those PS5 APIs more compared to the DXR APIs. Odd reasoning, considering devs have had 5 years of experience with RT APIs on PC, and it also doesn't reconcile with the fact that multiplatform game dev is done on PC first and then ported to consoles.

What's important is what devs have told them. Which is that they don't fucking know. PS5 literally has some secret sauce in it that is making it perform better than its specs. Memory management was not brought up. In fact, John said no devs are complaining about the Xbox or the PS5.

Sometimes it just boils down to Mark Cerny am God.

Timestamped:
 
Last edited:
Sampler Feedback Streaming (SFS) was developed to combat the very issues described by the OP. The problem is that devs just aren't using it. Along with a number of other next-gen technologies that are unused or under-utilised (mesh shaders, VRS (some titles now use this), DirectML, etc.), this has contributed to a rather lacklustre generation thus far. Ultimately, Microsoft is responsible. First parties should be encouraged (or perhaps mandated) to utilise the more advanced features possible on the Series generation devices. As long as we are still living in this cross-gen hellscape, we sadly won't see what the Xbox is capable of. Only basic increases to resolution and frame rate and a couple of extra effects seem to be the difference since last gen.

Phil Spencer, though better than his predecessors in many ways, has given us no system-selling games whatsoever. If we look at first-party titles released under his tenure, the situation is laughably bad. The reality is that Xbox has been woefully mismanaged since the early days, i.e. the OG Xbox and 360; since then it has been a joke. It's sad as it's us that lose out.
 

Lysandros

Member
Well, luckily we do have very concrete evidence from an actual developer pointing to this. Shin'en Multimedia, the developer behind The Touryst, cited the difference in memory architectures as one of the three reasons they managed to achieve native 8K resolution on PS5 compared to XSX's 6K (roughly a 78% difference in pixel count). The others were the higher GPU clock frequency (and thus higher pixel fill rate and L1 cache bandwidth, among other things) and, lastly, the native API. This could only point to a meaningfully higher real-world bandwidth throughput being required for the large resolution increase, at least for their engine/game:

"Shin'en tells us that in the case of its engine, the increase to clock frequencies and the difference in memory set-up makes the difference. Beyond this, rather than just porting the PS4 version to PS5, Shin'en rewrote the engine to take advantage of PS5's low-level graphics APIs".


Beyond this, I think we should analyze the bandwidth question in the context of (real-world) whole-system bandwidth (which is what matters, after all) along the data path, including all the related contributors in order: storage I/O bandwidth => V/RAM => GPU caches/scrubbers (=> processors/CUs), rather than limiting our focus to just V/RAM.
 
Last edited:

Hobbygaming

has been asked to post in 'Grounded' mode.
Some say that devs aren't using mesh shaders, SFS and other tools, but who's to say the devs haven't already tried experimenting with these and found that the real-world results weren't as good as other methods?



Cerny the God
 
Last edited:

nowhat

Gold Member
VRS (some titles now use this)
Oh yeah, Dead Space. While the software implementation on PS5 (at launch) looked terrible, I don't think anyone can say the XSX hardware implementation looked particularly good either. Combined with eye tracking in VR (i.e. "foveated rendering") it makes sense, but if you're doing it on a display where the user can look anywhere without the game knowing about it, I'm not sure it'll ever be the silver bullet some make it out to be. Unless the drop in shading quality is minuscule, and then I don't think the performance gains would be that substantial.
 

ChiefDada

Gold Member
Or maybe, just maybe the PS5 is just as powerful as the XBSX.

Imo, there is nothing wrong with the XBSX hardware or software.

I think we let teraflops take over and stop us from appreciating how well engineered these consoles are.

I think we really need to start defining which aspect we're referring to when we say one platform is more powerful than another. XSX is more powerful than PS5 from a max theoretical compute perspective, but so many developers have been saying it's really difficult to reach that max due to other significant bottlenecks in the system. PS5 is more powerful from a speed, efficiency, and utilization perspective.
 

M1chl

Currently Gif and Meme Champion
It is a star configuration of the memory, not a ring, thus no.

You can't really add the numbers like that; the whole "maximum bandwidth" is divided across the number of chips, and as far as I know one chip (or maybe two?) is connected with "fewer wires". They aren't chained one after the other; rather it is SoC <=> chip1, SoC <=> chip2, etc., and then SoC <=> chip8.

The performance troubles are more due to the fact that, to be competitive with Sony's API, you simply need:
1) more than a few MHz extra on the CPU, because of DirectX;
1.5) since a lot of the DirectX software architecture depends on feedback, and DirectX contains everything, not just graphics, a memory setup with higher-latency GDDR memory is simply not ideal;
2) more CUs are harder to keep fed in terms of latency, especially on engines which really weren't built for heavy parallelism rather than vertical scaling;
3) based on my own experience, it took MS two years to sort out the ESRAM vs GDDR5 situation with the X1X vs X1S. For games built on an SDK which has to accommodate both X1 and Series S|X, there are bound to be problems.

The solution is simple: let's finally put last gen to bed, and then we will see if there are other issues plaguing the X|S situation.

Last point: despite the drum-beating about a "balanced" architecture, it seems like PS5 really is that, a very well-engineered console. Xbox on the other hand was done mostly by AMD, and if you are monitoring the PC market, you know.

However, this stuff is way more complex than it may seem on paper.
 
Last edited:

ChiefDada

Gold Member
Is SFS the new 'savior' acronym?

I recall a verified developer on ResetEra describing SFS as just "ok", but he said it doesn't come close to matching PS5's I/O. He said PS5 did everything "significantly faster" without the need to optimize, in comparison to SX, and that this was by far the biggest difference between the consoles. Xbox fanatics are hearing these sentiments coming from developers and they still choose to ignore them.
 

Neo_game

Member
I remember there were some predictions regarding this as well. I think PS5's memory is the better solution, even though 448 GB/s is too low IMO. The 16 Gbps RAM from the GitHub leak would have made it 512 GB/s, which I guess was their target, but then Gamers Nexus made a video showing the RAM temps were slightly higher, and Sony decided to cheap out and settled for 448 GB/s. Since it is unified memory, devs don't have to bother. For SX I am sure they definitely wanted 20 GB at the full 560 GB/s; some people thought that going wider was some sort of genius move, whereas it was a clear compromise on their end as well. There was another school of thought over whether feeding 36 CUs with 448 GB/s is better than 52 CUs with 560. To me PS5 just seems more efficient.
 

yamaci17

Member
Is SFS the new 'savior' acronym?
Yup, it's a magical memory multiplier. Devs apparently have actual malice: they just do not use it. It magically multiplies effective memory, as if you could sampler-feedback every texture in every scene with perfect scaling.

Remember the cherry-picked VRS benchmarks where it was providing enormous performance benefits in CHERRY-picked examples? It is exactly the same.
Lo and behold, VRS: a meager 5-15% perf improvement for noticeable image quality degradation.

Sampler feedback will end up saving a funny 100-400 megs here and there and it will amount to nothing LMAO
 

Lysandros

Member
Or maybe, just maybe the PS5 is just as powerful as the XBSX.

Imo, there is nothing wrong with the XBSX hardware or software.

I think we let teraflops take over and stop us from appreciating how well engineered these consoles are.
That would be blasphemous; The Great Dictator could well order a public execution for it. With the public in question being ResetEra and Beyond3D members, that would not be a good way to go.
 

Thirty7ven

Banned
Whats important is what devs have told them. Which is that they dont fucking know. PS5 literally has some secret sauce in it that is making it perform better than its specs. Memory management was not brought up. In fact, John said no devs are complaining about the Xbox or the PS5.

Probably hard to get fair answers when DF insists on only asking questions about XSX underperforming relative to its specs, when maybe they could also ask devs what it is that the PS5 is doing well. Probably harder still to get devs to risk NDA infringement by spelling out what's wrong with the Xbox API or otherwise. Bad news stories spread like wildfire.

It’s a shame that DF continues on this path. Not unexpected though.
 
Last edited: