Yeah, the consoles are much more differentiated from each other in the coming generation. It's really hard not to just compare paper specs, because that's what the current generation was judged on, with the Xbox One and PS4 being much less customized. In the coming gen, with the addition of SSDs, asset streaming has become paramount, and while Sony has been boasting about their SSD bandwidth, it seems Microsoft has not only bridged the gap through software but put further distance between the Series X and the PS5.
The biggest game changer of the new consoles seems to me to be Sampler Feedback Streaming, as it's claimed to deliver a 2-3x multiplier on I/O bandwidth and memory. I've seen that reiterated many times by Microsoft, and they seem pretty confident in the statement. That means the Series X could effectively transfer and store game assets that would otherwise have taken 20-30GB of space in 10GB of RAM, with an effective transfer rate of 4.8-7.2 GB/s raw and 9.6-14.4 GB/s compressed. (Please note it's not literally going to hit these speeds and sizes; that's just how much equivalent data a system without Sampler Feedback Streaming would have to move.)
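To make the "effective" figures above concrete, here's the arithmetic as a tiny sketch. The 2.4 GB/s raw rate, 4.8 GB/s compressed rate, and 10GB GPU-optimal RAM pool are Microsoft's published Series X specs; the 2-3x multiplier is their SFS claim. Everything here is just that multiplication spelled out, not any actual implementation.

```python
# Toy calculation of the "effective" SFS figures quoted above.
# Base figures are Microsoft's published Series X specs; the 2-3x
# multiplier is their claimed SFS efficiency gain.

RAW_BANDWIDTH_GBPS = 2.4         # SSD raw (uncompressed) transfer rate
COMPRESSED_BANDWIDTH_GBPS = 4.8  # with hardware decompression
GPU_OPTIMAL_RAM_GB = 10.0        # fast GPU-optimal memory pool

def effective(value, multiplier):
    """Scale a raw figure by the claimed SFS efficiency multiplier."""
    return value * multiplier

for m in (2, 3):
    print(f"{m}x multiplier:")
    print(f"  effective raw bandwidth:        {effective(RAW_BANDWIDTH_GBPS, m):.1f} GB/s")
    print(f"  effective compressed bandwidth: {effective(COMPRESSED_BANDWIDTH_GBPS, m):.1f} GB/s")
    print(f"  effective texture RAM:          {effective(GPU_OPTIMAL_RAM_GB, m):.0f} GB")
```

That's where the 4.8-7.2 GB/s raw, 9.6-14.4 GB/s compressed, and 20-30GB-in-10GB numbers come from.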
Combine that with the ultra-low latency and the speculated SSG-style memory-paging storage controller, and the ceiling for what to expect from the Series X becomes much higher.
Exactly. At first even I was thinking the two approaches to I/O would be apples-to-apples, but seeing the divergence highlights where their priorities are. And they're both equally capable for what they each seek to achieve in cutting bottlenecks out of the data pipeline.
The frustrating part is that people are still generally comparing the two as if they're trying to do the exact same thing, and just going with the paper specs. It's even led some people to think MS just slapped an SSD in the system and called it a day, which is a pretty ludicrous idea.
The way I see the solutions so far: Sony's still favors raw bandwidth peaks, obviously. Their answer to the I/O bottleneck seems focused on being indiscriminate about data types, simply moving any type of data from storage to RAM, and replacing it, as quickly as today's NAND market allows. I still think their absolute peak (22 GB/s) is for very specific data, video files for example (which you can compress at very high ratios with no discernible quality loss), but it's still a really commendable peak. Their solution is also much more centrally focused on a single platform, which means it'll be a lot less scalable going into the future.
MS's solution doesn't have Sony's bandwidth figures, but they've seemingly done a lot of very specific research on actual texture usage and have built XvA around that, including SFS. Their solution seems focused more on cutting down latency, which helps with latency-critical tasks, so they may well come out ahead of Sony's approach there. Rather than moving as much data in and out of the system as quickly as possible, MS wants something focused on determining what specific data is actually needed, and having the low latency to pull in that specific data in as small a prefetch window as possible. They've even gone as far as implementing custom hardware in the GPU for this. And their approach is focused much more on scalability across a range of SSD configurations that meet some minimum specification, given that they'll be implementing XvA (if not all of it, then at least most parts of it) in the PC, server, mobile, etc. spaces.
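The "pull only the specific data that's needed" idea can be sketched as a toy model. To be clear, this is purely illustrative: the real SFS works at the GPU/driver level with hardware feedback maps, and every name below (ToyTextureStreamer, on_sampler_feedback, the simulated SSD dict) is invented for this sketch, not part of any actual XvA or DirectX API.

```python
# Illustrative toy model of sampler-feedback-driven streaming: the GPU
# reports which texture tiles it actually sampled, and only those tiles
# get pulled from storage, instead of loading whole textures/mip levels.
# All names here are hypothetical; this is not the real XvA/SFS API.

TILE_SIZE_KB = 64  # tiled resources commonly use 64 KB tiles

class ToyTextureStreamer:
    def __init__(self, storage):
        self.storage = storage   # simulated SSD: {(texture, tile_index): bytes}
        self.resident = {}       # tiles currently loaded into "RAM"

    def on_sampler_feedback(self, sampled_tiles):
        """Load only the tiles the GPU reported sampling this frame."""
        for key in sampled_tiles:
            if key not in self.resident:
                self.resident[key] = self.storage[key]
        return len(self.resident) * TILE_SIZE_KB  # resident footprint in KB

# Simulated SSD holding one texture split into 16 tiles (1 MB total).
ssd = {("rock_albedo", i): b"\x00" * (TILE_SIZE_KB * 1024) for i in range(16)}
streamer = ToyTextureStreamer(ssd)

# Feedback says shaders only touched 4 of the 16 tiles this frame,
# so only a quarter of the texture ever leaves storage.
footprint_kb = streamer.on_sampler_feedback([("rock_albedo", i) for i in range(4)])
print(footprint_kb)  # 256 KB resident vs 1024 KB for the whole texture
```

The point of the sketch is just the ratio: demand-driven tile loading moves a fraction of the naive data, which is exactly why latency (how fast you can react to feedback) matters as much as peak bandwidth.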
So both approaches are resolving virtually all of the current I/O bottlenecks, but with different priorities and different approaches that play to the strengths and lineage of their respective companies. You can compare them on paper specs if you'd like, but that misses the big picture and ignores the reality that there are always multiple ways to solve the same problem.
Came across this from someone on Era (don't kill me xD) who linked what they feel might be the kind of implementation MS is using with XvA.
Storage Performance Development Kit
This is just something that poster figured could be an implementation MS is taking, but it'd make sense that they're using at least part of it as inspiration. It's actually really interesting to consider that both MS and Sony are taking inspiration from the server/data center/business markets for addressing I/O throughput; you can see echoes of Data Processing Units (DPUs) in both Sony's and MS's I/O blocks (though Sony's seems to be contained wholesale in a single silicon block, which visually matches a DPU more closely; granted, I'm sure there are many DPU setups that aren't single chips either).
Just where do you get this from? What's pointing to it?
Did you watch Road to PS5 and see the part about the 100-times-faster SSD translating into 100-times-faster usable speed?
PS5 has looked to eliminate all bottlenecks and latency is a bottleneck.
It has everything right there in the I/O complex, with components placed as close together as possible for as little latency as possible.
I'd love to hear how the Xbox, despite processing its I/O on the CPU, will have less latency... seems to go against physics...
I never said Sony hasn't addressed latency. But it's very feasible that MS has focused very specifically on latency and therefore could have the edge there. Latency and bandwidth are not one and the same.
One thing that might hurt Sony on latency is that they're using slower NAND modules in the first place, while MS is using faster ones, as measured in MT/s. The faster NAND modules usually also have better read-access latency on the first 4KB (or so) of data. Other factors influence latency too, of course.
So the PS5 can certainly improve dramatically on I/O bottlenecks in terms of both bandwidth and latency, yet still have higher latency than MS's solution. Also, FWIW, the Series X has custom hardware for a lot of the I/O stack; only 1/10th of a core (very likely the OS core) handles management of the I/O stack.
If you'd like a bit more insight from my POV just read the other part of this post right above yours.
I think there's a huge misconception about how the Xbox Velocity Architecture works, especially that "multiplier" part, so to make things simple, let's use an image:
In this picture we can see only three sides of the Rubik's cube, BUT the system still loads and uses textures for all six of them, wasting memory space and bandwidth. What MS is doing with XvA, specifically the SFS component, is making the system load and use only the textures for the three visible sides, effectively cutting the data size in half, which also halves the required bandwidth.

Scaling that to a whole scene: instead of 6GB, the same scene might use just 2-3GB, so instead of 2.4GB/s they can achieve the exact same on-screen result while using only 0.8-1.2GB/s of SSD bandwidth. That creates headroom to add 3-4GB worth of additional objects/textures within the previous 6GB footprint at the full 2.4GB/s, which would otherwise have needed 18GB of RAM and 7.2GB/s of bandwidth. Long story short: they can achieve the same results with only a half to a third of the resources.
We can probably even say that XvA might be a storage/I-O equivalent of Tile-Based Deferred Rendering. Here's the Wikipedia description:
"Tiled rendering is the process of subdividing a computer graphics image by a regular grid in optical space and rendering each section of the grid, or tile, separately. The advantage to this design is that the amount of memory and bandwidth is reduced compared to immediate mode rendering systems that draw the entire frame at once. This has made tile rendering systems particularly common for low-power handheld device use. Tiled rendering is sometimes known as a 'sort middle' architecture, because it performs the sorting of the geometry in the middle of the graphics pipeline instead of near the end."
I'm not saying XvA literally implements TBDR in the SSD I/O, just that it shares some of the same inspirations. The main one is spending processing power only on what is actually seen, cutting out unnecessary work by handling geometry sorting earlier in the pipeline.
So what does that sound a lot like? Well, it sounds a lot like XvA and things like SFS: working with only very specific textures, or just specific portions of textures; focusing on streaming in the immediate (or near-immediate) texture data, and so on. Knowing what to pull and when, just in time, saves on required bandwidth along the way and is beneficial even for lower-end SSDs (in terms of pure specs like GB/s, we can say MS's SSD is lower-end than Sony's).
A really interesting approach, all told. Looking forward to seeing it in practice.