
Xbox Velocity Architecture - 100 GB is instantly accessible by the developer through a custom hardware decompression block

oldergamer

Member
SFS isn't compression. Compression is making data smaller with the plan of restoring it; you're going to make it bigger again so it's usable. You might lose a bit on the way, but you aren't getting the linear memory savings that are described for SFS (2-3x for both memory and storage). Compression is squeezing 4.8 GB of data into 2.4 GB. SFS is completely excluding textures it thinks won't come up.

"2.4 GB/s (Raw), 4.8 GB/s (Compressed, with custom hardware decompression block)"

That's the quote from the XSX web page. Nothing indicates SFS is in either number. SFS is not part of hardware decompression and isn't in the raw figure, as we know that's the base throughput.
I know that. That's what I was alluding to with the 4.8 GB/s number MS has posted.
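To put the distinction in numbers, here's a back-of-envelope sketch (the 2:1 ratio behind the 4.8 GB/s figure and the 2.5x SFS midpoint are assumptions, not official breakdowns):

```python
# Hypothetical illustration: compression multiplies throughput by shrinking
# the same data, while SFS multiplies it again by skipping unneeded data.

RAW_BANDWIDTH_GBPS = 2.4   # XSX raw SSD throughput (from the quoted spec)
COMPRESSION_RATIO = 2.0    # assumed 2:1 behind the 4.8 GB/s figure

# 2.4 GB/s of compressed reads expand to 4.8 GB/s of usable data.
effective_compressed = RAW_BANDWIDTH_GBPS * COMPRESSION_RATIO  # 4.8

# SFS reduces how much data is needed at all. At the midpoint of the
# claimed 2x-3x savings, 4.8 GB/s behaves like 12 GB/s of naive loading.
SFS_MULTIPLIER = 2.5       # assumed midpoint of the 2x-3x claim
effective_with_sfs = effective_compressed * SFS_MULTIPLIER     # 12.0

print(effective_compressed, effective_with_sfs)
```

The two multipliers stack because they act on different things: one on bytes moved, the other on bytes needed.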
 

Ascend

Member
Are you aware he is saying the advantages of it are already mostly there?
And why do you come to that conclusion? The Eurogamer article clearly stated that on the Xbox One X, memory utilization efficiency is low. Considering that games have been using PRT for a while, it's quite safe to assume that what they discovered on the Xbox One X was with PRT, not without.
When you use the sampler feedback to then load the required textures, rather than pre-loading partial textures, you get the advertised 2x - 3x efficiency increase.
As far as I'm aware, at least.
 
Last edited:
Hmm, I'm not sure I understand what that means. Using sampler feedback to trigger page reads? What does it achieve? Is this in the absence of a cache?

Texture sampling uses whatever texture MIP level is most appropriate and available in memory to determine a pixel's color. Software prior to Sampler Feedback has no idea what MIP level is actually used or needed; it's blind to what is happening with texture sampling. If you want to evict unneeded MIP levels from memory, you need to know what is needed, so developers use different techniques to try to guess. Sampler Feedback gives visibility into this information, so highly accurate data can be used to determine which portions of a texture to keep and which to evict. This increases the efficiency of the technique.
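A toy sketch of the eviction decision just described, assuming a feedback map that records the finest MIP level actually sampled per texture (names are illustrative; real sampler feedback is written by the GPU through D3D12, not a Python dict):

```python
# Toy residency planner: evict the finer MIP levels that the GPU never
# actually sampled, keep the sampled level and anything coarser.
# Convention: lower MIP number = finer detail.

def plan_residency(feedback, resident):
    """Decide which MIP levels to evict.

    feedback: {texture: finest MIP level the GPU actually sampled}
    resident: {texture: set of MIP levels currently in memory}
    Returns {texture: set of MIPs that can be evicted}.
    """
    evictions = {}
    for tex, mips in resident.items():
        needed = feedback.get(tex)
        if needed is None:
            # Never sampled this frame: everything is a candidate.
            evictions[tex] = set(mips)
        else:
            # Drop finer, unused MIPs; keep the sampled one and coarser.
            evictions[tex] = {m for m in mips if m < needed}
    return evictions

# The GPU only touched MIP 2 of "rock", so MIPs 0 and 1 can go.
print(plan_residency({"rock": 2}, {"rock": {0, 1, 2, 3}}))
```

Without the feedback data, the `needed` value would have to be guessed with heuristics, which is exactly the blindness described above.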

Are you aware he is saying the advantages of it are already mostly there?

If you are referring to Sampler Feedback vs Sampler Feedback Streaming, then you are partially right. I doubt there are any games right now that use this, since it is only a few months old, so "already mostly there" isn't quite true.
 

martino

Member
If you are referring to Sampler Feedback vs Sampler Feedback Streaming, then you are partially right. I doubt there are any games right now that use this, since it is only a few months old, so "already mostly there" isn't quite true.

The tweet talks about PRT.
 
Interesting post, btw. However, we should consider how performance per lane degrades when there is a single memory chip on a lane by itself. Xbox uses 4 lanes, and PS5 uses 12. Not sure how you would factor that into those numbers.

I would assume that 12 channels could potentially hit higher IOPS more often.
However, I'm not that familiar with how SSDs work.
Though upon further consideration, I doubt the IOPS would actually matter at this point. We're probably talking about >100k on both systems.
I don't think games make use of that much.
 
Last edited:
I'm sorry, I get frustrated at times when I see a genuinely interesting discussion getting constantly derailed by people who appear to lack any real understanding of the topic at hand. People like you.

Blocking and ignoring come free with every account on NeoGAF. I'm certainly not here to debate you on what you do or don't know.
 
The ability to do this does sound like something very new. Is this custom to the Xbox GPU?

That's a complicated question with two answers:

1) What I described was Sampler Feedback. We know more about that because it is in DX12 Ultimate, meaning it's on Nvidia RTX cards and will be in RDNA2 GPUs from AMD.

Because console GPUs are somewhat custom, we don't know if this will be on both the XsX and the PS5, or just the XsX.

Because of the marketing from both sides on this and the VRS feature (lots of claims on one side and lots of silence from the other), I'm inclined to bet that the PS5 doesn't have it. But that's just a guess, we don't really know.

and

2) The other issue that some people get hung up on is the difference between Sampler Feedback and Sampler Feedback Streaming. Here's all we know about that:
- Microsoft uses two distinct terms in their marketing and documentation: SF and SFS.
- SF is a big part of SFS, but not all of it (it is in the name after all).
- SFS appears to be a banner over a hardware/software feature (the SF part at least is hardware)
- The performance claim of 2X-3X is only seen in the SFS XBox marketing, not in the DX12 marketing.
- There are some folks on twitter vaguely alluding to some additional hardware features in SFS vs SF, not enough to really draw strong conclusions (it's twitter)
 
That's a complicated question with two answers:

1) What I described was Sampler Feedback. We know more about that because it is in DX12 Ultimate, meaning it's on Nvidia RTX cards and will be in RDNA2 GPUs from AMD.

Because console GPUs are somewhat custom, we don't know if this will be on both the XsX and the PS5, or just the XsX.

Because of the marketing from both sides on this and the VRS feature (lots of claims on one side and lots of silence from the other), I'm inclined to bet that the PS5 doesn't have it. But that's just a guess, we don't really know.

and

2) The other issue that some people get hung up on is the difference between Sampler Feedback and Sampler Feedback Streaming. Here's all we know about that:
- Microsoft uses two distinct terms in their marketing and documentation: SF and SFS.
- SF is a big part of SFS, but not all of it (it is in the name after all).
- SFS appears to be a banner over a hardware/software feature (the SF part at least is hardware)
- The performance claim of 2X-3X is only seen in the SFS XBox marketing, not in the DX12 marketing.
- There are some folks on twitter vaguely alluding to some additional hardware features in SFS vs SF, not enough to really draw strong conclusions (it's twitter)

According to Stanard, everything you are saying is true, and we are waiting on Hot Chips in August to learn the other technical features, such as the special XSX texture filters that aid in the "streaming" part. I'd also like to know more about the ECC elements of the GDDR6 and the NAND selection.
 
Last edited:
2) The other issue that some people get hung up on is the difference between Sampler Feedback and Sampler Feedback Streaming. Here's all we know about that:
- Microsoft uses two distinct terms in their marketing and documentation: SF and SFS.
- SF is a big part of SFS, but not all of it (it is in the name after all).
- SFS appears to be a banner over a hardware/software feature (the SF part at least is hardware)
- The performance claim of 2X-3X is only seen in the SFS XBox marketing, not in the DX12 marketing.
- There are some folks on twitter vaguely alluding to some additional hardware features in SFS vs SF, not enough to really draw strong conclusions (it's twitter)

I just read through the Stanard twitter thread (which I had not fully looked at before) and should amend what I've said here.

It seems like the claim with SFS from Stanard is that they've added a hardware filter feature for SFS that enables SF-aided streaming with little to no visible pop-in. It's still not clear if this is the ONLY addition, but Stanard seems to think this is the most important thing to mention.

So basically, the idea of using SF to stream texture tiles can be about as aggressive as you can get: don't grab the tile from the SSD until you need it. Depending on how aggressive your implementation is, you will have some frames where you don't have the MIP level you want in memory. SFS makes this aggressiveness much more desirable by smoothing over the small number of frames where you might notice some pop-in. Maybe this doesn't add performance, but it could "unlock" it, in the sense that a developer might only be inclined to fully take advantage of SF-based texture streaming if the visual artifacts from using it are effectively mitigated.
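A sketch of the fallback behavior described above, assuming residency is tracked per MIP level (purely illustrative; the real filtering happens per-texel in hardware):

```python
# Illustrative fallback: sample the wanted MIP if resident, otherwise the
# nearest coarser (higher-numbered) resident MIP while the tile streams in.
# This is the pop-in that the reported XSX filter smooths over.

def sample_mip(wanted, resident_mips):
    """Return the MIP level to sample this frame."""
    usable = [m for m in sorted(resident_mips) if m >= wanted]
    if not usable:
        raise ValueError("no usable MIP resident")
    return usable[0]

# MIP 0 hasn't arrived from the SSD yet, so MIP 2 covers the gap:
print(sample_mip(0, {2, 3}))  # 2
```

For the few frames before the fine tile lands, the sampler simply returns the best thing already in memory, which is why aggressive streaming remains visually tolerable.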
 

Three

Member
I think you're mixing up a few things here.

1) Not sure if you're saying this or not, but just in case, PRT and SF are not the same thing. You can use PRT to stream portions of textures in and out, and SF lets you make more accurate and performant decisions about when to do that.
2) SF is a new GPU hardware feature that was only exposed via Direct X late last year. It is currently only on nvidia Turing (RTX) and will be in RDNA2 on PC when it comes out. It doesn't need to be bespoke to be relevant if it's not on PS5 (which we have no confirmation of).
3) I'm using SF and SFS somewhat interchangeably, mostly because I don't believe it's known whether the PS5 has Sampler Feedback. I know some people infer this from the mention of RDNA2, but I just don't think we know.
4) I don't think it is particularly important how much secret sauce comes with that extra S in SFS. What's relevant is the performance claim and the context of it.
5) Like with VRS, I think Microsoft mentions these things in their marketing to compete with PS5. Specifically pointing out a performance improvement that is also available on the PS5 seems a little odd. At this point we don't know though.
6) If a new GPU hardware feature allows for 2-3x memory and I/O efficiency on the PS5, why didn't Mark Cerny mention it in his developer presentation? It seems like that would be really cool and relevant to this whole topic of using the SSD to stream in graphics data. Again, we don't know, but something to ponder.


Stepping away from this high-stakes, high-emotion family feud for a second (I kid), even if Sampler Feedback is on both consoles, consider what that means for games in the future. These specs, while already impressive, become really incredible. Effectively 32-48GB memory! Effectively 1TB-1.5TB/s memory bandwidth! Effectively 15-30GB/s SSD throughput! Couple that with the efficiency improvements in RDNA2 vs GCN (DF showed something like a 1.5X to the TF spec vs framerate with RDNA1) plus things like VRS and these consoles start to seem pretty powerful! I'm excited to see how games will look as time goes on.
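Back-of-envelope arithmetic behind those "effective" figures (the 2x-3x multiplier is the marketing claim; the base numbers are the public specs):

```python
# The "effective" specs are just the base figures times the claimed
# SFS efficiency multiplier. All multipliers here are assumptions.

memory_gb = 16   # both consoles' GDDR6 pool
ssd_gbps = 4.8   # XSX compressed SSD throughput (PS5's figure lands higher)

for mult in (2, 3):
    print(f"{mult}x: ~{memory_gb * mult} GB effective memory, "
          f"~{ssd_gbps * mult:.1f} GB/s effective SSD throughput")
```

The bandwidth figure scales the same way, which is where the 1 TB/s+ effective-bandwidth claims come from.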
SFS is sometimes known as PRT+
From the DX SPEC:

Terminology
Use of sampler feedback with streaming is sometimes abbreviated as SFS. It is also sometimes called sparse feedback textures, or SFT, or PRT+, which stands for “partially resident textures”.


The point isn't to be pedantic about the term; the point is that the 2x savings come from not loading the complete textures. That is the main aim of this technology and why SFS and PRT+ are used interchangeably. Determining which parts to load has been done before in different ways.

'Sampler feedback' just standardises in the DX API something that was done before when determining which textures to load in games. The RTX cards got a driver update for DX12 sampler feedback; they didn't get some special custom hardware they didn't have before, and again this proves it is not XSX secret sauce, as the spec is known.

So, going back to my original point: this was available on other GPUs.

With all that in mind, let's even assume that the PS5 doesn't have this known tech that has existed for a while and allows a massive 2x-3x performance boost in both memory use and bandwidth, and that somehow they just didn't implement it, while all the regulars were spreading FUD that the PS5 is RDNA1 or doesn't do raytracing 'because we don't know'.

Regardless of all that, is this shitpost accusing me of lying correct or not, when you yourself are saying cards from 2018 support SF?

You know this is a lie? I didn't read the rest of what you were saying when I saw this:
What you're not getting is that the 2x efficiency (not performance in the spec) exists already on other GPUs in SF and PRT.

If you still think that is correct and that the boost is not even from SF, then where does the 2x performance come from? I'm hoping for a better answer than 'but we don't know'.
 
Last edited:

NullZ3r0

Banned
Not all games or applications will require full power from the APU. Take backwards compatibility or Netflix as examples. The CPU will hardly be taxed by a PS4 game, and watching a video will hardly require anything from the system. In these cases it will be more efficient to lower clock speeds. This is especially useful when a game engine isn't yet fully optimized for PS5 and running full tilt would actually break a game.
This is true for all modern computers, but many manufacturers don't bother calling it variable; they just don't advertise clock speeds. Why does Sony?
 

BrentonB

Member
4.8 GB/s is with BCPack compression and no other bandwidth-saving measures. SFS and any other methods for saving bandwidth are in addition to the 4.8 GB/s.

No, the 2x-3x multiplier they refer to is the efficiency gained over the One X after they implemented SFS. That and their custom SSD solution mean that the total data transfer rate for the Series X will be 4.8 GB/s when the data is compressed. This will be a huge upgrade from current-generation consoles.
 

BrentonB

Member
This is true for all modern computers, but many manufacturers don't bother calling it variable; they just don't advertise clock speeds. Why does Sony?

Because they felt like it? People interested in technology want to know? To fuel flame wars on forums? There could be many reasons.
 

oldergamer

Member
No, the 2x-3x multiplier they refer to is the efficiency gained over the One X after they implemented SFS. That and their custom SSD solution mean that the total data transfer rate for the Series X will be 4.8 GB/s when the data is compressed. This will be a huge upgrade from current-generation consoles.

That's incorrect. I'm not sure why this is hard to understand, but I think you are confused. BCPack compression is not the same feature as sampler feedback.

2.4 GB/s is uncompressed data.
4.8 GB/s is with BCPack compression of texture data.
Further bandwidth savings, which can give a 2x to 3x "effective" increase, come from using SFS and not loading the parts of textures or geometry that aren't needed into memory. This isn't factored into the 4.8 GB/s because there's no way to know exactly what each game will get from it. More visually complicated titles will likely hit the 2x to 3x range; simpler titles likely won't see a big benefit. Say you can fit one large texture into memory: if you only need 25% of that texture, you still have room to fit 75% worth of other textures.
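To put rough numbers on that example (the sizes are made up, and the 25% residency figure is the illustration's assumption):

```python
# Made-up sizes: a 1 GB budget that fits exactly one full texture.
budget_mb = 1024
full_texture_mb = 1024

needed_fraction = 0.25                            # assumed residency rate
used_mb = full_texture_mb * needed_fraction       # 256 MB actually loaded
free_mb = budget_mb - used_mb                     # 768 MB left over

# At the same residency rate, the leftover room holds the needed
# portions of three more textures of the same size.
extra_textures = free_mb / (full_texture_mb * needed_fraction)
print(used_mb, free_mb, extra_textures)           # 256.0 768.0 3.0
```

The same budget ends up serving four textures instead of one, which is where the "effective" memory multiplier comes from.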

Wait, are you saying the same thing I am, now that I've laid it out?
 
Last edited:
'Sampler feedback' just standardises in the DX API something that was done before when determining which textures to load in games. The RTX cards got a driver update for DX12 sampler feedback; they didn't get some special custom hardware they didn't have before, and again this proves it is not XSX secret sauce, as the spec is known.

I never claimed Sampler Feedback is only in the Xbox though I did suggest it may not be in the PS5. All Sony has to do is say they have SF (and VRS) and I will believe them.

...and you're wrong about Sampler Feedback being just a codifying of something that already existed, unless you only mean it was already in RTX (your arguments are kind of fuzzy around that). I'm not sure how that actually came about, but clearly Nvidia and MS were working together on the development of upcoming new features in DX12U, since they were already present in hardware in 2018. There's mention of variation in hardware implementation (probably between AMD and Nvidia), but Nvidia says that RTX cards are the only ones of theirs that support it.

Everyone from AnandTech to Microsoft to Nvidia refers to Sampler Feedback as a new hardware feature. It was exposed in RTX for the first time by DX12U (even though the hardware support was obviously present).

The way you know which ones to load have been done before in different ways.

It seems to me like you're just doing a lot of hand-waving, saying "yeah, yeah, people have done this before, there are a bunch of different ways..." Maybe you're right that Sampler Feedback is not a big deal in practice, but that is clearly not what Microsoft is saying.

Here's what Microsoft says about "the way you know which ones to load".


The difference in committed memory is very high— 524,288 versus 51,584 kilobytes! About a tenth the space for this tiled resource-based, full-mip-chain-based texturing system. Although this demo comparison is a bit silly, it confirms something you probably suspected: good judgments about what to load next can mean dramatic memory savings. And even if you’re using a partial-mip-chain-based system, accurate sampler feedback can still allow you to make better judgments about what to load and when.

If that's not clear, she is saying that in this extreme example of two different texture streaming systems, the one that uses Sampler Feedback uses ten times less memory! This paragraph alone refutes what you're saying about whether this feature is JUST PRT in a new dressing.
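Checking the arithmetic on the quoted figures:

```python
# Figures quoted from the Microsoft demo comparison:
naive_kb = 524_288     # committed memory with full mip chains
feedback_kb = 51_584   # committed memory with feedback-driven streaming

ratio = naive_kb / feedback_kb
print(round(ratio, 1))  # ~10.2, i.e. "about a tenth the space"
```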

Here's some more from that same article on why SF is different.

Suppose you are shading a complicated 3D scene. The camera moves swiftly throughout the scene, causing some objects to be moved into different levels of detail. Since you need to aggressively optimize for memory, you bind resources to cope with the demand for different LODs. Perhaps you use a texture streaming system; perhaps it uses tiled resources to keep those gigantic 4K mip 0s non-resident if you don’t need them. Anyway, you have a shader which samples a mipped texture using A Very Complicated sampling pattern. Pick your favorite one, say anisotropic.

The sampling in this shader has you asking some questions.

What mip level did it ultimately sample? Seems like a very basic question. In a world before Sampler Feedback there’s no easy way to know. You could cobble together a heuristic. You can get to thinking about the sampling pattern, and make some educated guesses. But 1) You don’t have time for that, and 2) there’s no way it’d be 100% reliable.

Where exactly in the resource did it sample? More specifically, what you really need to know is— which tiles? Could be in the top left corner, or right in the middle of the texture. Your streaming system would really benefit from this so that you’d know which mips to load up next. Yeah while you could always use HLSL CheckAccessFullyMapped to determine yes/no did-a-sample-try-to-get-at-something-nonresident, it’s definitely not the right tool for the job.

Direct3D Sampler Feedback answers these powerful questions.
 

rnlval

Member
That's incorrect. I'm not sure why this is hard to understand, but I think you are confused. BCPack compression is not the same feature as sampler feedback.

2.4 GB/s is uncompressed data.
4.8 GB/s is with BCPack compression of texture data.
Further bandwidth savings, which can give a 2x to 3x "effective" increase, come from using SFS and not loading the parts of textures or geometry that aren't needed into memory. This isn't factored into the 4.8 GB/s because there's no way to know exactly what each game will get from it. More visually complicated titles will likely hit the 2x to 3x range; simpler titles likely won't see a big benefit. Say you can fit one large texture into memory: if you only need 25% of that texture, you still have room to fit 75% worth of other textures.

Wait, are you saying the same thing I am, now that I've laid it out?




2.4 GB/s is the raw NVMe data rate
Over 6 GB/s is the decompressed output (ZLIB or BCPack)
 
Last edited:

Eliciel

Member
The amount of theory-crafting is reaching unprecedented levels these days. It would actually be a bigger success than any console generation if either of these consoles actually delivers on expectations. I sense that it is growing next to impossible to actually succeed in delivering on expectations. It could become the biggest letdown generation ever experienced by the human mind/eye/ear!
 

BrentonB

Member
That's incorrect. I'm not sure why this is hard to understand, but I think you are confused. BCPack compression is not the same feature as sampler feedback.

2.4 GB/s is uncompressed data.
4.8 GB/s is with BCPack compression of texture data.
Further bandwidth savings, which can give a 2x to 3x "effective" increase, come from using SFS and not loading the parts of textures or geometry that aren't needed into memory. This isn't factored into the 4.8 GB/s because there's no way to know exactly what each game will get from it. More visually complicated titles will likely hit the 2x to 3x range; simpler titles likely won't see a big benefit. Say you can fit one large texture into memory: if you only need 25% of that texture, you still have room to fit 75% worth of other textures.

Wait, are you saying the same thing I am, now that I've laid it out?

Close, but not quite. I'm saying that the efficiency gains from Sampler Feedback are in relation to the One X hard drive: an effective 2-3x increase in efficiency compared to that console. I don't believe it will dramatically increase the loading speed of the Series X. The only numbers that Microsoft have made public are 2.4 GB/s and 4.8 GB/s, and those are the numbers I will refer to. I will be interested to see real-world tests.
 
Close, but not quite. I'm saying that the efficiency gains from Sampler Feedback are in relation to the One X hard drive. It's an effective 2 - 3x increase in efficiency compared to that console.

Actually, I think it's compared to a 3.5" floppy drive. I suspect we'll see 60-90 KB/s from Xbox this generation, assuming developers want to invest in custom Xbox features.
 

Three

Member
I never claimed Sampler Feedback is only in the Xbox though I did suggest it may not be in the PS5. All Sony has to do is say they have SF (and VRS) and I will believe them.

...and you're wrong about Sampler Feedback being just a codifying of something that already existed, unless you only mean it was already in RTX (your arguments are kind of fuzzy around that). I'm not sure how that actually came about, but clearly Nvidia and MS were working together on the development of upcoming new features in DX12U, since they were already present in hardware in 2018. There's mention of variation in hardware implementation (probably between AMD and Nvidia), but Nvidia says that RTX cards are the only ones of theirs that support it.

Everyone from AnandTech to Microsoft to Nvidia refers to Sampler Feedback as a new hardware feature. It was exposed in RTX for the first time by DX12U (even though the hardware support was obviously present).



It seems to me like you're just doing a lot of hand-waving, saying "yeah, yeah, people have done this before, there are a bunch of different ways..." Maybe you're right that Sampler Feedback is not a big deal in practice, but that is clearly not what Microsoft is saying.

Here's what Microsoft says about "the way you know which ones to load".




If that's not clear, she is saying that in this extreme example of two different texture streaming systems, the one that uses Sampler Feedback uses ten times less memory! This paragraph alone refutes what you're saying about whether this feature is JUST PRT in a new dressing.

Here's some more from that same article on why SF is different.
Sony didn't come out and say they have RDNA2 either, but there you go, they do. They didn't need to.

There is no hand-waving. SF is now just a standardized part of the DX12 API; you could write a similar but not identical thing in HLSL. There are examples of alternatives even in the spec. SF is just a standardized API feature; you could implement it in the driver for most cards that support PRT.

From your quote
Although this demo comparison is a bit silly, it confirms something you probably suspected:

Why do you think it's saying it's a bit silly? Because nobody just loads everything like that, and people have written good systems that do something similar already, especially on consoles.
And even if you’re using a partial-mip-chain-based system, accurate sampler feedback can still allow you to make better judgments about what to load and when.

You even have some examples of the alternatives in your last quote, so why are you saying it's hand-waving? It makes things easier and less time-consuming because you have something in the API now, but it doesn't give you a 2x performance boost compared to similar things known by different names: PRT+, Sparse Texture Feedback.
 
Last edited:

BrentonB

Member
"Our second component is a high-speed hardware decompression block that can deliver over 6GB/s," reveals Andrew Goossen.

It can deliver up to 6 GB/s, but will it? I don't think so. There is only 4.8 GB/s of compressed data coming into the decompression block. I believe this design choice was made to ensure that there is always room for incoming data and that it will never be 'backed up'. Basically, the system can decompress more data than it actually needs to, so that nothing has to 'wait its turn.'
 
You are. You're telling me things that have nothing to do with the conversation. It doesn't matter if it's not full RDNA2; the fact is, Sony and MS both call it RDNA2, and you can't say one is full RDNA2 and the other is not.



I watched it, and again, it's IRRELEVANT.

Let me put it like this since you're clearly missing the point.

They're saying it will be 4.9 and sometimes hit 6.9, because PS5's variable frequency won't be able to maintain that number when needed.

It's the fact that they're saying 10.2 TF is not the real number and that it will only hit that number "sometimes".

I don't have to explain this any further.




Right, all MS numbers are "sustained" but PS5's SSD numbers are not. You're really playing word games here, and it reminds me of when we had this discussion a few months ago, when you were saying Cerny was assuming his numbers.

I don't have time for people who just want to keep moving the goalposts.

Fine. You do you. But you're kind of agitated based on things I'm not actually doing. If you feel the proof is so clear-cut, pull up my quotes where you think I've engaged in these things you are alleging.

I never said XSX was full RDNA2, in fact I've said it's also custom RDNA2 several times in the past, as well. So if I've said PS5 is not "full" RDNA2 then what I've probably actually said is that it's custom RDNA2, similar to XSX. The question is how the systems have customized their feature sets, and that is a hard question to answer because even AMD are cheeky when it comes to what RDNA2's full feature set is.

From the sounds of it, maybe you didn't watch the NXGamer video as intended, because I don't think that's what they were saying. They were basically touching on architectural improvements in the RDNA2 GPUs which would help more fully utilize the GPU hardware to get ever closer to the theoretical performance numbers. Like I said, there won't be any real-world use cases where the PS5 actually reaches 10.275, and there won't be any real-world use cases where the XSX actually reaches 12.147. But thanks to the system designs and architectural gains, both systems ought to reach much closer to their theoretical peaks under absolute/highest utilization than their predecessors.

I said MS's numbers are sustained because MS literally used the word "sustained" when stating them. And the reason I take them at their word on that is because the XSX is also being designed for use in server markets, primarily Azure server racks, IIRC. Sustained performance is a necessity there, hence why they probably mentioned it (though, to be fair, they also probably mentioned it as a cheeky jab at Sony's variable frequency). There's no agenda in stating they've claimed sustained and Sony haven't, because Sony haven't literally come out and said their SSD numbers are sustained.

Now, I can afford them the benefit of the doubt and say they likely are, but I'm also considering the power draw an SSD of that level has to put on the system in terms of potential strain, and the fact that it will pretty much factor into the variable frequency. Very logical things to think about, considering Cerny stressed the importance of their variable frequency strategy and what needed to be done to achieve it.

So, I feel I've hopefully explained myself satisfactorily. If for some reason you're still not satisfied with my rationale, or somehow think I'm out to belittle/downplay PS5 or Sony or Mark Cerny out of spite, I can't help you, because you're literally wrong; I've been critical (in a constructively optimistic way) of design and other aspects of both systems, and will continue to be.

So can I speak at 22GB/s on the PS5?

MS's 6 GB/s figure is exactly analogous, in terms of potential real-world use cases, to Sony's 22 GB/s figure. So if you think one is being fantastical in that regard, it applies to both.

Most lossless compression ratios top out around 2:1, but a few can go a bit above that. Based on that, I'd assume that some of the data types MS is referring to with its 6 GB/s figure, and a decent number of the data types Sony is referring to with its 22 GB/s figure, are either quite lossily compressed files or files that can handle large compression ratios without noticeable quality degradation. Generally, those tend to be things like video file formats.
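The implied peak compression ratios can be derived directly from the public figures:

```python
# Raw vs peak decompressed throughput, per the public figures:
xsx_ratio = 6.0 / 2.4     # XSX: 2.5:1 peak
ps5_ratio = 22.0 / 5.5    # PS5: 4.0:1 peak
print(xsx_ratio, ps5_ratio)
```

Both sit above the ~2:1 lossless ceiling mentioned above, which is why those peaks likely depend on unusually compressible data.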

6 GB/s isn't a theoretical max. It is the stated MINIMUM decompression throughput rate of the hardware decompression block.

Much like the stated locked compute of the XSX CU array is 12.15 TF.

Upon some reflection and reading through this thread and others, I actually think I understand why they chose a decompression rate of 6 GB/s... but it's simply random speculation, nothing more.

Wait, so it's referring to the minimum decompression throughput rate? Has this been mentioned by any of the MS Xbox engineering team members?

I was under the impression it was a theoretical max, but if I'm wrong, I'm wrong. Or is this a personal analysis? Is there a way you can expand on this?
 
Last edited:

Deto

Banned
Close, but not quite. I'm saying that the efficiency gains from Sampler Feedback are in relation to the One X hard drive: an effective 2-3x increase in efficiency compared to that console. I don't believe it will dramatically increase the loading speed of the Series X. The only numbers that Microsoft have made public are 2.4 GB/s and 4.8 GB/s, and those are the numbers I will refer to. I will be interested to see real-world tests.

MS has the power of magic software.
Does it display 4.8 GB/s? It doesn't matter; a Windows update will come out on the Xbox SX with BCPack 12 that will release the hidden power of the third I/O controller and reach 6 GB/s.

Then there will be BCPack 13, which will release the 23 GB/s ultra-hidden power, because the MS number always has to be as inflated as possible for PR.



Did Microsoft state 4.8 GB/s?

A lie: they are hiding their hand to surprise Sony (same narrative as the hidden GPU), and it's actually 6 GB/s, sustainable 100% of the time, while the PS5 heats up, drops to 1 MB/s, catches fire, and sets your house on fire.
 
Last edited:

rnlval

Member
It can deliver up to 6 GB/s, but will it? I don't think so. There is only 4.8 GB/s of compressed data coming into the decompression block. I believe this design choice was made to ensure that there is always room for incoming data and that it will never be 'backed up'. Basically, the system can decompress more data than it actually needs to, so that nothing has to 'wait its turn.'
It's the NVMe's 2.4 GB/s data stream hitting the decompression block, which is then decompressed into 4.8 GB/s to "more than 6 GB/s".

Are you claiming the decompression block is built into the NVMe device?

That is exactly your argument.
Just apply the same argument to the PS5 and it magically reaches 22 GB/s.
PS5 wasn't mentioned in my post. LOL
 
Last edited:

jimbojim

Banned
On the Topic of SSD speeds in both consoles:

XSX: SSD read speed is 2.4 GB/s
PS5: SSD read speed is 5.5 GB/s

All the other numbers floating around already factor in compression.

Example:
If both consoles read 100 GB of data that has been compressed to 50 GB (50% compression):
XSX - 20.83 seconds to read 50 GB of compressed data and output 100 GB of data
PS5 - 9.09 seconds to read 50 GB of compressed data and output 100 GB of data

Another example with some of these numbers that have been reported:
If both consoles need to read 100 GB of textures that have been compressed by Kraken / BCPack:
XSX - BCPack compression 60% - 40 GB compressed data - 40 GB / 2.4 GB/s = 16.7 seconds
PS5 - Kraken compression 30% - 70 GB compressed data - 70 GB / 5.5 GB/s = 12.7 seconds

So in the time the XSX reads those 100 GB of data compressed by 60%, the PS5 could actually read 91.85 GB of Kraken-compressed data, which would be about 131.2 GB of textures.

KEEP IN MIND, HOWEVER, THAT THOSE COMPRESSION RATES ARE JUST ASSUMPTIONS.
We have no idea how good either of those compression methods is at compressing textures.
Also, not all data is texture data.
Compression rates may therefore differ between all kinds of data structures.
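The arithmetic in the example can be checked with a few lines of Python (the compression ratios and drive speeds are the same assumptions used above, nothing more):

```python
def transfer_time(uncompressed_gb, compressed_fraction, read_speed_gbps):
    """Seconds needed to read data that compresses to `compressed_fraction`
    of its original size, at a given raw read speed in GB/s."""
    compressed_gb = uncompressed_gb * compressed_fraction
    return compressed_gb / read_speed_gbps

# 50% compression example
print(round(transfer_time(100, 0.5, 2.4), 2))  # XSX: 20.83 s
print(round(transfer_time(100, 0.5, 5.5), 2))  # PS5: 9.09 s

# assumed BCPack (60% reduction) vs Kraken (30% reduction)
print(round(transfer_time(100, 0.4, 2.4), 1))  # XSX: 16.7 s
print(round(transfer_time(100, 0.7, 5.5), 1))  # PS5: 12.7 s
```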

Yeah, maybe it is something like that. Just to add: keep in mind that BCPack is a lossy method, while Kraken is lossless.
 
Sony didn't come out and say they have RDNA 2 either but there you go they do. They didn't need to.

I believe they did say it was RDNA2 "based".

There is no handwaving. SF is just a standardized feature in the DX12 API now; you could write a similar, but not identical, thing in HLSL. There are examples of alternatives even in the spec. You could implement it in the driver for most cards that support PRT.

...

Why do you think I said it's a bit silly? Because nobody just loads everything like that, and people have written good systems that do something similar already, especially on consoles.

...

You even have some examples of the alternatives in your quote so why are you saying it's handwaving?

It seems like you're just saying:

A) There are other ways to do this, therefore
B) This is the same as the other ways.

I don't think that is sound reasoning. The whole point of the paragraph I quoted is that the way you make your decisions has an impact on how much memory you save with your streaming, and Sampler Feedback can provide substantial improvements. Answer these questions if you have a moment:

1) Is it impossible, in your view, that Sampler Feedback provides more efficiently acquired and more accurate information under the hood than these other methods do?

2) If this is so trivial, why is Sampler Feedback exclusive to DX12 Ultimate? Why even bother putting it next to VRS and updates to DXR? Why all these articles/videos explaining the value of having feedback from texture sampling?

3) What do you make of the claim from Microsoft that games running on Xbox One X were only making use of 1/2 or 1/3 of what was loaded in memory? Was this measurement selectively taken (for marketing purposes) from games not using PRT techniques? Or do you just doubt that Sampler Feedback is sufficient to close that gap in any noticeable way, as they claim?
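To make the memory question concrete, here is a deliberately crude toy model (my own illustration; not the DX12 API and not Microsoft's numbers): a streamer that conservatively keeps the top three mip levels of every texture it guesses might be needed, versus an idealized feedback-driven streamer that keeps only the mip level the sampler actually requested. The mip sizes and sampled levels are made up.

```python
# Each mip level is a quarter the size of the one above it (64 MB mip 0).
MIP_SIZE_MB = {0: 64.0, 1: 16.0, 2: 4.0, 3: 1.0, 4: 0.25}

# Hypothetical scene: the mip level the sampler actually used, per texture.
sampled_mips = [1, 2, 2, 3, 1, 4, 2, 3, 3, 2]

# Conservative streamer: keeps mips 0-2 of every texture, just in case.
conservative_mb = len(sampled_mips) * (MIP_SIZE_MB[0] + MIP_SIZE_MB[1] + MIP_SIZE_MB[2])

# Idealized feedback-driven streamer: keeps only what was actually sampled.
feedback_mb = sum(MIP_SIZE_MB[m] for m in sampled_mips)

print(f"conservative: {conservative_mb:.0f} MB resident")
print(f"feedback:     {feedback_mb:.2f} MB resident")
print(f"savings:      {conservative_mb / feedback_mb:.1f}x")
```

A real streamer never reaches this idealized bound (it still has to prefetch and keep fallback mips resident), which is one reason Microsoft's quoted figure is a more modest 2-3x rather than the huge ratio a toy like this produces.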
 

BrentonB

Member
It's the NVMe's 2.4 GB/s data stream hitting the decompression block, which is then decompressed into 4.8 GB/s to "more than 6 GB/s".

Are you claiming the decompression block is built into the NVMe device?


PS5 wasn't mentioned in my post. LOL

No, I am not claiming that. Official specs state that data moves at 2.4 GB/s raw and 4.8 GB/s compressed. Those are the numbers I'm using.
 

truth411

Member
PS fanboys made a big deal over a .5 TFLOPS difference last gen, but for whatever reason a 2 TFLOP advantage is "insignificant". Yet that .5 TFLOP difference manifested itself in framerate and resolution, so I suspect a bigger gap this gen will produce at least a similar difference. I think it will be greater, because I'm not buying Sony's clock speed claims. If the damn thing spends "most of its time" at 2.23 GHz, then just say that's the clock speed and call it a day. PC manufacturers do that all the time. But I digress, back to storage...

There's nothing disingenuous about my post. The real-world difference between the Xbox Series X and PS5 storage solutions will be measured in milliseconds, not seconds. The end user won't notice the difference in game. There's nothing in that UE5 demo that couldn't be done with the Xbox Velocity Architecture. You're believing in a fantasy world where only Sony listened to developers and put significant investment into their storage solution, while MS went to Best Buy, slapped a random SSD in the box, and called it a day.

Both companies focused significantly on storage and asset streaming. However, Microsoft didn't skimp on GPU to get there.

No desire for a debate, but this post makes you look like you don't know what you're talking about because you're too caught up in fanboy wars.
Quick history lesson (minus the fanboy nonsense):

A. There was a whole DRM fiasco that Microsoft was dealing with.

B. When announced, I believe the Xbox One was rated at 1.23 TFLOPS (then upclocked to 853 MHz for 1.31 TFLOPS).

C. Was $100 more expensive.

So if you do the simple math, at the time the specs were announced the PS4 had roughly a 50% faster GPU with 8 ACEs (later reduced to about 40% after MS increased clock speeds) and a better memory/bandwidth solution (8 GB GDDR5), for $100 less.

It isn't just the .5 TFLOPS; it doesn't work that way. For your analogy to hold, the XSX would have had to be announced at around 15.4 TFLOPS with a far better memory/bandwidth solution to take advantage of it.
None of this is the case. Next gen is faaaaaar more competitive. IMHO next gen isn't about console wars, it's about content wars; exclusives and ecosystem are what matter.
 
I have the feeling I still need to clarify something
On the Topic of SSD speeds in both consoles:

XSX: SSD read speed is 2.4 GB/s
PS5: SSD read speed is 5.5 GB/s

All the other numbers floating around already factor in compression.

Example:
If both consoles read 100 GB of data that has been compressed to 50 GB (50% compression):
XSX - 20.83 seconds to read 50 GB of compressed data and output 100 GB of data
PS5 - 9.09 seconds to read 50 GB of compressed data and output 100 GB of data

Another example with some of these numbers that have been reported:
If both consoles need to read 100 GB of textures that have been compressed by Kraken / BCPack:
XSX - BCPack compression 60% - 40 GB compressed data - 40 GB / 2.4 GB/s = 16.7 seconds
PS5 - Kraken compression 30% - 70 GB compressed data - 70 GB / 5.5 GB/s = 12.7 seconds

So in the time the XSX reads those 100 GB of data compressed by 60%, the PS5 could actually read 91.85 GB of Kraken-compressed data, which would be about 131.2 GB of textures.

KEEP IN MIND, HOWEVER, THAT THOSE COMPRESSION RATES ARE JUST ASSUMPTIONS.
We have no idea how good either of those compression methods is at compressing textures.
Also, not all data is texture data.
Compression rates may therefore differ between all kinds of data structures.

Back to my example:
If both consoles need to read 100 GB of textures that have been compressed by Kraken / BCPack:
XSX - BCPack compression 60% - 40 GB compressed data - 40 GB / 2.4 GB/s = 16.7 seconds | 100 GB of uncompressed data divided by 16.7 seconds is an effective 5.99 GB/s
PS5 - Kraken compression 30% - 70 GB compressed data - 70 GB / 5.5 GB/s = 12.7 seconds | 100 GB of uncompressed data divided by 12.7 seconds is an effective 7.87 GB/s
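The "effective throughput" step is just the raw read speed divided by the compressed fraction; a quick sketch, using the same assumed compression ratios as above:

```python
def effective_throughput(read_speed_gbps, compressed_fraction):
    """Uncompressed GB delivered per second of raw reading.
    Equivalent to uncompressed size / transfer time."""
    return read_speed_gbps / compressed_fraction

print(round(effective_throughput(2.4, 0.4), 2))  # XSX, assumed 60% reduction: 6.0
print(round(effective_throughput(5.5, 0.7), 2))  # PS5, assumed 30% reduction: 7.86
```

(The post's 5.99 and 7.87 differ only because the times were rounded to 16.7 s and 12.7 s before dividing.)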
 
Last edited:

rntongo

Banned
I never claimed Sampler Feedback is only in the Xbox, though I did suggest it may not be in the PS5. All Sony has to do is say they have SF (and VRS) and I will believe them.

...and you're wrong about Sampler Feedback being just a codification of something that already existed. Unless you only mean it was already in RTX (your arguments are kind of fuzzy around that). I'm not sure how that actually came about, but clearly nvidia and MS were working together on upcoming DX12U features, since they were already present in hardware in 2018. There's mention of variation in hardware implementation (probably between AMD and nvidia), but nvidia says that RTX cards are the only ones of theirs that support it.

Everyone from Anandtech to Microsoft to nvidia refers to Sampler Feedback as a new hardware feature. It was exposed in RTX for the first time by DX12U (even though the hardware support was obviously present).



It seems to me like you're just doing a lot of hand-waving, saying "yeah yeah people have done this before, there a bunch of different ways..." Maybe you're right in that Sampler Feedback is not a big deal in practice, but that is clearly not what Microsoft is saying.

Here's what Microsoft says about "the way you know which ones to load".




If that's not clear: she is saying that in this extreme example of two different texture streaming systems, the one that uses Sampler Feedback uses ten times less memory! This paragraph alone refutes what you're saying about whether this feature is JUST PRT in a new dressing.

Here's some more from that same article on why SF is different.

Thank you for this. They've been trying to blur the difference between PRT and Sampler Feedback. I would advise not arguing with them.
 

Ascend

Member
Thank you for this. They've been trying to blur the difference between PRT and Sampler Feedback. I would advise not arguing with them.
Indeed... And I pointed this out in another thread... Quoting...;

I think we're confusing two things here... We have:

1) Using tiles/partial textures to load a smaller amount of texture data from the SSD into RAM.
2) Increasing loading efficiency by only loading what is required from storage into RAM.

With number 1, even when you are using partial textures, you can still be loading textures that you don't use.
With number 2, you avoid loading (partial) textures that you will not use.

Based on the explanation by Eurogamer, I think MS is doing the latter, not the former. Reposting;

As textures have ballooned in size to match 4K displays, efficiency in memory utilisation has got progressively worse - something Microsoft was able to confirm by building in special monitoring hardware into Xbox One X's Scorpio Engine SoC. "From this, we found a game typically accessed at best only one-half to one-third of their allocated pages over long windows of time," says Goossen. "So if a game never had to load pages that are ultimately never actually used, that means a 2-3x multiplier on the effective amount of physical memory, and a 2-3x multiplier on our effective IO performance."

They are talking about allocated pages. Allocated means present in RAM. There are really two possibilities here... Either

A) The allocated pages mentioned for the Xbox One X are already based on the use of partially resident textures, and thus the 2x-3x efficiency increase they mention is on top of those.
or
B) The allocated pages mentioned for the Xbox One X are not based on the use of partially resident textures, and thus the benefit of PRT applies now.

In either case, I still don't see how the 2x -3x multiplier can be discarded.

 

ToadMan

Member
I would add one question to that list.
3. Can the Sony NVMe setup actually live up to the stated performance level?

I still have my doubts that Sony's solution, even though it states 5.5 GB per second, is going to deliver that; it may come in far lower, specifically if they are using 12 channels/lanes, one for each memory chip on the NVMe. That will impact performance more than I think people realize. In turn, I expect the Xbox to punch above its weight a little, and Sony's brute-force approach to be more conservative. In the end, I suspect the difference between paper specs and usable performance on both drives will not be 50%. I'm expecting a smaller delta.

Ok, so you're interchanging channels and lanes. They're not interchangeable, and they have specific definitions in PCIe and SSD terminology.

I point this out because it suggests to me that you may be confused about the technology.

Which brings me to this: for SSDs, more channels = increased performance. The fact that Sony is using a controller with 12 channels is part (it may be all) of the reason it is going to be fast - 12 channels means 12 simultaneous data reads.

Low-end SSDs have perhaps 2 or 4 channels. High-end SSDs have perhaps 8. 12 is a 50% bump over that, and that's in line with what Sony is getting in terms of transfer speeds.
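As a rough illustration of why channel count matters (the per-channel rate below is a made-up figure, chosen only so that 12 channels lands near Sony's quoted 5.5 GB/s; real controllers vary widely):

```python
# Hypothetical sustained rate per NAND channel, in GB/s (an assumption).
PER_CHANNEL_GBPS = 0.458

# Aggregate sequential throughput scales roughly with channel count,
# since each channel can service a read in parallel.
for channels in (4, 8, 12):
    print(f"{channels:2d} channels -> ~{channels * PER_CHANNEL_GBPS:.2f} GB/s")
```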
 

Deto

Banned
I have the feeling I still need to clarify something


Back to my example:
If both consoles need to read 100 GB of textures that have been compressed by Kraken / BCPack:
XSX - BCPack compression 60% - 40 GB compressed data - 40 GB / 2.4 GB/s = 16.7 seconds | 100 GB of uncompressed data divided by 16.7 seconds is an effective 5.99 GB/s
PS5 - Kraken compression 30% - 70 GB compressed data - 70 GB / 5.5 GB/s = 12.7 seconds | 100 GB of uncompressed data divided by 12.7 seconds is an effective 7.87 GB/s


Wow, we already have the magic and the Xbox SSD is already delivering the same thing as the PS5.
What a delusion.

What a coincidence that all these accounts with these magic numbers deduced from lollipop land are from this year.

If I were you, I would comment on YouTube and wait for the Windows Central people to spread the FUD on Twitter.
 
Last edited:
Wait, so it's referring to a minimum decompression throughput rate? Has this been mentioned by any of the MS Xbox engineering team members?

I was under the impression it was a theoretical max, but if I'm wrong, I'm wrong. Or is this a personal analysis? Can you expand on this?

"Our second component is a high-speed hardware decompression block that can deliver over 6GB/s," reveals Andrew Goossen. "This is a dedicated silicon block that offloads decompression work from the CPU and is matched to the SSD so that decompression is never a bottleneck.."

-Microsoft technical fellow and Xbox system architect Andrew Goossen.

This is taken from the Eurogamer interview https://www.eurogamer.net/articles/digitalfoundry-2020-inside-xbox-series-x-full-specs
 
Wow, we already have the magic and the Xbox SSD is already delivering the same thing as the PS5.
What a delusion.

What a coincidence that all these accounts with these magic numbers deduced from lollipop land are from this year.

If I were you, I would comment on YouTube and wait for the Windows Central people to spread the FUD on Twitter.

Care to correct me then?
I mean, did I make any mistakes in these calculations?
Are they wrong?
Sure, the compression rates are assumptions; I already said that.


Just so you know, based on my example:
7.87 = 100%
5.99 = 76%
Delta = 24%, aka the XSX is 24% slower than the PS5.

Please note that while the XSX gets nearly 6.0 GB/s of effective throughput from compressed data in this case, the PS5 gets only 7.9 GB/s, which is still below what Cerny said.
It's simple math, so you can probably tell me if I've made any mistakes.
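For what it's worth, the delta claimed above checks out, using the post's own effective-throughput figures:

```python
xsx_gbps = 5.99  # effective throughput figures from the example above
ps5_gbps = 7.87

delta_pct = 100 * (1 - xsx_gbps / ps5_gbps)
print(f"XSX is about {delta_pct:.0f}% slower")  # about 24%
```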
 
Last edited: