
(*) Ali Salehi, a rendering engineer at Crytek, contrasts the next-gen consoles in an interview (Update: Tweets/Article removed)

tkscz

Member
Got it.
Did you watch yesterday's Inside Xbox show?
The Xbox people said they made the console so it can deliver all of its power whenever it's needed. Do you think that's a response to the Crytek guy's leak?

Wouldn't be surprised, considering how far this interview got; it was all over my YouTube feed. It would be smart on MS's part to put out this flame before it spreads too far and gets future devs thinking it would be too difficult to use the hardware to full effect. This could easily intimidate indie devs, who don't normally push hardware, and losing indies is not something any company wants to do.
 

GymWolf

Member
Wouldn't be surprised, considering how far this interview got; it was all over my YouTube feed. It would be smart on MS's part to put out this flame before it spreads too far and gets future devs thinking it would be too difficult to use the hardware to full effect. This could easily intimidate indie devs, who don't normally push hardware, and losing indies is not something any company wants to do.
I think all the major devs already have Series X dev kits in their workplaces, though.
 

geordiemp

Member
The 2080 S to 2080 Ti is scaling exactly as you'd expect based on their teraflops. You proved that with what you posted: the 2080 S has ~17% fewer teraflops than the 2080 Ti, and the Ti's performance is 20% faster at 4K, 17% faster at 1440p. If higher clock speeds allowed a lower-TF card to catch up some ground you'd see it here. This is one of the most ridiculous things I've seen mentioned over the last several weeks, and honestly, I know you know better.


No, not at all. You're almost never hitting 100% on a CPU when gaming; however, your GPU will almost always be pegged at 100%. GPU workloads are completely different from CPU workloads and scale well with more cores due to their parallel nature.

2080ti has a 616GB/s memory bandwidth as well for all the memory, not shared with CPU or other sound / assets. 2080 super is bandwidth limited.

Slower access RAM for anything over 10 GB is also not normal and a bad idea.

I know you know better.

So will the extra TF mean more performance...?

You have to consider more than just 1 number.

It's not so clear, is it? The consoles will both be equally memory-bandwidth bound for large-asset games and will be similar, IMO...

But believe what you want.
 
Last edited:

rnlval

Member
That doesn't answer the question about that channel's sources, but anyway: if RDNA2 is 50% more efficient than RDNA, it should operate like an 18 TF RDNA card, hence proving the inefficiency of the system.

Then again, as I said, not a single game has finished development (Gears was a demo) on either platform, and the performance shouldn't be that representative.
At least it's not Forza Motorsport 7.

~50% perf/watt improvement example

Fury X's 8.6 TFLOPS vs R9-290X's 5.6 TFLOPS.
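A rough sanity check of that example (my own sketch; the TFLOPS figures come from the post above, while the ~275 W and ~290 W board powers are my assumption based on the commonly cited reference specs):

# Rough perf/watt comparison under assumed board powers.
fury_x_tflops, fury_x_watts = 8.6, 275.0    # Fiji: TFLOPS from the post, assumed TDP
r9_290x_tflops, r9_290x_watts = 5.6, 290.0  # Hawaii: TFLOPS from the post, assumed TDP

tflops_ratio = fury_x_tflops / r9_290x_tflops
perf_per_watt_ratio = (fury_x_tflops / fury_x_watts) / (r9_290x_tflops / r9_290x_watts)

print(f"TFLOPS uplift:    {tflops_ratio:.2f}x")        # ~1.54x
print(f"perf/watt uplift: {perf_per_watt_ratio:.2f}x")  # ~1.6x at broadly similar board power

So at roughly comparable board power the generational TFLOPS jump lands in the same ~50%+ perf/watt ballpark being discussed.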
 
Last edited:

ethomaz

Banned
The 2080 S to 2080 Ti is scaling exactly as you'd expect based on their teraflops. You proved that with what you posted: the 2080 S has ~17% fewer teraflops than the 2080 Ti, and the Ti's performance is 20% faster at 4K, 17% faster at 1440p. If higher clock speeds allowed a lower-TF card to catch up some ground you'd see it here. This is one of the most ridiculous things I've seen mentioned over the last several weeks, and honestly, I know you know better.
Another good example.

3072 SPs to 4352 SPs = ~42% increase in SPs
3072 @ 1815 MHz to 4352 @ 1545 MHz = ~20% increase once you account for clocks

Performance difference? 11%, 16% and 18%... the performance gap gets closer to that SP × clock difference as VRAM bandwidth becomes more of a bottleneck for the RTX 2080 Super... same scenario as the RTX 2070 Super.
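For anyone who wants to check the arithmetic, a minimal sketch using the SP counts and clocks quoted above (theoretical FP32 TFLOPS is just shaders × clock × 2 FMA ops):

# Theoretical FP32 throughput scaling: RTX 2080 Super vs RTX 2080 Ti, figures from the post.
def tflops(shaders: int, clock_mhz: float) -> float:
    # 2 FP32 ops (one FMA) per shader per clock
    return shaders * clock_mhz * 1e6 * 2 / 1e12

rtx_2080s = tflops(3072, 1815)   # ~11.2 TF
rtx_2080ti = tflops(4352, 1545)  # ~13.4 TF

print(f"SP count increase:        {4352 / 3072 - 1:.0%}")            # ~42%
print(f"SP x clock (TF) increase: {rtx_2080ti / rtx_2080s - 1:.0%}")  # ~21%
# The measured 11% / 16% / 18% gaps (1080p / 1440p / 4K) only approach that ~21%
# compute ratio at 4K, where memory bandwidth matters most.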

relative-performance_1920-1080.png

relative-performance_2560-1440.png

relative-performance_3840-2160.png


Let's see if we can find a bench with the same cards but with the memory overclocked on the RTX 2080 Super.
BTW, nVidia underclocked the VRAM in the 2080 Super so it wouldn't cannibalize the 2080 Ti.

 
Last edited:

A.Romero

Member
I really don't get the controversy. Each gen, manufacturers choose different approaches and they yield different results. I know there aren't many things to discuss right now, but it's impossible to know who is right or even which platform will take advantage of its tech the best.

We will see pretty soon what is PR fluff (Cell, power of the cloud, secret sauce) and what turns out to be a better practical approach.
 

Panajev2001a

GAF's Pleasant Genius
This is fascinating. There used to be a time when technical people had to say where and under what conditions an architecture would be bottlenecked, not just wave bottlenecks around like a scarecrow to downplay the system more powerful than the one they personally prefer.

I am not downplaying another system, you may want to stop projecting there ;).
 
Look at how easily sheep are buying Sony's PR blitz. Comparing the Series X to the PS3? The Cell was a nightmare for developers to work with; the Series X is a fucking slightly specialized PC. There is NO reason to believe the Series X is substantially harder to make games on. The PS5 might have some tricks up its sleeve in terms of dev tools, but that will be irrelevant two years in, once devs are familiar with both consoles.
Look at how easily my words get "fucking" misinterpreted. IDK where you got me saying that the Series X will be hard to develop for; you pulled that from the thoughts in your head. You need to chill, man, and stop nitpicking comments.

I said the numbers never really mattered; in the end, both systems had great-looking exclusives. The only thing you took from my comment is that the Series X is going to be hard to develop for like the PS3... SMH. Chill bro, chill.
 

Radical_3d

Member
Let's not get ahead of ourselves. I was speaking of the article's author, not you.
But the author didn't compare it to the skills needed to master the PS3. He only implied that it'll take more time to extract all the juice. If the two memory pools can't be accessed at the same time, I see that as a huge constraint already. Sony and Microsoft have both cheaped out in the RAM department, especially Xbox, since it has more power to feed than the PS5.
 
But the author didn't compare it to the skills needed to master the PS3. He only implied that it'll take more time to extract all the juice. If the two memory pools can't be accessed at the same time, I see that as a huge constraint already. Sony and Microsoft have both cheaped out in the RAM department, especially Xbox, since it has more power to feed than the PS5.

It's not two memory pools though. It's one pool, with different sized chips on different channels.
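A quick sketch of how the two bandwidth figures fall out of the widely reported Series X configuration (my assumption: ten 14 Gbps GDDR6 chips on a 320-bit bus, six 2 GB and four 1 GB; treat the layout as reported spec, not something confirmed in this thread):

# Back-of-the-envelope: why one pool of mixed-density chips yields 560 and 336 GB/s.
GBPS_PER_PIN = 14          # assumed GDDR6 data rate (Gbit/s per pin)
BITS_PER_CHIP = 32         # each GDDR6 chip provides a 32-bit channel
chips = [2] * 6 + [1] * 4  # assumed mix: six 2 GB chips + four 1 GB chips (GB each)

total_gb = sum(chips)                                          # 16 GB total
full_bus_gbs = len(chips) * BITS_PER_CHIP * GBPS_PER_PIN / 8   # striped across all 10 channels
upper_pool_gbs = 6 * BITS_PER_CHIP * GBPS_PER_PIN / 8          # data above 1 GB per chip can
                                                               # only live on the six 2 GB chips
print(total_gb, "GB total")
print(f"{full_bus_gbs:.0f} GB/s when striped across all ten chips (the 'fast' 10 GB)")
print(f"{upper_pool_gbs:.0f} GB/s for the remaining 6 GB (six channels only)")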
 


2080ti has a 616GB/s memory bandwidth as well for all the memory, not shared with CPU or other sound / assets. 2080 super is bandwidth limited.

Slower access RAM for anything over 10 GB is also not normal and a bad idea.

I know you know better.

So will the extra TF mean more performance...?

You have to consider more than just 1 number.

It's not so clear, is it? The consoles will both be equally memory-bandwidth bound for large-asset games and will be similar, IMO...

But believe what you want.

Without being a game developer or programmer, which I assume none of us are, are you capable of changing which data is stored inside RAM? Or is data that's already in RAM forever stuck there with no fast means of changing or swapping old data out with new?


Simple answer to this HAS to be yes you can swap out data and change where it's located in RAM, and fairly quickly at that! Why are people just assuming the data that actually needs to be in the fast GPU optimal memory can't quickly be moved there for processing at full speed bandwidth of 560GB/s? It's like people expect developers will just sit on their hands and not do what's needed when it's needed when all the capability to do it, and do it quickly, is right there at their fingertips. The asymmetrical memory setup is a non-issue for maintaining full bandwidth performance on Xbox Series X games.

Next question for us non programmers and game developers. Is 100% of what's in RAM on consoles tied to textures, models etc, or does some of that RAM need to also be reserved for other crucial aspects to making your game work, such as script data, executable data, stack data, audio data, or important CPU related functions?

A videogame isn't just static graphics and/or models. There are core fundamentals that require memory also such as for animations, mission design and enemy encounters, information crucial to how those enemies behave and react to what the player is doing, specific scripted events, how weapons, items or attacking works (how many bullets, rate of fire, damage done) how the player or weapons they use interacts with the world and objects around them, an inventory system, stat or upgrade systems for those games that use them, how vehicles or NPC characters can be interacted with or controlled, characters customization, craftable items, I can go on. There are things that do require RAM, but may not require the fastest possible access to RAM. For example, maybe the player has a weapon or power that when they use it, it causes confusion amongst the enemies, or it temporarily blinds them, allowing a window of opportunity for the player to gain a much needed advantage. That, too, as simple as it sounds, needs RAM, it just may not require as much as textures and models at the same high speed, but a game is made up of all these basic functions and rules and mechanics, and they require memory just the same.

When you have some basic understanding of all this, you also come to understand that concerns about memory amount and memory performance on Xbox Series X are largely overblown. You can expect the parts of any game that could benefit from 560 GB/s of memory bandwidth to actually get it, and the parts that don't need it as much to get the slower side. Then factor in custom solutions such as Sampler Feedback Streaming, which is tailor-made for using the available RAM and SSD performance more efficiently: Microsoft says Sampler Feedback Streaming contributes an effective 2-3x multiplier on the amount of physical memory, and the same on effective I/O performance.
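As a purely hypothetical illustration of that argument (not how any actual console SDK exposes memory; the sizes and categories below are made up), sorting bandwidth-hungry data into the fast region and everything else into the slower one might look like this:

# Hypothetical sketch: placing allocations into the GPU-optimal vs standard regions.
FAST_GB, SLOW_GB = 10.0, 6.0   # Series X split as reported: 560 GB/s vs 336 GB/s regions

allocations = [   # (name, size in GB, needs peak GPU bandwidth?) -- invented numbers
    ("textures + geometry", 7.5, True),
    ("render targets",      1.5, True),
    ("audio banks",         1.0, False),
    ("game/script state",   1.5, False),
    ("CPU working set",     1.0, False),
]

fast_used = sum(size for _, size, hot in allocations if hot)
slow_used = sum(size for _, size, hot in allocations if not hot)

assert fast_used <= FAST_GB and slow_used <= SLOW_GB, "rebalance or stream assets"
print(f"fast region: {fast_used} / {FAST_GB} GB, slow region: {slow_used} / {SLOW_GB} GB")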
 
Last edited:
Not saying it is chock-full of bottlenecks; neither was the PS2, and back then there was no problem questioning anything and everything about its theoretical peaks, that is all.
Also things can be comparable despite not being the same or at the same level of intensity.

Not me, I never worked for Lionhead nor the other greats in the Guildford mega hub :).

Ahh, I know who I confused you with. I confused you with fran from Beyond3D. I don't know why I always thought of you two as co-workers.
 

geordiemp

Member
Without being a game developer or programmer, which I assume none of us are, are you capable of changing which data is stored inside RAM? Or is data that's already in RAM forever stuck there with no fast means of changing or swapping old data out with new?


Simple answer to this HAS to be yes you can swap out data and change where it's located in RAM, and fairly quickly at that! Why are people just assuming the data that actually needs to be in the fast GPU optimal memory can't quickly be moved there for processing at full speed bandwidth of 560GB/s? It's like people expect developers will just sit on their hands and not do what's needed when it's needed when all the capability to do it, and do it quickly, is right there at their fingertips. The asymmetrical memory setup is a non-issue for maintaining full bandwidth performance on Xbox Series X games.

Next question for us non programmers and game developers. Is 100% of what's in RAM on consoles tied to textures, models etc, or does some of that RAM need to also be reserved for other crucial aspects to making your game work, such as script data, executable data, stack data, audio data, or important CPU related functions?

A videogame isn't just static graphics and/or models. There are core fundamentals that require memory also such as for animations, mission design and enemy encounters, information crucial to how those enemies behave and react to what the player is doing, specific scripted events, how weapons, items or attacking works (how many bullets, rate of fire, damage done) how the player or weapons they use interacts with the world and objects around them, an inventory system, stat or upgrade systems for those games that use them, how vehicles or NPC characters can be interacted with or controlled, characters customization, craftable items, I can go on. There are things that do require RAM, but may not require the fastest possible access to RAM. For example, maybe the player has a weapon or power that when they use it, it causes confusion amongst the enemies, or it temporarily blinds them, allowing a window of opportunity for the player to gain a much needed advantage. That, too, as simple as it sounds, needs RAM, it just may not require as much as textures and models at the same high speed, but a game is made up of all these basic functions and rules and mechanics, and they require memory just the same.

When you have some basic understanding of all this, you also come to understand that concerns about memory amount and memory performance on Xbox Series X are largely overblown. You can expect the parts of any game that could benefit from 560 GB/s of memory bandwidth to actually get it, and the parts that don't need it as much to get the slower side. Then factor in custom solutions such as Sampler Feedback Streaming, which is tailor-made for using the available RAM and SSD performance more efficiently: Microsoft says Sampler Feedback Streaming contributes an effective 2-3x multiplier on the amount of physical memory, and the same on effective I/O performance.

Hold your fanboy horses, the original one is back......

I am just saying both consoles will be similarly bandwidth limited compared to a 2080 Ti... which they BOTH are... and I was responding to a poster comparing the TF of nVidia cards to try to show the PS5 will be poor, or whatever crap he was spouting.

My point is that anyone expecting a massive difference, with one console blowing the other away, is probably wrong; they will not be too different. Yes, 560 >> 448, but 448 >> 336. Neither is an IDEAL bandwidth setup, and let's hope they both have some tricks..

If you want to read a good explanation, Lady Gaia does a nice one:

And trying to convince me that 560 here and 336 there is a good solution is just funny; it's not much better than a 448 average, but believe what you want.
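For what it's worth, a naive capacity-weighted average of the two regions (a rough sketch; the real effective figure depends entirely on which data gets touched how often) comes out a bit above 448:

# Naive capacity-weighted average of the Series X memory regions vs the PS5's uniform pool.
xsx_regions = [(10, 560), (6, 336)]   # (GB, GB/s), figures from this thread
ps5_pool = (16, 448)

weighted = sum(gb * bw for gb, bw in xsx_regions) / sum(gb for gb, _ in xsx_regions)
print(f"XSX capacity-weighted average: {weighted:.0f} GB/s")   # ~476 GB/s
print(f"PS5 uniform pool:              {ps5_pool[1]} GB/s")
# The actual effective bandwidth depends on the access mix: GPU-heavy data kept in the
# 10 GB region sees closer to 560, CPU/audio traffic in the other 6 GB sees 336.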

KzQH8Wc.png


The difference between me and you is that I recognise both consoles are equally good / equally deficient; you only see one side (a fanboy's blinkered view). Read what you typed.
 
Last edited:

SonGoku

Member
And trying to convince me that 560 here and 336 there is a good solution is just funny; it's not much better than a 448 average, but believe what you want.
I'd just like to add to your comment so there isn't any confusion: the takeaway here is that the XSX needs the extra bandwidth to feed its more powerful GPU; that's why they went for this weird compromise instead of 16 GB @ 448 GB/s.
Both consoles have roughly the same amount of bandwidth available proportionate to their GPU's peak performance, or, put in more layman's terms: similar GB/s per TF.

As far as bandwidth goes, both consoles are equally good or equally bottlenecked, depending on how you look at it (glass half full/empty).
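Putting rough numbers on that "GB/s per TF" framing (a sketch using the commonly quoted peak figures of 12.15 TF / 560 GB/s for XSX and 10.28 TF / 448 GB/s for PS5; those exact values are my assumption, not from this thread, and real games won't map onto them cleanly):

# Bandwidth per teraflop, using commonly quoted peak specs.
consoles = {
    "XSX": {"tflops": 12.15, "gbps": 560},   # fast 10 GB region
    "PS5": {"tflops": 10.28, "gbps": 448},
}
for name, spec in consoles.items():
    print(f"{name}: {spec['gbps'] / spec['tflops']:.1f} GB/s per TF")
# XSX ~46.1, PS5 ~43.6 -- roughly the same ballpark for both machines.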
 
Last edited:

Hobbygaming

has been asked to post in 'Grounded' mode.
Remember, the term "secret sauce" was coined by someone unironically defending the Xbox One at the beginning of this gen.

The most amusing thing here is how so many of the same individuals who believed in "secret sauces", "power of the cloud" and "hidden dGPUs" back then are now trying to project their own stupidity onto others.

It's that recurrent theme of their petty and vengeful "role reversal" fantasy.
There is no truer comment on any forum than this comment 👍
 

rnlval

Member


2080ti has a 616GB/s memory bandwidth as well for all the memory, not shared with CPU or other sound / assets. 2080 super is bandwidth limited.

Slower access RAM for anything over 10 GB is also not normal and a bad idea.

I know you know better.

So will the extra TF mean more performance...?

You have to consider more than just 1 number.

It's not so clear, is it? The consoles will both be equally memory-bandwidth bound for large-asset games and will be similar, IMO...

But believe what you want.

YhkHa7c.png


The Sapphire RX 5600 XT Pulse (36 CUs, 7.89 TFLOPS average) with 336 GB/s has an 8% penalty relative to the RX 5700 (36 CUs, 7.7 TFLOPS) with 448 GB/s.

For RDNA v1, 336 GB/s of memory bandwidth carries only a minor performance penalty, which is applicable to the XSX and PS5.
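To spell that comparison out (a small sketch using the figures from the post; the 8% penalty is the average deficit in the relative-performance charts being cited):

# Bandwidth cut vs observed performance penalty, 36-CU Navi 10 parts, figures from the post.
rx5600xt = {"tflops": 7.89, "gbps": 336}
rx5700   = {"tflops": 7.70, "gbps": 448}

bw_deficit   = 1 - rx5600xt["gbps"] / rx5700["gbps"]      # ~25% less bandwidth
tf_surplus   = rx5600xt["tflops"] / rx5700["tflops"] - 1  # ~2.5% more compute
perf_penalty = 0.08                                       # observed in the cited charts

print(f"bandwidth deficit: {bw_deficit:.0%}, compute surplus: {tf_surplus:.1%}, "
      f"observed penalty: {perf_penalty:.0%}")
# i.e. a ~25% bandwidth cut at this performance tier costs well under 25% in frame rate.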


The difference between PS5 and XSX is like RX 5700XT OC/GTX 1080 Ti against RTX 2080 respectively.
 
Last edited:

ethomaz

Banned
I'd just like to add to your comment so there isn't any confusion: the takeaway here is that the XSX needs the extra bandwidth to feed its more powerful GPU; that's why they went for this weird compromise instead of 16 GB @ 448 GB/s.
Both consoles have roughly the same amount of bandwidth available proportionate to their GPU's peak performance, or, put in more layman's terms: similar GB/s per TF.

As far as bandwidth goes, both consoles are equally good or equally bottlenecked, depending on how you look at it (glass half full/empty).
To add: that bandwidth is what some of the PC "master race" keep saying will be a bottleneck for the GPU... and that is what doesn't fit either system.

They just forget that both systems manage RAM differently from a PC and don't have a lot of the overhead and duplication that demands stronger bandwidth on PC compared with consoles.

I believe that in terms of RAM both systems are equal until games start to use more than 10 GB of VRAM.
 
Don't get me wrong, Sony's 1st-party studios will undoubtedly pull off amazing-looking games, they already do. They will be making them for 5-7 years, delaying them 3-4 times, ND will crunch their employees to death, but eventually we will get games that look beyond what even a 10k PC has, so what? I'm too old to get excited about graphics, especially with real-time RT finally becoming a thing. For me the question is: will their studios finally break the 30 FPS barrier? Will they put more focus on actually fun/enjoyable gameplay, rather than slow/sloppy/sluggish controls? Will they have any MP proposition that will allow me to play their games with my friends for countless hours, rather than finishing the game over the weekend and that's it? The graphics will be forgotten really fast; usually it's half a year and a new technically impressive game shows up. No one remembers TO1887, KZ:SF, DC, Ryse, Crysis 2/3 etc. after all the initial hype they got because of the graphics, and there are solid reasons behind it. Even UC4 seems to be forgotten, and it was supposed to be the PS4's messiah, but now all many people say about it is how bad it is compared to previous installments, its SJW agenda, and whatnot; no one cares about its graphics anymore. Of all the studios they have, only Insomniac has a great track record of making games that are fun to play, like REALLY fun to play.




But that's a professional application, where even a mere 10-15 FPS (as shown in the video) is considered real-time, as opposed to a few minutes of rendering; will it be applicable to 60 FPS games? That's the real question.

Remember, the unoptimized implementation rendered at 92 fps on the fly, and they scrubbed a 1-terabyte 8K video from beginning to end in real time. That was in 2016... If MS targeted this technology for the XSX then it has a lot more engineering and resources behind it. We will see, because no one outside of MS's developer pools really has a clue. None of us, and no one who has spoken out, is a hands-on dev.
 

rnlval

Member
Another good example.

3072 SPs to 4352 SPs = ~42% increase in SPs
3072 @ 1815 MHz to 4352 @ 1545 MHz = ~20% increase once you account for clocks

Performance difference? 11%, 16% and 18%... the performance gap gets closer to that SP × clock difference as VRAM bandwidth becomes more of a bottleneck for the RTX 2080 Super... same scenario as the RTX 2070 Super.

relative-performance_1920-1080.png

relative-performance_2560-1440.png

relative-performance_3840-2160.png


Let's see if we can find a bench with the same cards but with the memory overclocked on the RTX 2080 Super.
BTW, nVidia underclocked the VRAM in the 2080 Super so it wouldn't cannibalize the 2080 Ti.


Why use older graphs when RX 5600 XT-era graphs show the RX 5700 XT nearing the GTX 1080 Ti?


relative-performance_3840-2160.png
 

rnlval

Member
To add: that bandwidth is what some of the PC "master race" keep saying will be a bottleneck for the GPU... and that is what doesn't fit either system.

They just forget that both systems manage RAM differently from a PC and don't have a lot of the overhead and duplication that demands stronger bandwidth on PC compared with consoles.

I believe that in terms of RAM both systems are equal until games start to use more than 10 GB of VRAM.
Not correct, according to EA DICE's lecture on PC DirectX 12's aliasing memory layout.

Start at slide 52 vs slide 53

framegraph-extensible-rendering-architecture-in-frostbite-52-638.jpg


VS

framegraph-extensible-rendering-architecture-in-frostbite-53-638.jpg



framegraph-extensible-rendering-architecture-in-frostbite-54-638.jpg


framegraph-extensible-rendering-architecture-in-frostbite-55-638.jpg
 

ethomaz

Banned
Not correct, according to EA DICE's lecture on PC DirectX 12's aliasing memory layout.

Start at slide 52 vs slide 53

framegraph-extensible-rendering-architecture-in-frostbite-52-638.jpg


VS

framegraph-extensible-rendering-architecture-in-frostbite-53-638.jpg



framegraph-extensible-rendering-architecture-in-frostbite-54-638.jpg


framegraph-extensible-rendering-architecture-in-frostbite-55-638.jpg
I'm not sure it shows an advantage for the consoles at a lower resolution.

And I said memory bandwidth, not memory allocation.

I believe both bandwidths are enough for the power of their GPUs, even if on PC you need more for the same level.
 
Last edited:

psorcerer

Banned
Shader cores in a GPU can't be compared to CPU cores. Developers have to optimize their games in order to use more CPU cores, while a GPU distributes its workload evenly by itself. The 2080 Ti is an extremely big chip, yet GPU usage is almost always at 99% at higher resolutions (GPU usage only drops below 99% in CPU-bottlenecked scenarios). On the other hand, CPU usage is almost never 100% in real games.

100% usage you see is a sampled usage.
I would be surprised if actual ALU usage of the GPU is ever >30-40% on PC.
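A toy illustration of that distinction (entirely made-up numbers, just to show why a "100% GPU usage" counter and actual ALU occupancy measure different things):

# Toy model: the "GPU usage" counter samples whether *any* work was resident in an
# interval, while ALU utilization is the fraction of ALU slots doing useful math.
import random

random.seed(0)
intervals = 1000
busy_samples = 0
alu_active = 0.0

for _ in range(intervals):
    resident_work = random.random() < 0.99    # almost every sample finds queued work
    alu_occupancy = random.uniform(0.2, 0.5)  # but ALUs may be stalled on memory, barriers, etc.
    busy_samples += resident_work
    alu_active += alu_occupancy if resident_work else 0.0

print(f"reported 'GPU usage':   {busy_samples / intervals:.0%}")  # ~99%
print(f"actual ALU utilization: {alu_active / intervals:.0%}")    # ~35%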
 

ethomaz

Banned
100% usage you see is a sampled usage.
I would be surprised if actual ALU usage of the GPU is ever >30-40% on PC.
That was why AMD was smart to add units that schedule compute tasks onto those unused CUs simultaneously with the render tasks.
That is the idea behind the famous asynchronous compute.

There is a slide from AMD that explains how hard it is to keep the CUs busy.

Keeping the SIMD busy: GCN vs. RDNA

▪ 2 CUs require 2*4*64 = 512 threads to be able to reach 100% ALU utilization

▪ WGP requires 4*32 = 128 threads to be able to reach 100% ALU utilization

⁃ Only achieved with high instruction level parallelism (ILP)

⁃ Graphics workloads often have 3 independent streams (RGB / XYZ)

⁃ 256 threads / WGP often reach >90% ALU utilization in practice

▪ Additional threads are needed on both GCN and RDNA to hide memory latency

⁃ Unless you can fill the wait with ALU (this is extremely rare)

⁃ # of threads required for memory latency hiding has reduced as well, but not as much

▪ Fewer threads required overall to keep the machine busy

⁃ Utilization ramps up more quickly after barriers

⁃ High VGPR counts hurt slightly less

▪ Call to action:

⁃ Keep ILP in mind when writing shader code


In easy terms.

To keep the ALUs busy (>90% utilization).

18 WGP (36 CUs) you need 4608 threads.
26 WGP (52 CUs) you need 6656 threads.

You need a high level of parallelism in the shader code from the developer.
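A quick sketch of where those thread counts come from, using the 256-threads-per-WGP rule of thumb from the AMD slide above and the old wave64-per-SIMD math for GCN:

# Threads needed to keep the ALUs busy, per the AMD RDNA presentation quoted above.
THREADS_PER_WGP_RDNA = 256        # "256 threads / WGP often reach >90% ALU utilization"
THREADS_PER_CU_GCN   = 4 * 64     # 4 SIMDs x wave64 = 256 threads per CU for 100%

for wgps in (18, 26):             # 36-CU and 52-CU configurations
    print(f"{wgps} WGPs ({wgps * 2} CUs): {wgps * THREADS_PER_WGP_RDNA} threads in flight")
# 18 WGPs -> 4608 threads, 26 WGPs -> 6656 threads, matching the numbers above.
print(f"GCN, 2 CUs at 100% ALU utilization: {2 * THREADS_PER_CU_GCN} threads")  # 512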

But hey, AMD said something that agrees with the Crytek guy and Cerny, and people will say I'm lying lol
 
Last edited:

rnlval

Member
I'm not sure it shows an advantage for the consoles at a lower resolution.

And I said memory bandwidth, not memory allocation.

I believe both bandwidths are enough for the power of their GPUs, even if on PC you need more for the same level.
EA DICE's lecture is an apples-to-apples API comparison: DX11 vs PC DX12 vs XBO vs PS4.

Unlike Crytek's IP, EA Dice's Battlefield V DX12 runs well on AMD GPUs
 

rnlval

Member
That was why AMD was smart to add units that schedule compute tasks onto those unused CUs simultaneously with the render tasks.
That is the idea behind the famous asynchronous compute.

There is a slide from AMD that explain how hard is to maintain the CUs busy.




In easy terms.

To keep the ALUs busy (>90% utilization).

18 WGP (36 CUs) you need 4608 threads.
26 WGP (52 CUs) you need 6656 threads.

You need a high level of parallelism in the shader code from the developer.

But hey, AMD said something that agrees with the Crytek guy, and people will say I'm lying lol
Not a problem when the target resolution is 4K with attempts for running decent frame rates.

performance-3840-2160.png


Doom Eternal's optimizations are game console like.

Turing and RDNA are flexing heavy async compute usage with lower latency TFLOPS strength.

Note that applying the XSX GPU's extra 25% TFLOPS over the RX 5700 XT's Doom result still lands at RTX 2080 / RTX 2080 Super level. The PS5 is at a similar level to the RX 5700 XT (9.66 TF average), which is close to the RTX 2070 Super.

Even scaling the 5600 XT to the PS5's 10.275 TFLOPS, it still lands at RX 5700 XT level.
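The scaling being applied there, spelled out (a naive TFLOPS-proportional extrapolation using the 9.66 TF average figure from the post and commonly quoted console peaks; it ignores bandwidth, clock behaviour and drivers, so treat it as a ceiling, not a prediction):

# Naive TFLOPS-proportional scaling from desktop Navi 10 results.
rx_5700xt_tflops = 9.66            # average-clock figure quoted above
xsx_tflops, ps5_tflops = 12.15, 10.28  # assumed peak figures, not from this thread

print(f"XSX vs RX 5700 XT: +{xsx_tflops / rx_5700xt_tflops - 1:.0%} compute")  # ~26%
print(f"PS5 vs RX 5700 XT: +{ps5_tflops / rx_5700xt_tflops - 1:.0%} compute")  # ~6%
# Under this crude scaling the XSX lands around RTX 2080 / 2080 Super territory in the
# chart above, while the PS5 stays close to the RX 5700 XT itself.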
 

rnlval

Member
Never underestimate The Cerny.
My year-2013 R9-290X and Intel Core i7-2600 obliterated "The Cerny's supercharged PC", aka the PS4.

My old R9-290X continues to obliterate The Cerny's PS4 Pro.

Big NAVI comes around October 2020 which is another October 2013 R9-290X re-run. Red October = AMD's flagship arrives.
 
Last edited:

ethomaz

Banned
Not a problem when the target resolution is 4K with attempts for running decent frame rates.

performance-3840-2160.png


Doom Eternal's optimizations are game console like.

Turing and RDNA are flexing heavy async compute usage with lower latency TFLOPS strength.

Note that applying the XSX GPU's extra 25% TFLOPS over the RX 5700 XT's Doom result still lands at RTX 2080 / RTX 2080 Super level. The PS5 is at a similar level to the RX 5700 XT (9.66 TF average), which is close to the RTX 2070 Super.

Even scaling the 5600 XT to the PS5's 10.275 TFLOPS, it still lands at RX 5700 XT level.
Do you really have anything useful to add?
Because you keep posting the same bull with these graphs that don't show what you claim they show :messenger_tears_of_joy:

You seem to be trying to fight AMD, Cerny and the Crytek guy.
If I posted a Carmack comment you would try to deny him with an unrelated chart lol

What I posted and you quoted comes from AMD themselves.

If I had to guess, Doom in your chart doesn't reach 90% or more CU utilization even with async compute.
 
Last edited:

rnlval

Member
Do you really have anything useful to add?
Because you keep posting the same bull with these graphs that don't show what you claim they show :messenger_tears_of_joy:

You seem to be trying to fight AMD, Cerny and the Crytek guy.
If I posted a Carmack comment you would try to deny him with an unrelated chart lol

What I posted and you quoted comes from AMD themselves.

If I had to guess, Doom in your chart doesn't reach 90% or more CU utilization even with async compute.
Hypocrite. You're in conflict with Lisa Su's Big NAVI's arrival.
 
Last edited:

Vroadstar

Member
My year-2013 R9-290X and Intel Core i7-2600 obliterated "The Cerny's supercharged PC", aka the PS4.

My old R9-290X continues to obliterate The Cerny's PS4 Pro.

Big NAVI comes around October 2020 which is another October 2013 R9-290X re-run. Red October = AMD's flagship arrives.

I'm glad to hear your PC obliterated both PS4 and Pro, thanks!
 
Last edited:

ethomaz

Banned
Hypocrite. You're in conflict with Lisa Su's Big NAVI's arrival.
Yeap, the AMD RDNA whitepaper is complete bullshit.

:messenger_tears_of_joy:

BTW, a hint: Lisa Su agrees with what is written in that document.
If you think about it a little, you will understand why Big Navi, to this day, still hasn't left the paper stage... they are really trying to get these issues under control when you go big.
 
Last edited:

SonGoku

Member
That was why AMD was smart to add units that schedule compute tasks onto those unused CUs simultaneously with the render tasks.
That is the idea behind the famous asynchronous compute.
Didn't nVidia also improve asynchronous compute with Turing, since they were lagging behind in this area with Maxwell?
Cerny also mentioned that 40% VALU utilization was already considered pretty good.
 

ethomaz

Banned
Didn't nVidia also improve asynchronous compute with Turing, since they were lagging behind in this area with Maxwell?
Cerny also mentioned that 40% VALU utilization was already considered pretty good.
Yeap, but it lags behind AMD's solution, which is truly simultaneous (it executes both at the same time, in the same wave).
nVidia's solution uses a queue running alongside the render tasks... it runs render in one wave and compute in another, of course using a priority level.

What Turing added is a feature where a critical compute task can jump the queue and be executed first, so the latency-critical compute work isn't affected.

nVidia GPUs are fast enough to do that.
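A toy model of why filling idle ALU slots matters, regardless of whether the scheduling is AMD-style simultaneous execution or an nVidia-style prioritized queue (the utilization numbers are made up, purely illustrative):

# Toy model: async compute fills ALU time the graphics work leaves idle each frame.
frame_ms = 16.7
graphics_alu_utilization = 0.55      # assume graphics alone keeps ~55% of ALU slots busy
compute_work_ms_equivalent = 5.0     # async workload, expressed as ms of full-ALU work

idle_alu_ms = frame_ms * (1 - graphics_alu_utilization)
overlapped_ms = min(compute_work_ms_equivalent, idle_alu_ms)
serial_total = frame_ms + compute_work_ms_equivalent
async_total = frame_ms + (compute_work_ms_equivalent - overlapped_ms)

print(f"serial: {serial_total:.1f} ms/frame, with async compute: {async_total:.1f} ms/frame")
# 21.7 ms vs 16.7 ms here: the compute fits entirely into otherwise-idle ALU time.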
 
Last edited: