Now that people are bringing Intel into the equation, I was curious about the state of async compute on Intel and Nvidia, partly in relation to your response to SlimySnake - which in my opinion is nonsense, because it suggests UE5's Nanite and Lumen are broken or incomplete simply because Nvidia hardware is struggling to show gains.

It turns out Intel have backed a mix of the AMD and Nvidia strategies. On the AMD side, they've taken the Rapid Packed Math arrangement of getting double the flops when using FP16 half floats - something Nvidia hardware is deficient at. On the Nvidia side, they've gone with dedicated hardware for AI upscaling and dedicated RT accelerators, and from what I've read that means they share the same deficient async compute "lite" - as it has been dubbed - as Nvidia, because the async engines used for copy/compute work are essentially re-tasked RT cores. Scheduling parallel work that way can cripple performance if it's used as liberally as AMD's async compute, which is designed for heavy async use with dedicated ACEs, and on Nvidia and Intel the async work costs lost RT processing on top.
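To pin down what "async compute" actually means at the API level, here's a minimal Vulkan sketch of my own (nothing vendor-specific or from any engine): the renderer looks for a queue family that exposes compute but not graphics, and submits work there so it can overlap the graphics queue. Whether that overlap is backed by dedicated hardware like AMD's ACEs, or by the shared/re-tasked units described above, is exactly where the vendors differ.

```cpp
#include <vulkan/vulkan.h>
#include <cstdint>
#include <vector>

// Find a queue family that supports compute but not graphics. On GPUs with
// dedicated async compute engines this family maps to separate hardware
// queues, so work submitted here can overlap the graphics queue instead of
// being serialised behind it.
int32_t findAsyncComputeQueueFamily(VkPhysicalDevice gpu)
{
    uint32_t count = 0;
    vkGetPhysicalDeviceQueueFamilyProperties(gpu, &count, nullptr);
    std::vector<VkQueueFamilyProperties> families(count);
    vkGetPhysicalDeviceQueueFamilyProperties(gpu, &count, families.data());

    for (uint32_t i = 0; i < count; ++i) {
        const VkQueueFlags flags = families[i].queueFlags;
        if ((flags & VK_QUEUE_COMPUTE_BIT) && !(flags & VK_QUEUE_GRAPHICS_BIT))
            return static_cast<int32_t>(i);  // compute-only family: async candidate
    }
    return -1;  // no compute-only family; compute shares the graphics queue
}
```

The API will happily report a compute-only family on all three vendors; how much genuinely concurrent throughput you get out of it is the hardware question being argued about here.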
Why this is relevant to UE5 Nanite and Lumen performance: from the technical info we've had from Epic about the algorithms and the data, both appear to have been designed to be processing-centric rather than traditionally data-centric. Both algorithms just chunk through the data - treating it all as conceptually the same generic data - in very optimised graphics/compute shaders, still able to exploit the likes of mesh shaders, but without the inefficiency of constant shader and resource binding changes. If UE5 is performance constrained anywhere from 1080p to 4K on high-end Nvidia cards, then it sounds like the rendering is hitting a hard geometry processing limit on the card, and the lack of proper async capability is stopping them pushing beyond that limit with async compute. It also, unfortunately, hits a combined graphics + compute utilisation ceiling on the Nvidia cards at a lower point, because Nvidia's solution, being better designed for data-centric rendering, is more handicapped for generic processing-centric rendering like UE5's - that would be my take.
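As a rough picture of that processing-centric vs data-centric distinction, here's a hand-written Vulkan sketch - not Epic's code, and the DrawItem struct and function names are made up purely for illustration. The first function re-binds state per object; the second binds once and lets a single compute dispatch walk every cluster as generic data.

```cpp
#include <vulkan/vulkan.h>
#include <vector>

// Hypothetical per-object draw data, used only for this illustration.
struct DrawItem {
    VkPipeline      pipeline;
    VkDescriptorSet descriptors;
    uint32_t        indexCount;
};

// Traditional data-centric recording: state is re-bound for every object,
// so the GPU front end keeps switching shaders and resources.
void recordPerObject(VkCommandBuffer cmd, VkPipelineLayout layout,
                     const std::vector<DrawItem>& items)
{
    for (const DrawItem& item : items) {
        vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, item.pipeline);
        vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, layout,
                                0, 1, &item.descriptors, 0, nullptr);
        vkCmdDrawIndexed(cmd, item.indexCount, 1, 0, 0, 0);
    }
}

// Processing-centric recording in the Nanite/Lumen spirit: bind once, then a
// single dispatch chews through every cluster as generic data, with the shader
// indexing into bindless buffers rather than relying on per-draw bindings.
void recordSinglePass(VkCommandBuffer cmd, VkPipeline clusterPipeline,
                      VkPipelineLayout layout, VkDescriptorSet bindlessSet,
                      uint32_t clusterCount)
{
    vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_COMPUTE, clusterPipeline);
    vkCmdBindDescriptorSets(cmd, VK_PIPELINE_BIND_POINT_COMPUTE, layout,
                            0, 1, &bindlessSet, 0, nullptr);
    vkCmdDispatch(cmd, (clusterCount + 63) / 64, 1, 1);  // one workgroup per 64 clusters
}
```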
Just adding the Intel Arc A770's balance of stats into the mix of cards SlimySnake and I were previously looking at - in comparison to the AMD 7900 cards - now gives me an inkling that Nvidia have gone all-in, and possibly wrongly. The lack of proper async, the dependency on FP32 with no option for 2x FP16 instead, and the dedicated DLSS and RT hardware mean that if the performance advantage flips to AMD - which I think it will do heavily with the AMD 8000 series - Nvidia might have a harder way back than Intel. Intel have at least already designed around FP16 Rapid Packed Math, and have still chosen a good balance between half-float rate, pixel rate and texture rate, even if the poor power efficiency looks to be shared with Nvidia because of the dedicated AI upscaling hardware and dedicated RT cores.
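On the FP16 point, a renderer first has to see the capability exposed by the API before it can even consider 2x-rate half-float shader variants. A minimal Vulkan query sketch, again my own illustration: note the feature bit only says FP16 is usable in shaders, not that the hardware actually runs it at double rate - that part is down to the packed-math ALU design discussed above.

```cpp
#include <vulkan/vulkan.h>

// Ask the device whether shaders may use 16-bit float arithmetic at all.
// On hardware with packed-FP16 ALUs (Rapid Packed Math style) this is where
// a renderer would decide to compile FP16 shader variants for the potential
// 2x throughput; if the feature is absent, it stays on FP32.
bool supportsShaderFloat16(VkPhysicalDevice gpu)
{
    VkPhysicalDeviceShaderFloat16Int8Features fp16Features{};
    fp16Features.sType =
        VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SHADER_FLOAT16_INT8_FEATURES;

    VkPhysicalDeviceFeatures2 features2{};
    features2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2;
    features2.pNext = &fp16Features;

    vkGetPhysicalDeviceFeatures2(gpu, &features2);
    return fp16Features.shaderFloat16 == VK_TRUE;
}
```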