Switch 2 CPU bottleneck issues: Digital Foundry

What should it be compared to then? It's closest to the performance of the base PS4. I'm also surprised it's running so well and that CDPR got it running in 7 weeks. But also, I expect the experience to be sub-par.
The port wasn't developed in 7 weeks. How come some people still parrot this? Gene Park said it clearly: "a 7-week-old build", which means it's running a version that was finished 7 weeks ago, which is why CDPR still said in the interview that they are working on optimizing it.
 
AMD's "hyper-threading

AMD didn't have SMT in the days of Bulldozer.
And that would have been completely useless for a CPU as narrow as Jaguar.

Long scheduler? As in, ARM's machinery for instruction scheduling and execution is massive: a 160-entry ROB (re-order buffer). To give a comparison with a desktop CPU, Skylake has 58 entries. ARM is completely overbuilt to avoid stalls.

Regardless of these micro details, we have the IPC of these processors anyway. The A78, for its time (modern ARM cores are always better, of course), was just a tad under Zen 3 IPC but a lot better than Zen 2. What saves the modern consoles is their high clocks. Jaguar's IPC is not even in the conversation. Like I said, it won't compete with the Series S in cases where the CPU is a problem, purely because of raw clocks, but it will never encounter a scenario where Jaguar, even the higher-clocked one in the Pro, would outperform it.

Having a long instruction queue helps reduce pipeline stalls from branching and instruction dependencies.
It does not mitigate contention between the CPU and GPU accessing the same memory pool through the same memory controller.

The A78 is a more advanced CPU, being 8 years newer than Jaguar, but it's not working miracles.
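To make the "instruction dependency" point a bit more concrete, here's a minimal C sketch of my own (not from the thread, and nothing console-specific): the first loop is a serial dependency chain that even a deep re-order buffer can't do much with, while the second gives an out-of-order core four independent chains to overlap.

```c
#include <stdio.h>

#define N 1024

/* Serial dependency chain: every add needs the previous sum, so even a
   very deep re-order buffer has nothing independent to keep in flight. */
static double sum_serial(const double *a, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four independent accumulators: an out-of-order core can overlap several
   additions at once, hiding the latency of each one. */
static double sum_unrolled(const double *a, int n) {
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    int i;
    for (i = 0; i + 3 < n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)   /* leftover elements */
        s0 += a[i];
    return (s0 + s1) + (s2 + s3);
}

int main(void) {
    double a[N];
    for (int i = 0; i < N; i++)
        a[i] = i * 0.5;
    printf("serial:   %f\n", sum_serial(a, N));
    printf("unrolled: %f\n", sum_unrolled(a, N));
    return 0;
}
```

Neither version helps when both CPU and GPU are fighting over the same memory controller, which is the point above.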
 
Yes, it shifted over time. Maxwell continued it, and I think Pascal went dynamic.

They had different priorities at the time. I think Nvidia was more in line with the immediate graphics rendering trends, while AMD had good ideas but at the wrong time.

At the time, AMD was already pushing for low level APIs. Nvidia was way behind and focusing heavily on DX11.
That was a problem for AMD on PC. But on consoles, that advanced scheduler was great.

The scheduler on Nvidia GPUs is still simpler than AMD's. That means it still relies more on the CPU and drivers to organize work before it's delivered to the GPU.

? It's the exact same shader array



Are we talking about different things?

Work waves, or warps, are how the GPU groups threads for the shader units to compute.
A GPU is basically a huge cluster of thousands of cores, each computing threads. Because there is little dependency and branching on GPUs, work can be delivered to the shaders in big chunks.
So the standard on Nvidia became work waves, or warps, of 32.
But there are a few considerations. Workloads don't always fill a wave exactly; they aren't always multiples of 32 or 64. It depends on the workload.
So if a batch is 24 threads, 8 shaders get no work. Or if it's 48, it fills one wave with 32 and another with 16, but leaves 16 lanes empty. Which is bad.
With a work wave of 64 it's even worse, as there is a greater chance of shaders not being used.
A GPU maker could try to make work waves very small, maybe 8 wide. That would give more granularity and greater shader occupancy, but it would also mean spending more transistors on the GPU's front-end.
Async compute tries to find work that can fill these gaps when shaders are left idle.
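To put numbers on the occupancy arithmetic above, here's a small C sketch of my own (the workload sizes are just the examples from the post, plus a hypothetical 8-wide wave for comparison):

```c
#include <stdio.h>

/* For a batch of `threads` work items and a hardware wave width, report
   how many waves get issued and how many SIMD lanes end up doing nothing. */
static void report(int threads, int wave_width) {
    int waves = (threads + wave_width - 1) / wave_width;  /* round up */
    int lanes = waves * wave_width;
    int idle  = lanes - threads;
    printf("%3d threads on %2d-wide waves: %d wave(s), %2d idle lanes (%.0f%% occupancy)\n",
           threads, wave_width, waves, idle, 100.0 * threads / lanes);
}

int main(void) {
    report(24, 32);  /* the 24-thread case from the post: 8 idle lanes */
    report(48, 32);  /* 48 threads: one wave of 32 plus one of 16, 16 idle lanes */
    report(24, 64);  /* the same workloads on 64-wide waves look worse */
    report(48, 64);
    report(24, 8);   /* a hypothetical 8-wide wave wastes nothing here */
    return 0;
}
```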

The push for a 64 work wave in the PS4 probably came from Sony, as they had access to very low level APIs and their SDK could make great use of Async compute.
On PC, with DX11, that was close to impossible.

I was on an R9 280 for the longest time, waiting on that famous DX12 async unlock. Outside of Ashes of the Singularity, which wasn't even a good game, it was pretty barren for a long time. The dev himself actually poured cold water on a lot of the speculation going on back then.

(Screenshot of the dev's post, saying async is a modest perf increase.)

Then the Wolfenstein 2 async patch with a modest 5%, etc. I mean, AMD definitely preferred DX12/Mantle over DX11, but I can't recall feeling like I made the absolute best choice with that card while I waited an eternity for these things to happen. I was waiting on some kind of "OMG, 50+% improvement!" and it never came.

By the time Pascal came out, it all mattered very little.

I saw improvements of up to 25% in games with proper async implementations. The fact that some devs can't be bothered to implement it properly means nothing.
And that is on PC, using MS's crappy DX12 API.
And you can be sure that on a console with proper SDKs and low-level APIs, the PS4 was able to make even better use of it.

I had a Pascal GPU and then a Turing GPU. Pascal sucked for async compute: zero improvement with async. But not as bad as Maxwell, which actually lost performance.
Turing was when Nvidia finally managed to get proper async support.

Oh yeah, for sure. AMD could actually have grabbed the gaming world by the balls: they had the consoles in their pocket, and at their peak they had something like one engineer in 100 or so studios. They could have leveraged Mantle to make console ports heavily favor AMD; at least that's what I thought when I bought my R9, and... they fumbled.

GCN vs, say, a Pascal APU, I would be sweating, but vs Ampere I really don't see an argument that favors AMD here. Hence this whole discussion over the most obvious fucking statement of all time: it's a more modern architecture...

AMD lost the API war, mostly because devs on PC were stuck using DX11.
It also didn't help that this was AMD at its lowest ever, when they had no money and even had to sell off their HQ to finish developing Ryzen.
 
I don't see it based on the raw numbers. I think it's probably "halfway" between the two: "last gen plus", but with a smaller gulf than the "last gen plus" the Switch 1 had relative to the PS3/360.

Are you basing this on what we know about the raw specs? Because the CPU should be somewhere halfway between the two. Eight A78C cores are way better than eight Jaguar cores.

The Switch 2 has more RAM than both the Xbox One and the Series S. Memory bandwidth is more in line with the Xbox One, but that's about it.

GPU raw power, I dunno. A ~3 TFLOP Ampere-based GPU vs a ~1.3 TFLOP GCN 1.0 GPU? The Series S is ~4 TFLOP RDNA2. Seems closer to the Series S to me. Feature-wise, I favour Ampere too, even if you can't compare CUDA cores to SPs 1:1.

I just don't see the "closer to Xbox One" claim based on raw specs. It's somewhere in between the two in terms of raw power, with more modern features, likely better ray tracing than RDNA2, and DLSS to help out in some cases. It's a way better feature set than GCN 1.0; I don't think you can argue that.

It's also got a much smaller power draw, which is what makes this whole "Xbone/PS4 tier" debate super impressive, imo.

It's obviously not as good as current gen, but it's also not just Xbox One/PS4 level either, which is where the incorrect comparisons come from. "Last gen plus", close to halfway to the Series S (which itself is a fair bit weaker than the Series X and PS5).
Keep in mind this is what the PS4 is capable of.

 
DF are doing Nintendo commercials, but they are biased against them? Weird logic.
 
Here is a question: what games will run on that 120 Hz screen at 120 Hz? It won't run most modern games at 60.
Not demanding ones; ports of super old games, some indies. But not gonna lie here, that 120 Hz screen was probably mostly chosen so games can have 40 Hz/fps modes, not full 120 :)
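On the 40 fps point, a quick sketch of my own arithmetic showing why a 120 Hz panel makes 40 fps modes viable while a 60 Hz panel does not (assuming plain v-sync, no VRR):

```c
#include <stdio.h>

/* Show how a target framerate maps onto a panel's refresh cadence.
   Pacing is even only when the refresh rate is a whole multiple of the target. */
static void cadence(int refresh_hz, int target_fps) {
    double refresh_ms = 1000.0 / refresh_hz;
    if (refresh_hz % target_fps == 0) {
        int repeats = refresh_hz / target_fps;
        printf("%3d fps on %3d Hz: each frame held %d refresh(es) = %.1f ms, even pacing\n",
               target_fps, refresh_hz, repeats, repeats * refresh_ms);
    } else {
        printf("%3d fps on %3d Hz: no whole number of refreshes per frame -> judder\n",
               target_fps, refresh_hz);
    }
}

int main(void) {
    cadence(120, 40);  /* 3 refreshes per frame, a clean 25.0 ms cadence */
    cadence(120, 60);  /* 2 refreshes per frame, 16.7 ms */
    cadence( 60, 40);  /* 40 fps on a 60 Hz panel can't pace evenly */
    return 0;
}
```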
 
Eight ARM cores clocked barely over 1 GHz: in layman's terms, it's definitely above the Xbox One/One X/PS4/PS4 Pro CPUs but well below the PS5/PS5 Pro/Xbox Series CPUs, which are downclocked Zen 2 parts with 8 cores/16 threads (still at least 3.5 GHz).

We can even tell easily from the CP2077 ports: on last gen, even now after all the patches the game has had, it only runs at 20-30 fps. On current gen, even the Series S, it has a performance mode that targets 60 fps (of course it's not super stable), and the current-gen consoles can run the more demanding Phantom Liberty expansion, unlike last gen.
The Switch 2 CP2077 port will have 30 fps quality and 40 fps performance modes. Even if we assume the fps will dip in those modes and not hold a stable 30/40, that's still visibly above the last-gen consoles, and definitely below the current-gen consoles :)
That's a vid from 10 months ago, still on patch 2.12 and the base game, which is less demanding than the expansion. Just look how nasty those visuals are on PS4, and the game still dips below 25 fps with the frametime having seizures the whole time during traversal in a low-speed vehicle, timestamped:

That's a function of streaming from an HDD. The game was not designed around an HDD, and there's far too little RAM on the PS4 to mitigate the HDD issues.
 
That's a function of streaming from an HDD. The game was not designed around an HDD, and there's far too little RAM on the PS4 to mitigate the HDD issues.
The end effect is what counts, and the PS4 or even PS4 Pro version of CP2077 performs pretty badly even after all those visual cuts. And yup, we know exactly why it looks and runs the way it does, which unfortunately doesn't change the final result.
It's like explaining that someone turned out the way they did because they came from a broken family in a very poor neighbourhood: the backstory may be true, but it doesn't change the end result xD
 
They're not biased against Nintendo, but when they make a prediction that doesn't quite fit reality, they will perform great mental gymnastics to avoid admitting they were wrong.
They are so determined not to be proved wrong about the Switch 2 being PS4-level in power that they are willing to overlook anything that does not support their point of view.
 