Any chance of next-gen consoles having asymmetrical cores and a task scheduler?

West City's latest tech, presented by Dr. Gero, right here.
 
The only practical use would be background downloads and updates whilst the console is in standby, but I believe they already use ARM cores within the chipset for that.
 
AMD ZEN 6 Hybrid Concept for PS6:

8 "Worker Cores":
Zen 6c ● 512KB L2 Per Core ● 128-Bit SIMD ● SMT
7C/14T @ 4.0GHz for GAME
1C/1T Disabled for Yield

2 "Showrunner Cores":
Zen 6 ● 1MB L2 Per Core ● 128-Bit SIMD + AVX256 & AVX512-S ● SMT
2C/2T @ 4.8GHz for GAME

2 "System Cores":
Zen 6lp ● 256KB L2 Per Core ● 128-Bit SIMD ● SMT
2C/4T @ 3.0GHz for OS

Unified 12 Core CCX ● SmartShift ● No On-Die L3

48MB CPU L3 via 3D V-Cache (Low-Yield Generic AMD 64MB Die)

---

Even well-threaded games tend to run 1-2 saturated main threads that handle the bulk of the game logic, then delegate the more parallelizable tasks out to the other cores, which rarely get close to full utilisation.

With consoles focused on efficiency, and knowing that this usage pattern is a given, it may make sense to have only a limited number of full cores that "run the show": SMT deleted, crankable clocks, more cache, and extended instruction logic (some AVX capability could also make any prospective PS3 BC more viable). Then have a bunch of smaller compact cores with less cache, reduced instruction logic, SMT, and lower clock capability that are fundamentally geared towards being "workers", handling the tasks which can be more easily chopped up. In addition, use Zen low-power cores for the OS. Finally, given you'd want to get the die as small as possible, drop L3 off the die entirely and use the low-yield cast-offs from AMD's mass-produced, generic V-Cache dies instead.
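
To make that concrete, here's a rough sketch of how a game could pin threads under a layout like this. Linux pthread affinity is a stand-in for whatever the console SDK would actually expose, and the core numbering and job function are made up:

```cpp
// Sketch: pin a "showrunner" main thread to a big core and spin up
// workers on the compact cores. Linux pthread affinity stands in for
// whatever a console SDK would expose; the core map is hypothetical.
#include <pthread.h>
#include <sched.h>
#include <thread>
#include <vector>

// Hypothetical core map for the concept above: core 0 = a Zen 6
// "showrunner", cores 2-8 = Zen 6c "workers" (low-power OS cores not
// visible to the game).
constexpr int kShowrunnerCore = 0;
constexpr int kWorkerCores[]  = {2, 3, 4, 5, 6, 7, 8};

void PinCurrentThreadTo(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

void WorkerLoop() { /* pull chopped-up jobs: culling, audio, decompress */ }

int main() {
    PinCurrentThreadTo(kShowrunnerCore);  // saturated main game-logic thread

    std::vector<std::thread> workers;
    for (int core : kWorkerCores) {
        workers.emplace_back([core] {
            PinCurrentThreadTo(core);     // one worker per compact core
            WorkerLoop();
        });
    }
    for (auto& t : workers) t.join();
}
```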

---

No ChatGPT used in this post...

...unless you think it's a shit idea, in which case let's say I did, and offload the blame.
 
On the subject: the HD generation ditched out-of-order execution because it was not cost-effective. The better you make a chip, the bigger the hit you take in yields. Although some people are chanting victory because MS is going to smash prices with €/$900 hardware that should actually cost €/$1,000, this is a market of people raising an eyebrow every time something crosses the €/$299 price tag, keyboards already on fire. If, and only if, the advantages outclass the space they take in silicon, like with AI cores for machine learning, a feature gets implemented. If not:

Out with it!


Six-core Zen 2 performance from a defective PS5 APU is close to a Ryzen 5 2600X (six-core Zen 1.5 with quad 128-bit FPU pipelines per CPU core).

PS5 Zen 2's 256-bit AVX2 throughput is similar to desktop Zen 1.x.
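
Quick back-of-envelope on what that means for peak FP32 numbers; the core counts and clocks below are illustrative, and the PS5 behaviour is as claimed above, not vendor-confirmed:

```cpp
// Back-of-envelope peak FP32 throughput. An FMA counts as 2 FLOPs per
// lane per pipe per cycle; core counts and clocks below are illustrative.
#include <cstdio>

constexpr double PeakGflops(int cores, double ghz, int fmaPipes, int fp32Lanes) {
    return cores * ghz * fmaPipes * fp32Lanes * 2.0;
}

int main() {
    // Desktop Zen 2: 2 x 256-bit FMA pipes -> 8 FP32 lanes per pipe.
    std::printf("Desktop Zen 2, 8C @ 3.5GHz: %4.0f GFLOPS\n",
                PeakGflops(8, 3.5, 2, 8));
    // Zen 1.x (and PS5 per the claim above): effectively 2 x 128-bit FMA.
    std::printf("Zen 1-like,    8C @ 3.5GHz: %4.0f GFLOPS\n",
                PeakGflops(8, 3.5, 2, 4));
}
```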
 
Strix Point's Zen 5c has Zen 4's 256-bit SIMD layout.
 

Well, that's my point. Silicon is measured to the atom; since you're going to sell 100 million+ of these things, every dollar saved is a massive difference. Consoles are not a boutique market.
 
It's why the shitty Nvidia PS3 GPU got its ass handed to it by the XB360 GPU: the former had dedicated pixel and vertex shaders, whereas the latter boasted AMD's latest unified shader architecture.
This is a misrepresentation of the two GPUs at best. It's like when people compare the PS4 to the XB1 and equate them to a 40% gap (because of the compute delta), when the gap was consistently much larger than that; most of it came on the back of the XB1 GPU having a host of other limitations the PS4 didn't.
RSX in particular was most often limited by its data throughput (both the rates and the memory bandwidth) long before any compute bottlenecks came into play, even on the vertex shader side. More importantly, in terms of raw compute it actually held certain advantages, e.g. in fragment-bound scenarios: compute would be roughly on par, but the 360 would have no spare capacity left for vertex processing, while RSX did.
But taking the data problems out of the equation: unified shaders in the 360 would have had the biggest advantage in vertex-bound scenarios (like shadowmap generation), but those were also pixel-fill bound (which on RSX just ends up being memory bandwidth again), so it didn't even matter.
It boiled down to one machine having a lot more performance gotchas you had to carefully manoeuvre around to avoid crippling system throughput. Not that the 360 was perfect there, but its gotchas were few and far between in comparison, and as the lead platform it also got the benefit of extra care for years before that dynamic shifted.
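
To put rough numbers on the fill-versus-bandwidth point, using commonly cited RSX figures (clocks and bandwidth vary by source, and this ignores z-compression and other real-world factors, so treat it as an illustration only):

```cpp
// Rough arithmetic behind "pixel fill ends up being memory bandwidth" on
// RSX. Figures are the commonly cited ones and the model ignores
// z-compression etc., so this is an illustration, not a measurement.
#include <cstdio>

int main() {
    const double rops        = 8;     // RSX ROP count
    const double clockGhz    = 0.5;   // core clock (cited at 500-550 MHz)
    const double bytesPerPix = 8;     // 4B colour + 4B depth written
    const double vramGBs     = 20.8;  // 128-bit GDDR3 (cited figures vary)

    const double demandGBs = rops * clockGhz * bytesPerPix;
    std::printf("ROPs flat out want %.0f GB/s vs %.1f GB/s of VRAM bandwidth\n",
                demandGBs, vramGBs);
    // ~32 GB/s of demand against ~21 GB/s available: fill-heavy passes like
    // shadowmap generation hit the memory wall before shader ALUs matter.
}
```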

I will say though: building HW with innumerable ways to hang yourself, performance-wise, was a very Sony approach in that era (the PS2, PSP, and PS3 all did this a lot).

But OK, enough derailing the thread. I agreed with the same point above: heterogeneous processing (on CPU and GPU alike) was par for the course for most consoles from the start, and the shift to more unified approaches didn't actually happen until the 360, mainly because the cost/benefit ratio of specialisation no longer resulted in meaningful gains.
Even AI compute acceleration shares the same execution resources now, and that gets us far enough for memory to be the main problem (again), so it's hard to fathom any scenario where specialised compute is the more cost-effective solution. I've speculated that reconfigurable compute might be an interesting concept (FPGA-style), but I'm not enough of a HW engineer to give an educated guess on whether it'd be worth it in any scenario (maybe portables, with their power constraints and all?).
 
Asymmetrical CPU cores and GPU Compute Units combined with an intelligent task scheduler, offer significant benefits for video game consoles by optimizing performance, power efficiency, and responsiveness. In this architecture, high-performance cores handle demanding tasks like physics calculations and AI, while smaller efficiency cores manage background operations such as audio processing or system services without wasting power. Similarly, GPU CUs can be designed with varying capabilities—some optimized for complex shading and others for simpler rendering or post-processing. A smart scheduler dynamically assigns workloads based on the nature and priority of each task, ensuring that resources are used efficiently. This enables smoother frame rates, reduced latency, and improved thermal performance, all while extending hardware lifespan and supporting more immersive and complex game worlds without requiring developers to micromanage hardware resources.

Yes, I used ChatGPT, but it's my idea.
I came up with this while infected with the lawless one's unclean spirit. I made a bet with deepenigma that either Xbox would crawl out of its grave or the PC I was building wouldn't work. It's still too early to tell with Xbox, but my PC's BIOS wouldn't prioritize the Windows 11 installation drive over the SSD, and when I loaded the installer from the boot menu it said my specs were incompatible. I have 32GB of Crucial DDR5, a WD Black SN850X 2TB, a PNY RTX 5070 12GB, and an Intel Core i7-14700K. Now the monitor isn't responding. I'm not dumb; I've studied 5-manifolds at the graduate level.

Xbox will succeed, but only if my 2-year-old nephew accepts my stupid PC as an engineering project. I hate ChatGPT; Copilot all the way.
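
For what it's worth, a toy version of that "smart scheduler" routing idea is easy enough to sketch; the Cost hint and the 2/7 pool split below are invented to match the concept upthread, not any real SDK:

```cpp
// Toy "smart scheduler": two worker pools (standing in for big and little
// core clusters) and a caller-supplied cost hint deciding which queue a
// job lands in. Pool sizes and the Cost enum are purely illustrative.
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

enum class Cost { Light, Heavy };

class Pool {
public:
    explicit Pool(int threads) {
        for (int i = 0; i < threads; ++i)
            workers_.emplace_back([this] { Run(); });
    }
    ~Pool() {
        { std::lock_guard<std::mutex> lk(m_); done_ = true; }
        cv_.notify_all();
        for (auto& t : workers_) t.join();
    }
    void Submit(std::function<void()> job) {
        { std::lock_guard<std::mutex> lk(m_); jobs_.push(std::move(job)); }
        cv_.notify_one();
    }
private:
    void Run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return done_ || !jobs_.empty(); });
                if (done_ && jobs_.empty()) return;
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    std::vector<std::thread> workers_;
    bool done_ = false;
};

// Route by hint: heavy, latency-sensitive jobs to the (few) big cores,
// easily chopped-up filler work to the compact ones.
struct HybridScheduler {
    Pool big{2}, little{7};  // matches the 2 showrunner / 7 worker split
    void Submit(Cost c, std::function<void()> job) {
        (c == Cost::Heavy ? big : little).Submit(std::move(job));
    }
};
```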
 
Big CUs for big workloads, small CUs for small workloads. Task scheduler: machine learning distributes workloads automatically. All of this means a more optimized machine.
Operating systems already have a task scheduler that prioritizes and schedules threads accordingly, even on video game consoles. It doesn't take machine learning to do that; it's been happening ever since multi-threading became a thing decades ago.

The only real reason to bring the P-core/E-core thing to a console would be power savings for smaller workloads like video streaming or running the store application. But as a video game console typically focuses on a single primary workload at a time, those energy savings can be accomplished with variable clocks, without the sophistication of different types of cores.

I suppose some subsystems within games could be delegated to lower-power cores, but aside from power savings, what would the real benefit be?
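
To illustrate the point that stock schedulers already handle prioritization: here's a minimal sketch using standard POSIX priority control (SCHED_IDLE is Linux-specific; a console SDK would expose its own equivalent):

```cpp
// Sketch: stock OS schedulers already honour thread priorities without
// any machine learning. POSIX APIs shown; SCHED_IDLE is a Linux policy.
#include <pthread.h>
#include <sched.h>
#include <thread>

void SetLowPriority(std::thread& t) {
    sched_param p{};
    p.sched_priority = 0;  // SCHED_IDLE ignores the priority value
    pthread_setschedparam(t.native_handle(), SCHED_IDLE, &p);
}

int main() {
    std::thread background([] { /* store app, telemetry, downloads, etc. */ });
    SetLowPriority(background);  // scheduler now runs it only in idle time
    background.join();
}
```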
 