PlayStation 6 to utilize AMD's 3D stacked chips; AMD UDNA Flagship GPU revived for 2026, Zen 6 Halo with 3D stacking technology, and Zen 6 all on TSMC

A 30 TF PS6 with FSR4+ would definitely be good enough for full path tracing. But without a Zen 6 CPU it may only do it at 30 fps.
The 9070 XT is nearly 30 TFLOPS and it buckles in path tracing, and that's with current-gen fidelity settings. If they just want current-gen visual fidelity with path tracing turned on, then the next-gen architecture at 30 TFLOPS might suffice, but it will not be able to deliver any improvements in fidelity such as geometry, draw distance, particles, etc. on top.
 
How many RTX 5090s is that?
To answer that we gotta sum up what we know/what can be safely assumed:
1) holidays 2028 release
2) 3nm node
3) console form factor
4) powerdraw below 250W
5) cooperation with amd hardware/tech
With that we can safely assume the PS6 will be weaker than a 5090, maybe not by much, but it's physically impossible to fit a 5 nm-class, 575 W GPU (OC'd models actually draw up to 666 W) into that budget, plus some decent CPU (likely needing those 20-30 W too, even if it's heavily undervolted/underclocked for maximum efficiency), plus VRAM (maybe 24, maybe 32 GB, but one of those two configurations for sure) and a fast SSD (not much faster than the still extremely fast PS5 SSD, but likely a bit faster, which means it will draw more power too).

Of course it will still be a huge jump from the base PS5, especially in terms of RT and AI upscaling. Raw raster, though, will likely be at most 3x what the base PS5 has, i.e. 4090 territory; RT performance maybe even 10-15x better. Add to that AI upscaling, so native 1080p-to-4K upscaling will still look crisp and not blurry at all, close to native 4K (that's how the current DLSS 4 transformer model already looks, btw).
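For what it's worth, here's the napkin math behind that power split and the "3x raster at most" guess as a quick Python sketch - every number in it is a placeholder assumption from this post, nothing official:

```python
# Back-of-the-envelope PS6 guesses. Every number here is a placeholder
# assumption from the post above, nothing official or leaked.

TOTAL_BUDGET_W = 250      # assumed wall-power ceiling for the console
PSU_EFFICIENCY = 0.90     # assumed PSU efficiency

# Assumed component draws (watts)
cpu_w = 30                # heavily downclocked Zen-class CPU
ssd_w = 8
ram_w = 20                # 24-32 GB of GDDR
fans_misc_w = 12

available = TOTAL_BUDGET_W * PSU_EFFICIENCY
gpu_w = available - (cpu_w + ssd_w + ram_w + fans_misc_w)
print(f"GPU power budget: ~{gpu_w:.0f} W (vs ~575 W for a stock RTX 5090)")

# Raster extrapolation: the base PS5 GPU is 10.28 TFLOPS, so "3x at most" gives:
ps5_tflops = 10.28
print(f"~3x raster: ~{3 * ps5_tflops:.0f} TFLOPS")
```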

What the new console won't be is cheap; I'm predicting likely $1k/€1k at least, with a disc drive.
 
To answer that we gotta sum up what we know/what can be safely assumed:
1) holidays 2028 release
2) 3nm node
3) console form factor
4) powerdraw below 250W
5) cooperation with amd hardware/tech
With that we can safely assume the PS6 will be weaker than a 5090, maybe not by much, but it's physically impossible to fit a 5 nm-class, 575 W GPU (OC'd models actually draw up to 666 W) into that budget, plus some decent CPU (likely needing those 20-30 W too, even if it's heavily undervolted/underclocked for maximum efficiency), plus VRAM (maybe 24, maybe 32 GB, but one of those two configurations for sure) and a fast SSD (not much faster than the still extremely fast PS5 SSD, but likely a bit faster, which means it will draw more power too).

Of course it will still be a huge jump from the base PS5, especially in terms of RT and AI upscaling. Raw raster, though, will likely be at most 3x what the base PS5 has, i.e. 4090 territory; RT performance maybe even 10-15x better. Add to that AI upscaling, so native 1080p-to-4K upscaling will still look crisp and not blurry at all, close to native 4K (that's how the current DLSS 4 transformer model already looks, btw).

What the new console won't be is cheap; I'm predicting likely $1k/€1k at least, with a disc drive.

No way it'll be $1,000 for what you just typed. That should be illegal. At most it'll be $700.
 
$200 for a disc drive would be nuts! :messenger_tears_of_joy:

Half a year after the PS5 Pro launch, you would think the Blu-ray drive would be relatively cheap by now, and yet:
749 PLN, so about 198 USD :)

German Amazon sells it for €128, so about 145 USD, but we are still 3.5 years away from holiday 2028; prices will only go up :messenger_face_steam:
 

Half a year after the PS5 Pro launch, you would think the Blu-ray drive would be relatively cheap by now, and yet:
749 PLN, so about 198 USD :)

German Amazon sells it for €128, so about 145 USD, but we are still 3.5 years away from holiday 2028; prices will only go up :messenger_face_steam:

I hate how you're saying prices will go up; it's a crazy statement nowadays.
 
Do you guys think there's a chance for the PS6 CPU to be as powerful as a 9800X3D?

Very unlikely.
CPUs on consoles are always sacrificed. In part because the memory controller is tuned for GDDR and high bandwidth instead of latency.
In part because they tend to sacrifice cache sizes.
The base Zen5 is only slightly faster than Zen4. What makes the 9800X3D so good is the 3D cache under the CPU.
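A quick way to see why latency-tuned memory and a big L3 matter so much is the textbook average-memory-access-time formula; the latencies and miss rates below are illustrative placeholders, not measured PS5 or desktop figures:

```python
# Average memory access time: AMAT = hit_time + miss_rate * miss_penalty.
# All latencies and miss rates are illustrative placeholders.

def amat(l3_hit_ns, l3_miss_rate, dram_ns):
    return l3_hit_ns + l3_miss_rate * dram_ns

# Desktop-style part: huge 96 MB L3 (few misses), latency-tuned DDR5
desktop = amat(l3_hit_ns=10, l3_miss_rate=0.05, dram_ns=80)

# Console-style part: smaller L3 (more misses), GDDR tuned for bandwidth
console = amat(l3_hit_ns=10, l3_miss_rate=0.20, dram_ns=140)

print(f"desktop-ish AMAT: {desktop:.0f} ns, console-ish AMAT: {console:.0f} ns")
```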
 
Do you guys think there's a chance for the PS6 CPU to be as powerful as a 9800X3D?

Number of cores (8+), IPC of the cores = maybe, most probably.
96 MB of L3 cache, frequency (especially on the portable) = definitely not.

A good bet is probably something based on a budget mobile variant of a future AMD SoC, with something like 4x Zen 6 + 4x Zen 6c cores and a stripped-down L3 cache (16? 24? 32 MB?).
 
Do you guys think there's a chance for the PS6 CPU to be as powerful as a 9800X3D?
1st option - if Sony decides to go with the cheaper/older Zen 5 architecture (the same as the 9800X3D), then the PS6 CPU is going to be around 30% slower, simply because of undervolting/underclocking so it doesn't eat 120 W of TDP like the desktop 9800X3D, but maybe 30-40 W as part of an SoC in a console form factor.

2nd option - if Sony decides to go balls to the wall and chooses the newer (and definitely much more expensive) Zen 6: we don't know yet how much faster it could be, or how much more or less power-efficient it's going to be, so this is a total guess. But of course a gen-on-gen CPU architecture is considered a really good/huge step forward if it provides 15-20% more performance, so Zen 5 to Zen 6 will likely be that, at best.
How is PS6 CPU performance going to look vs the PS5 CPU? In CPU-demanding scenarios, roughly 3x faster, which we can predict from the rare tests comparing the 8-core Zen 2 against the Zen 5 9800X3D in games, like here, timestamped:



A 3x increase in CPU performance is what we got going from the PS4 to the PS5 CPU, so you can easily extrapolate the performance jump (more 60 fps games/modes, more NPCs, bigger draw distance, less pop-in, fewer fps drops and much more stable frametimes when shit hits the fan, etc.).
The best example of how important a fast CPU is, is the last-gen vs current-gen CP2077 console port; we all know the difference is humongous, not only in the quality of the settings but in the overall experience - fps, frametimes, lack of hitching/slowdowns, etc.
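To make the extrapolation explicit, here's the rough compounding math as a sketch - every factor in it is a guess pulled from the reasoning above, not a benchmark result:

```python
# Napkin math behind the "roughly 3x" figure. Every factor is a guess
# taken from the post above, not a measured benchmark.

x3d_vs_ps5_cpu = 4.3          # guess: desktop 9800X3D vs the PS5's 8-core Zen 2 in games
console_power_penalty = 0.7   # guess: ~30% lost fitting into a 30-40 W SoC budget

option1 = x3d_vs_ps5_cpu * console_power_penalty
print(f"1st option (Zen 5 class, downclocked): ~{option1:.1f}x the PS5 CPU")

zen6_uplift = 1.18            # guess: 15-20% gen-on-gen architecture gain
print(f"2nd option (Zen 6 class): ~{option1 * zen6_uplift:.1f}x the PS5 CPU")
```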
 
Sure, proper BC will never happen, but given this is the latest PS6 thread, I thought I'd drop this little ad concept for the fellow dreamers:


ps6-future-bc-spcshwcse-2028-ad-mockup.jpg
 
The latest rumours on Zen 6 are that it's using 12-core single-CCX CCDs, which could be a nice middle ground for the PS6.

16C/32T is probably a bit much and will be wasted in many games; 16C/16T would be fine; 12C/24T will use less area and provide a nice balance of clocks vs cores/threads; while 12C/12T would probably make PS4/5 back-compat a little awkward (unless they can just redistribute the 16T workload easily).

I think 12C/24T is probably the sweet spot.

Edit:

Updated my spec mockup in line with this:

My latest bit of "SPECulation" (..more hopes than reality :messenger_tears_of_joy:, click each for full-size):






Edit (2): Adjusted ROPs, RAM, Nodes, CPU & Style.
 
First, what is the PS6's max wattage as a console?
Going by past PlayStations and the flex leads used: if it ships with a PC/kettle 3-pin lead like the OG PS3/PS4 Pro, then typically under 450 watts, probably 380 watts in the real world, but with new green directives and Sony being a green company, I doubt this is the option.

If it predictably ships with a 2-lead (figure-8) cable like the PS1/PS2/PS4/PS5/PS5 Pro, then 250 watts maximum, with not more than ~230 watts in real-world use.
 
A 30 TF PS6 with FSR4+ would definitely be good enough for full path tracing. But without a Zen 6 CPU it may only do it at 30 fps.
Why CPU?
PT doesn't do anything with the CPU that regular RT doesn't - and that in and of itself isn't CPU-limited on console.
 
Why CPU?
PT doesn't do anything with the CPU that regular RT doesn't - and that in and of itself isn't CPU-limited on console.

Although games that use PT are very GPU-bound, PT still increases the load on the CPU.
It can increase the number of draw calls that the CPU has to process, especially if the game is rendering reflections, which requires rendering more objects from outside the usual player viewport.
 
Nobody knows. PS5/PS5 Pro draws about 200-230w depending on the game. I'm betting on it not being vastly higher than that. Maybe a slight increase, but pretty sure it will still be under 300w.
To go higher they would need to change the cable and socket to those of a PC PSU. 250 watts is the limit on the normal cable, but you also need to leave a 10% margin, so it ends up 20-25 watts less in typical use.
 
To go higher they would need to change the cable and socket to those of a PC PSU. 250 watts is the limit on the normal cable, but you also need to leave a 10% margin, so it ends up 20-25 watts less in typical use.
Don't figure-8 cables go up to 500+ watts? I know the kettle-type cables can take a couple of thousand watts.
 
It can increase the number of draw calls that the CPU has to process, especially if the game is rendering reflections, which requires rendering more objects from outside the usual player viewport.
RT traversal works off the acceleration structure populated for this - you don't issue draw calls against objects that rays hit. And the contents of these bounding hierarchies would be the same as for regular RT, so PT doesn't change anything there.
Draw-call cost is also dramatically lower to begin with on a well-optimised console codebase, but granted - a lot of modern games aren't that tightly optimised anymore.

I assumed the CPU on some level would affect a game if the devs wanted it to have PT.
Not in ways that regular RT doesn't already do (see above).
Also, there's a misconception that RT is an inherently CPU-heavy process because of PC APIs; that doesn't really apply to consoles the same way.
 
Don't figure-8 cables go up to 500+ watts? I know the kettle-type cables can take a couple of thousand watts.
Pretty sure in the UK electrical products using the figure-8 are limited to below 250 watts, and it is likely because of the risk of them not being wired with an earth/ground pin, as the cable is only live and neutral AFAIK. But a similar-gauge cable - presumably with 3 wires - on a hair dryer can be as high as 1.8 kW, so the cable being fixed and having a wired ground might be the difference - in the UK at least.
 
RT traversal works off the acceleration structure populated for this - you don't issue draw calls against objects that rays hit. And the contents of these bounding hierarchies would be the same as for regular RT, so PT doesn't change anything there.
Draw-call cost is also dramatically lower to begin with on a well-optimised console codebase, but granted - a lot of modern games aren't that tightly optimised anymore.


Not in ways that regular RT doesn't already do (see above).
Also, there's a misconception that RT is an inherently CPU-heavy process because of PC APIs; that doesn't really apply to consoles the same way.

I'm not talking about BVH traversal.
I'm saying that with RT the game has to render more objects that would usually be frustum culled. And that means more draw calls for the CPU to process.
 
Don't figure-8 cables go up to 500+ watts? I know the kettle-type cables can take a couple of thousand watts.

In Europe, C7/C8 plugs are rated up to 2.5 A, so ~230 V × 2.5 A = 575 W. In the US they can apparently be rated to 10 A because of the lower voltage, so ~120 V × 10 A = 1200 W. At the European 2.5 A rating they'd be limited to 300 W or so.
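As a quick sketch of that connector math (the 2.5 A and 10 A ratings are as cited above; the 10% headroom figure is just an assumption):

```python
# Rough power ceilings for an IEC C7/C8 ("figure 8") inlet at different mains
# voltages. Ratings as cited above; the headroom margin is an assumption.

def max_watts(volts, amps, headroom=0.10):
    """Continuous power budget after leaving some margin below the rating."""
    return volts * amps * (1 - headroom)

print(f"EU  (230 V, 2.5 A): ~{max_watts(230, 2.5):.0f} W usable")
print(f"US  (120 V, 10 A):  ~{max_watts(120, 10):.0f} W usable")
print(f"US voltage at the EU 2.5 A rating: ~{max_watts(120, 2.5):.0f} W usable")
```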
 
I have a feeling that the PS6 will use a console-like, modified 3D-cache APU.

Some features missing, and some modified ones for the PS6, like the current PS5 APU.
 
I'm not talking about BVH traversal.
Neither am I - the cost of RT falls mainly into two things:
BVH traversal + shading cost
BVH realtime updates

What you're speaking about (updating objects out of view) is part of BVH updates. But unlike frustum updates, the BVH can be nearly or entirely static frame-to-frame, so the equivalent overhead here is relatively smaller. The downside is that when the BVH does change (say, if large parts of levels move/change), the cost can be substantial - but that's a workload that will typically run on the GPU. Ultimately there should be very little for the CPU to do here on a console (or in some cases, nothing at all).

Also, 'out of view' is itself a choice - plenty of RT implementations out there exclude parts of the scene from the RT hierarchy for performance reasons, so it's not a given that 'all objects that can potentially be hit by rays need to be part of the update'. That's ground truth - but in realtime we almost never hit that.
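A toy sketch of the difference (the classes and counts are invented purely for illustration, not from any real engine or API): the frustum pass issues draw calls every frame, while the acceleration structure only needs CPU-side submissions when it is first built or when something in it moves.

```python
# Toy model of the CPU-side difference between per-frame frustum culling and
# maintaining an RT acceleration structure. Purely illustrative - the classes
# and counts are invented, not from any real engine or console API.
from dataclasses import dataclass

@dataclass
class SceneObject:
    name: str
    visible_in_frustum: bool
    moved_this_frame: bool

def raster_frame(scene):
    # Classic path: cull against the camera frustum and issue a draw call
    # for every visible object, every single frame.
    return sum(1 for o in scene if o.visible_in_frustum)

def rt_frame(scene, first_frame):
    # RT path: all objects (on- and off-screen) already live in the BVH.
    # CPU-side work only happens on the initial build or when an object moves.
    if first_frame:
        return len(scene)                                # one-time instance submission
    return sum(1 for o in scene if o.moved_this_frame)   # incremental update only

scene = [SceneObject("hero", True, True),
         SceneObject("tree_offscreen", False, False),
         SceneObject("building", True, False)]

print("frame 0: draw calls:", raster_frame(scene), "| BVH submissions:", rt_frame(scene, True))
print("frame 1: draw calls:", raster_frame(scene), "| BVH submissions:", rt_frame(scene, False))
```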
 
Neither am I - the cost of RT falls mainly into two things:
BVH traversal + shading cost
BVH realtime updates

What you're speaking about (updating objects out of view) is part of BVH updates. But unlike frustum updates, the BVH can be nearly or entirely static frame-to-frame, so the equivalent overhead here is relatively smaller. The downside is that when the BVH does change (say, if large parts of levels move/change), the cost can be substantial - but that's a workload that will typically run on the GPU. Ultimately there should be very little for the CPU to do here on a console (or in some cases, nothing at all).

Also, 'out of view' is itself a choice - plenty of RT implementations out there exclude parts of the scene from the RT hierarchy for performance reasons, so it's not a given that 'all objects that can potentially be hit by rays need to be part of the update'. That's ground truth - but in realtime we almost never hit that.

Like I said, using PT the bottleneck will always be on the GPU.
But because there is a need to render more objects that are outside the player's view frustum, that means more draw calls that the CPU has to process per frame.
 
Like I said, using PT the bottleneck will always be on the GPU.
But because there is a need to render more objects that are outside the player's view frustum, that means more draw calls that the CPU has to process per frame.
Putting to one side that on a console like a PlayStation 5 you have unified RAM - so a draw call is reduced to the GPU reading from an updated area of RAM where it has been reading continuously for rendering instructions - I'm pretty sure the limit on PT/RT on consoles is a number of intersections and casts per frame, per BVH region, so that even if there were more work offscreen, the GPU would finish calculating when it passed the lower threshold for that region and would then use a fallback technique for what it didn't finish tracing.

I don't think PT/RT on console runs to completion when it would miss its frame render-time target.
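A minimal sketch of the kind of per-frame budget-plus-fallback scheme I'm describing - the numbers and names are entirely hypothetical, not anything from a real console SDK:

```python
# Hypothetical per-frame ray budget with a fallback path, as described above.
# Nothing here comes from a real SDK; it only illustrates the scheduling idea.

RAY_BUDGET_PER_FRAME = 1_000_000   # placeholder cap on casts/intersections

def trace_region(region_rays, budget_left):
    """Trace a BVH region if the budget allows, otherwise fall back."""
    if region_rays <= budget_left:
        return "traced", budget_left - region_rays
    return "fallback (probes / SSR / last frame)", budget_left

regions = {"hero_reflections": 400_000, "gi_bounce": 500_000, "offscreen_mirror": 300_000}
budget = RAY_BUDGET_PER_FRAME
for name, rays in regions.items():
    result, budget = trace_region(rays, budget)
    print(f"{name:18s} -> {result} (budget left: {budget:,})")
```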
 
Putting to one side that on a console like a PlayStation 5 you have unified RAM - so a draw call is reduced to the GPU reading from an updated area of RAM where it has been reading continuously for rendering instructions - I'm pretty sure the limit on PT/RT on consoles is a number of intersections and casts per frame, per BVH region, so that even if there were more work offscreen, the GPU would finish calculating when it passed the lower threshold for that region and would then use a fallback technique for what it didn't finish tracing.

I don't think PT/RT on console runs to completion when it would miss its frame render-time target.

Consoles don't have to deal with the cost of going through the PCIe bus to transfer data from the CPU to the GPU.
But the CPU still has to process those draw calls.
 
Consoles don't have to deal with the cost of going through the PCIe bus to transfer data from the CPU to the GPU.
But the CPU still has to process those draw calls.
It will depend on the implementation.

But something like Nanite does a lot of what it does within the GPU using a large, complex shader call, because the efficiency of giving that complex shader call a more generalised request and leaving the details within the shader allows for more throughput. So I don't believe it would burden the CPU much more, because even if it has to stream pre-calculated BVH structures of a largely static representation of off-screen data only seen in a reflection, that work is offloaded to the I/O complex.
 
It will depend on the implementation.

But something like Nanite does a lot of what it does within the GPU using a large, complex shader call, because the efficiency of giving that complex shader call a more generalised request and leaving the details within the shader allows for more throughput. So I don't believe it would burden the CPU much more, because even if it has to stream pre-calculated BVH structures of a largely static representation of off-screen data only seen in a reflection, that work is offloaded to the I/O complex.

That is true. But even DX11 had the ability to batch a bunch of draw calls with driver command lists (deferred contexts). It's something that proper devs have been using for a while.

Nanite can do that too. But its big strength is that it's doing rasterization in compute, not being limited by the pixel-quad rasterizers that NVIDIA and AMD use.
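To illustrate the quad point: hardware rasterizers shade in 2x2 pixel quads (needed for derivatives), so tiny triangles pay for pixels they don't actually cover - which is the waste a compute/software rasterizer avoids. A toy sketch with made-up geometry, nothing from the actual Nanite code:

```python
# Toy comparison: exact-coverage shading vs 2x2-quad-granularity shading for a
# tiny triangle. Illustrative only - not how any real rasterizer is implemented.

def covered(px, py, tri):
    # Edge-function test: the sample is inside if it lies on the same side of all edges.
    (x0, y0), (x1, y1), (x2, y2) = tri
    e = lambda ax, ay, bx, by: (px - ax) * (by - ay) - (py - ay) * (bx - ax)
    signs = [e(x0, y0, x1, y1), e(x1, y1, x2, y2), e(x2, y2, x0, y0)]
    return all(s >= 0 for s in signs) or all(s <= 0 for s in signs)

tri = [(1.4, 1.4), (1.8, 1.4), (1.4, 1.8)]   # micro-triangle, smaller than a pixel
W = H = 4

# Compute-style: shade exactly the covered pixel centres.
exact = sum(covered(x + 0.5, y + 0.5, tri) for y in range(H) for x in range(W))

# Hardware-style: shade the whole 2x2 quad if any of its pixels is covered.
quads = 0
for qy in range(0, H, 2):
    for qx in range(0, W, 2):
        if any(covered(qx + dx + 0.5, qy + dy + 0.5, tri) for dy in (0, 1) for dx in (0, 1)):
            quads += 4

print(f"pixels actually covered: {exact}, pixels shaded at quad granularity: {quads}")
```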
 
That is true. But even DX11 had the ability to batch a bunch of draw calls with driver command lists (deferred contexts). It's something that proper devs have been using for a while.

Nanite can do that too. But its big strength is that it's doing rasterization in compute, not being limited by the pixel-quad rasterizers that NVIDIA and AMD use.
The real question is: where is the flow control for the bulk of a frame's render time? Is it on the CPU or the GPU?

I would argue that most rendering in AAA console games takes place on the GPU in compute shaders, and the CPU just manages things between successive frames and the simulation/AI within frame-time, but is effectively out of the conversation once a frame is being processed.

So like Fafalada I don't see the CPU use between successive frames massively increasing for PT/RT.
 
The real question is: where is the flow control for the bulk of a frame's render time? Is it on the CPU or the GPU?

I would argue that most rendering in AAA console games takes place on the GPU in compute shaders, and the CPU just manages things between successive frames and the simulation/AI within frame-time, but is effectively out of the conversation once a frame is being processed.

So like Fafalada I don't see the CPU use between successive frames massively increasing for PT/RT.

Notice my previous posts.
I always said that the bottleneck will be on the GPU side while using PT.
What I meant to say is that using PT, due to having to render objects outside the normal player frustum, there will be more draw calls for the CPU to process per frame.
But of course, since the bottleneck is on the GPU side by a significant margin, although the CPU has more work to do, it will still have to wait on the GPU as it completes calculating the PT.
On the other hand, we only have a handful of years of developing real-time hardware for RT/PT, while we have close to 25 years of shader development.
Maybe a few years from now, GPUs will have RT units that are much more efficient than current ones, so GPUs won't struggle as much with these loads.
 
Nobody knows. PS5/PS5 Pro draws about 200-230w depending on the game. I'm betting on it not being vastly higher than that. Maybe a slight increase, but pretty sure it will still be under 300w.
Considering the limits on node shrinks and on significant performance jumps between architectures, I think next gen they might increase the power-draw limits to get higher clocks. It's not looking feasible to fit larger dies into acceptable mm² targets, and they might go the route of keeping clocks closer to desktop GPUs.

How this will fit with the green initiatives and policies is the question. I would certainly like them to push out ambitious/capable hardware, even if it pushes the heat/power-draw trend for consoles.
 
Notice my previous posts.
I always said that the bottleneck will be on the GPU side while using PT.
What I meant to say is that using PT, due to having to render objects outside the normal player frustum, there will be more draw calls for the CPU to process per frame.
But of course, since the bottleneck is on the GPU side by a significant margin, although the CPU has more work to do, it will still have to wait on the GPU as it completes calculating the PT.
On the other hand, we only have a handful of years of developing real-time hardware for RT/PT, while we have close to 25 years of shader development.
Maybe a few years from now, GPUs will have RT units that are much more efficient than current ones, so GPUs won't struggle as much with these loads.
I don't think you are following my point. In the past the CPU issued draw calls at the kind of granularity you are describing, but on console, since the PS4 gen, the draw calls have become flow-controlled within the compute shader - a shader that has full access to the unified memory and all the resources it needs to derive what it needs to render for that frame - and on the PS5 it can do the same for PT/RT. So the draw calls don't really exist in the client/server paradigm that you are referencing. The GPU is working more like an SPU satellite processor, so the CPU has no more low-latency work to do for PT/RT, IMO.
 
But because there is a need to render more objects that are outside the player's view frustum, that means more draw calls that the CPU has to process per frame.
I'm not sure we're talking about the same thing here - but there are no added 'classical' draw calls for objects off screen - RT computes those bounces against whatever has been submitted into the acceleration structure.
Those submissions are a draw-call analogue, but they're only loosely comparable to frustum DCs: the temporal coherence is very different, most of it is precomputed (in typical games), and in some cases nothing changes frame-to-frame, so you might not submit anything regardless of the number of objects on screen.
More importantly - even if you only use RT subsets like reflections, shadows or GI bounces, they have the same prerequisite for off-screen stuff as full PT, so there's nothing saved on that end.
The one notable difference is that standard frustum draw calls are still needed in this case (since RT is only partial and most of the scene is still rendered via rasterization), whereas full PT would - strictly speaking - only need to bounce around the acceleration structure and have no further DCs issued (so really, it should be cheaper on the CPU, if anything).
 
Considering the limits on node shrinks and on significant performance jumps between architectures, I think next gen they might increase the power-draw limits to get higher clocks. It's not looking feasible to fit larger dies into acceptable mm² targets, and they might go the route of keeping clocks closer to desktop GPUs.

How this will fit with the green initiatives and policies is the question. I would certainly like them to push out ambitious/capable hardware, even if it pushes the heat/power-draw trend for consoles.
The only way I see them moving above 235 watts is if they can do a revision within the first 18 months that comes back to 235 watts, and I just can't see that happening with a single monolithic die.

It is quite interesting that they were above 250 watts with the PS4 Pro but returned to below that level with the PS5 Pro, despite being 7 or so years further along when there's even less node improvement - very much suggesting green initiatives and staying below 250 watts are a red line for the design, IMO.

I still keep coming back to a dual-APU design, asymmetrically clocked - maybe with CPU cores disabled on the primary to allow two primary game CPU cores to clock high while still keeping most of the GPU clock on that APU and staying below 100 watts per APU.

From a BoM perspective, Sony can do far more with a complex motherboard - and quickly save on it - while using older lithography for two identical PS5 Pro-enhanced APUs clocked asymmetrically, than with anything else, IMO.

The only other option I can think of is an ASIC for PSSR 2 that allowed them to take it off-chip and also let them drop down a tier of resolution for native rendering - say to 540p - so all the other resources, in relative terms, got a free 4x improvement in effects density when scaled back up to 4K.
 
I have a feeling that the PS6 will use a console-like, modified 3D-cache APU.

Some features missing, and some modified ones for the PS6, like the current PS5 APU.
One way I can see 3D cache working is if Sony did something similar to Intel's Arrow Lake / Core Ultra 9 285K.


But the base tile would house the L3 cache on a cheaper 5 nm node, while the CCD would be on 2 nm and contain everything up to the L2 cache, similarly to the image below, where the CCD is 3D-stacked on top of the base tile.
BPPRnYX.jpeg

This should allow for cheaper, larger amounts of L3 cache.

The only issue Sony would face is whether mass 3D stacking via micro-bumps is viable.
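Rough sketch of the cost argument - every density and cost number below is a made-up placeholder purely to show the shape of the trade-off, not a real TSMC figure:

```python
# Why parking L3 on an older base die can make sense: SRAM barely shrinks on
# leading-edge nodes while wafer cost jumps. ALL numbers are placeholders.

L3_MB = 64

def cache_die_cost(mb, mb_per_mm2, cost_per_mm2):
    area = mb / mb_per_mm2
    return area, area * cost_per_mm2

older_node   = cache_die_cost(L3_MB, mb_per_mm2=4.0, cost_per_mm2=0.08)  # placeholder "N6-class"
leading_node = cache_die_cost(L3_MB, mb_per_mm2=4.6, cost_per_mm2=0.20)  # placeholder "N2-class"

for label, (area, cost) in [("older base die", older_node), ("leading-edge die", leading_node)]:
    print(f"{label}: ~{area:.0f} mm^2 and ~${cost:.2f} of wafer area for {L3_MB} MB of L3")
```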
 
One way I can see 3D cache working is if Sony did something similar to Intel's Arrow Lake / Core Ultra 9 285K.


But the base tile would house the L3 cache on a cheaper 5 nm node, while the CCD would be on 2 nm and contain everything up to the L2 cache, similarly to the image below, where the CCD is 3D-stacked on top of the base tile.
BPPRnYX.jpeg

This should allow for cheaper, larger amounts of L3 cache.

The only issue Sony would face is whether mass 3D stacking via micro-bumps is viable.

The question I would be asking is why Sony would need a 64 MB L3 performance cache in the PS6, when the critical performance cache - vector register memory - in the PS5 Pro is way beyond L3 performance AFAIK, and, assuming there are double the WGPs of the PS5 Pro in the PS6, that cache will double by default to 30 MB?
 
Sony won't use 3D stacking, with the cost of making chips increasing non-stop.

They just need the new Zen 6 architecture.

Just take a 9600X and compare it to a 5800X3D; the newer architecture is better in most games.
 
Sony won't use 3D stacking, with the cost of making chips increasing non-stop.

They just need the new Zen 6 architecture.

Just take a 9600X and compare it to a 5800X3D; the newer architecture is better in most games.
I would question exactly what extra we need from a console CPU over the current mobile Zen 2, and whether it is worth taking die space from the GPU going forward - other than a higher base clock on the primary CPU core to alleviate single-core bottlenecks.

Too often, PC benchmarks of games are used as a discussion point where it is the chipset, caches, latencies and clocks of the slowest parts that unlock performance - scenarios miles away from the console optimization that lets games like Horizon and Death Stranding run on a 1.6 GHz mobile Jaguar in a PS4. The step up to mobile Zen 2 was huge, and I still can't think of a developer this gen saying they need more from a console CPU for their game vision.
 
Sony won't use 3D stacking, with the cost of making chips increasing non-stop.

They just need the new Zen 6 architecture.

Just take a 9600X and compare it to a 5800X3D; the newer architecture is better in most games.

The 3D V-Cache chips are still being made on N6, so they are not as expensive as cutting-edge chips on N5 or N3 nodes.
 