I believe we will see another PS5 before or at the same time as PS6
Hey that one was mine. Wanna quote battle?
My guess would be that the GPU in the PS6 is going to be three device layers, each one a newer revision of the PS5 Pro GPU, clocked at whatever allows them to hit a 250-watt limit or less.
That would be paired with a modern mobile Zen CPU inside the APU: decent power efficiency but equal or higher clocks than the PS5/Pro for B/C, and better IPC and throughput because of the 3D cache.
Assuming they were going with a 3x Crossfire GPU, I would expect 48GB of whatever GDDR memory won't bottleneck performance. So possibly sticking with GDDR6 and relying on the GPU crossfire setup, with a memory controller operating in parallel on three 16GB regions, to get a big bandwidth multiplier through controller complexity rather than chasing expensive GDDR, combined with an updated I/O complex with three times the bandwidth (ESRAM) to scale appropriately.
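To put rough numbers on that bandwidth-by-parallelism idea, here's a minimal sketch. The pin rates are assumptions (PS5-class 14 Gbps GDDR6 and a conservative 28 Gbps GDDR7); nothing here is a confirmed PS6 spec.

```python
# Back-of-envelope peak bandwidth for the "three 16GB GDDR6 regions" idea.
# Pin rates are assumptions: 14 Gbps GDDR6 (PS5-class), 28 Gbps GDDR7.

def bandwidth_gbs(bus_bits: int, gbps_per_pin: float) -> float:
    """Peak bandwidth in GB/s for a bus of the given width and pin rate."""
    return bus_bits * gbps_per_pin / 8

print(f"Single 256-bit GDDR6 pool: {bandwidth_gbs(256, 14):.0f} GB/s")      # 448, as on PS5
print(f"3x 256-bit GDDR6 regions:  {3 * bandwidth_gbs(256, 14):.0f} GB/s")  # 1344
print(f"Single 256-bit GDDR7 pool: {bandwidth_gbs(256, 28):.0f} GB/s")      # 896
```

The tripled-controller route only comes out ahead if the extra controller and packaging complexity really is cheaper than the faster DRAM, which is exactly what gets disputed below.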
If they were doing it this way, they'd be completely covered for PS5 B/C; mostly covered for PS5 Pro B/C, with patches to handle clocks and redirect raster, RT and ML to the different GPUs; and covered for cross-gen by taking the Pro solution, ramping up the ML AI and RT on those otherwise barely used parallel GPUs, and using the newer Zen CPU and extra GDDR.
Early native PS6 games would then probably utilise the Zen CPU, new I/O complex and RAM fully, with raster on GPU1, RT on GPU2 and ML AI on GPU3.
Fully developed PS6 games would instead split the raster, ML AI and RT across GPUs 1-3 as jobs, to scale by need rather than dedicating whole GPU cores per feature, IMO.
3x Crossfire GPU and 48GB GDDR6 in a console?! With an MSRP of $1499?
CF/SLI is dead even on PC these days.
There's no way Sony won't adopt GDDR7, since it will be cheaper and higher density.
I don't even understand what makes you think GPU2 will only have RT circuitry. Are you sure you understand modern GPU architectures?
Splitting jobs among cores sounds too Cell-y (software rendering), but modern GPUs have dedicated circuitry for RT.
It depends... can they have competitive enough ARM cores to rival Zen 6? (wideness, AVX-512)

The BVH accelerators are inside the general-purpose WGPs (for raster shaders, BVH RT, and ML AI CNNs using the stacked CU caches), so yeah, why wouldn't each have the functionality? It is the very reason AMD and PlayStation haven't been chasing performant Nvidia-style ASIC features outside of the WGPs.
"PS4 Games", I don't see original ps5 games, this gen is just for getting through.The question is, do we need AMD 3D chips to play PS5 and PS4 remastered games?
Sony is part of the tech they invented; it's not 100% AMD. Just because something is on AMD's roadmap doesn't mean it's going to be on the PS6.
Whoever made this list has no idea of what is really going into the PS6 and just threw everything at the wall to see what sticks.
Splitting jobs is what is already happening, with the PSSR solution on PS5 Pro relying on stacking L1 and L2 bandwidth from the CUs.

Minor maybe, but from what Cerny was saying, cache bandwidth was too low; they are relying on dedicated registers in each CU shader processor.
Based on that rumor, is it possible to guess the power of the PS6? Is it 3x, 6x, 10x vs PS5?

Too early to tell, but aside from focus on some areas like RT, which will get another speedup on top of the 2-3x speedup they gained with PS5 Pro, this generation will be a lot more based on AI, rendering sparsely and filling the gaps. So more focused use of (yes) increased resources; I would not expect the absolute numbers of “old” metrics like TFLOPS or fillrate to jump by an order of magnitude or so.
HBM can still be on the table.

I would still see an SoC (maybe the Pro model could explore other solutions) with GDDR7, and maybe V-Cache / Infinity Cache stacked memory for the GPU, for the bandwidth they need.
If the PS6 is serious about more flexible AI acceleration, it needs more memory and much more bandwidth (RT would benefit too). I think 32 GB of GDDR7 sorts a good part of the puzzle, but we are not talking about HBM-like, super-high-bandwidth external memory, so pushing on the caching hierarchy and stacked memory to deliver the needed bandwidth is a must.
I would like to see it, but practically a last-level memory cache like V-Cache/Infinity Cache with GDDR7, as well as tweaks to the CUs to reduce the bandwidth needed from external I/O, might be more economical.
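For scale, here's what that 32 GB of GDDR7 delivers on a conventional bus; the 256-bit width is my assumption (in line with PS4/PS5), and the speed bins are illustrative.

```python
# Peak bandwidth of a single GDDR7 pool at a few plausible speed bins.
# The 256-bit bus width is an assumption in line with PS4/PS5, not a leak.
for gbps in (28, 32, 36):
    print(f"256-bit GDDR7 @ {gbps} Gbps/pin: {256 * gbps / 8:.0f} GB/s")
# -> 896, 1024, 1152 GB/s: a healthy jump over the PS5's 448 GB/s, but far
#    from HBM-class multi-TB/s, hence the case for stacked cache on top.
```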
If Sony is using 3D stacking, they can stack the HBM on top of the I/O die or use fanout HBM, which is similar to RDNA 3 MCDs.
Both of which remove what makes HBM so expensive: the interposer.
High Bandwidth Memory Will Stack on AI Chips Starting Around 2026 With HBM4
Currently, HBM stacks integrate 8, 12, or 16 memory devices as well as a logic layer that acts like a hub. HBM stacks are placed on the interposer next to CPUs or GPUs and are connected to their processors using a 1,024-bit interface. SK Hynix aims to put HBM4 stacks directly on processors, eliminating interposers altogether.
This approach resembles AMD’s 3D V-Cache, which is placed directly on CPU dies. But HBM will feature considerably higher capacities and will be cheaper than V-Cache, albeit slower.
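For a sense of scale, per-stack bandwidth follows directly from that 1,024-bit interface. A hedged calculation, using an HBM3-class 6.4 Gbps/pin rate since final HBM4 speeds aren't set:

```python
# Per-stack HBM bandwidth from the 1,024-bit interface described above.
# 6.4 Gbps/pin is HBM3-class; HBM4 rates are assumptions at this point.
def stack_bw_gbs(bus_bits: int = 1024, gbps_per_pin: float = 6.4) -> float:
    return bus_bits * gbps_per_pin / 8

print(f"One HBM3-class stack: {stack_bw_gbs():.0f} GB/s")       # ~819 GB/s
print(f"Two stacks:           {2 * stack_bw_gbs():.0f} GB/s")   # ~1638 GB/s
```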
SK hynix Prepares for ‘Fan-out Packaging’ with Next-generation HBM
A major reason for SK hynix’s application of Fan-out packaging in the memory semiconductor field is interpreted as a cost reduction in packaging. The industry regards 2.5D Fan-out packaging as a technology that can reduce costs by skipping the Through-Silicon Via (TSV) process while increasing the number of input/output (I/O) interfaces. The industry speculates that this packaging technology will be applied to Graphic DRAM (GDDR) and others that require an expansion of information I/O.
I really doubt we'll see a console using HBM, because it's too expensive, and on a console price is the main factor.
But having a stack of 3D V-Cache made on N6, as AMD is using on their Ryzen CPUs, could be something more realistic.
A chunk of 32 or 64MB of L3 on top of the SoC could do wonders for data locality.
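To illustrate the data-locality point, here's a minimal effective-bandwidth model of the kind AMD uses when marketing Infinity Cache. Every number below is an illustrative assumption, not a spec:

```python
# Effective bandwidth with a large last-level cache: hits are served at
# SRAM speed, misses fall through to GDDR. Illustrative numbers only.

def effective_bw(hit_rate: float, cache_bw: float, dram_bw: float) -> float:
    return hit_rate * cache_bw + (1.0 - hit_rate) * dram_bw

DRAM_BW = 448.0    # GB/s, a PS5-class GDDR6 pool
CACHE_BW = 2000.0  # GB/s, assumed stacked-SRAM bandwidth

for hit in (0.3, 0.5, 0.7):
    print(f"{hit:.0%} hit rate -> {effective_bw(hit, CACHE_BW, DRAM_BW):.0f} GB/s effective")
```

Even modest hit rates multiply the bandwidth the GDDR pool appears to have, which is why a 32-64MB L3 can punch well above its size.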
Anything is better than getting excited about Mario Kart 87, rev 15c, part XXI.

lol, PS fanboys are so upset that Nintendo is getting all the leaks, they have to make something up that we won't know or care about for years.
I clearly need to go back and glean more info from that chat, as I was under the impression the register caches were the L1 data. Going by your correction of what I was thinking, I suspect that means the granularity of the processing within the WGPs is even finer-grained than I thought. Which again leads me to the view that the cheapest way to scale up the ML AI TOPS and RT, without breaking the GPU CU count (36 CUs, etc.) that has proved optimal for CU saturation with raster/shader, is to crossfire/SLI GPU units within an APU.
The article states 3D stacking HBM is cheaper than 3D V-Cache.
This video shows how expensive 3D V-Cache can be due to the steps involved, which may not be viable for mass production of millions of chips.
Not gonna happen; soon enough GDDR6 will be EOL. Even nVidia has abandoned it. Sony needs to have solid logistics for the PS6 all the way to the mid-2030s.
In your other comment you mentioned GDDR7, assuming it is already the most likely option. But thinking back to the PS5 using RDNA 1.x ROPs to get double the amount within cost, I'm still leaning towards GDDR6 being the same situation: more memory with a more complicated controller, versus controller simplicity with more expensive and still less memory, will lean PlayStation towards the older, cheaper and more complicated option, especially as GDDR7 would be an easy tick-list improvement for a PS6 Pro.
Nvidia's recent vision-transformer (new) versus CNN (old) pitch is in all likelihood sleight-of-hand marketing via partial attention (partial-context, full-frame analysis), and unlikely to scale down to RTX 20xx. But I do foresee this being added as full self-attention to PSSR on PS6, where latency-hiding the analysis on one crossfire/SLI GPU unit in the APU, or doing it quicker across all three GPU units at a frame of latency, seems more feasible when you don't have to account for bandwidth in and out of an external NPU. Similarly, certain parts of raster/shader benefit from more CUs in specific tasks, like generating the scene's depth buffer that serves as the initial ray of a ray trace, so that again would benefit from the latency reduction of a three-way crossfire GPU, to then kickstart the BVH intersection tracing on just one or two units. So I still think there are a lot of software-flexibility benefits to a multi-GPU unit when done with AMD's very general-purpose WGPs.
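On the full versus partial self-attention point: attention cost grows with the square of the token count, which is why full-frame context is so much more expensive than windowed context, and why it matters where that work runs. A toy FLOP count, with made-up patch sizes purely for illustration:

```python
# Why "partial attention" is cheap and full self-attention is not:
# attention FLOPs scale as O(n^2 * d) in token count n. Illustrative only.

def attn_flops(n_tokens: int, dim: int) -> int:
    # QK^T plus the attention-weighted sum of V: ~4 * n^2 * d FLOPs
    return 4 * n_tokens**2 * dim

# A 2560x1440 frame as 16x16 patches -> 160 * 90 = 14400 tokens.
full = attn_flops(14400, 64)
# Same frame split into 144 independent local windows of 100 tokens each.
partial = 144 * attn_flops(100, 64)

print(f"Full-frame self-attention: {full / 1e9:.1f} GFLOPs per layer")
print(f"Windowed (partial):        {partial / 1e9:.2f} GFLOPs per layer")
print(f"Ratio: {full // partial}x")  # equals the number of windows
```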
Even a PS6 Pro could probably just be bigger CU counts on a better lithography for the GPU units, attached to GDDR7.
Then how come AMD doesn't use it on Ryzen CPUs?

CPUs are more latency-sensitive than bandwidth-sensitive.
How is that comparable with the slim? Are you suggesting that the PS5 or Pro won't sell in 2030? And logistics is a long game: they could design the launch console around GDDR6 and end up on GDDR7 at GDDR6 specs because of logistics and the falling price of GDDR7; they can be forward compatible that way. But if the prohibitive cost of 48GB of GDDR7 at launch makes the console cost £1000 in parts, that isn't the superior option when you didn't need GDDR7 speeds and 3x downclocked GDDR6 would have been bigger or the same.
That's like saying the PS4 should have stuck with good ol' GDDR3, even though the latest PS3 Super Slim revision (28nm RSX) adopted 2x GDDR5 chips vs 4x GDDR3 chips without compromising B/C. It's just more economical, plain and simple.
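The chip-count side of that economics, sketched (densities match the 4-chip to 2-chip example above; treat it as the shape of the argument rather than real BoM data):

```python
# Fewer, denser chips for the same capacity: the PS3 Super Slim pattern.
def chips_needed(capacity_mbit: int, chip_mbit: int) -> int:
    return -(-capacity_mbit // chip_mbit)  # ceiling division

VRAM_MBIT = 256 * 8  # the PS3's 256 MB of VRAM
print(f"512Mbit GDDR3 chips needed: {chips_needed(VRAM_MBIT, 512)}")   # 4
print(f"1Gbit GDDR5 chips needed:   {chips_needed(VRAM_MBIT, 1024)}")  # 2
```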
The multi-GPU setup sounds too complicated for most devs, kinda reminiscent of Cell in a way... most would just use 1 GPU out of 3, so you'd be stuck with an expensive console that barely anyone utilizes to its full potential.
Sony will not go as wide as you suggest; in fact they're willing to go narrower, judging by the PS5 GPU (fewer CUs, higher clocks).
Don't expect a huge raster bump. Sony/AMD are following nVidia's approach (less brute force, more AI).
At best it might be equal to RTX 4090... maybe...
That is a bad idea. Sony has a good tick-tock strategy with the base and Pro models; suicidally copying the MS approach there does not make sense.
If they have a base PS6 and a handheld again, with a Pro coming down the line, maybe. But looking at what MS did and copying it when it did not work (it did not show promise; it was an attempt to sandwich the PS5 from the low and high end at the same time more than anything, IMHO) does not mean Sony should follow. By that logic the PS5 should have shipped with PSVR/PSVR2 bundled in, because MS did it with Kinect.
A Pro is not just an overclocked, slightly bigger chipset. It is an evolution of an existing platform and a testing ground for a new generation. You need time to observe what developers do with the existing console (including internal devs and select third parties with pre-release devkits) and where the industry is heading, to chart the path to a Pro model and then to the new generation.
The “we launched our mid-generation upgrade with the base console” idea is simply not the point nor, no offence meant, a good idea or a good understanding of what mid-generation upgrades are meant for.
A non-handheld PS6 is going to use 24-32Gbit GDDR7 at whatever speed bin is cheap at the time. HBM is too expensive (it's not even used in $2000 GPUs, so why would consoles be able to afford it?) and GDDR6 is EOL: it is not going to get faster or, more importantly, get denser chips.

It doesn't need to get faster. In my hypothetical scenario of a PS6 built from modern mobile Zen chiplets with 3D cache on a three-unit stack of GPU layers, each GPU unit a more modern Pro GPU downclocked to hit a 250-watt limit, each GPU unit could in effect have its own 16GB of GDDR6, effectively tripling system memory performance using old GDDR6.
Who told you HBM increases latency?
PS5 Pro will be phased out by then, since the console enthusiast audience will no longer be interested (same thing with PS4 Pro vs PS5).
But AMD doesn't use HBM, even on their GPUs. They use the same L3 cache as their CPUs, just on the side.
Even if Cerny became Crazy Ken Vol. 2 (highly unlikely), this would still be a bad idea. So you expect the PS6 to have 3 GPUs, each with a 256-bit GDDR6 interface, for a 768-bit total? And this SoC will be how many mm²? You're going to get some laptop-variant CPU paired with a mid-range AMD GPU with 24-32GB.
They don't use it, not because of latency, but because it's expensive as hell.
Not even nVidia uses HBM on their consumer GPUs (even the prosumer RTX 5090) and they dominate the GPU field (90% marketshare).
Stacked, with a 3D cache. Let's consider Sony's previous PlayStation interfaces that others wouldn't have considered; even the I/O Complex is a modern interface others wouldn't have come up with. When their choice is between expensive third-party parts and complex internal EE solutions, they favour the latter, because it is at cost and ends up cheaper very quickly.
Very well said.

The Series S situation is worse than that. PlayStations always get support well past the launch of their successors. These days it's cross-gen, but back then it was bespoke versions or even distinct games. The PS4 IS Sony's Series S: a way to still enjoy games if you aren't ready for $500 at the moment. Except you probably already have a PS4. The Series S wouldn't be needed if MS wasn't so dead set on burying the XB1. Miles, GT7, and Horizon FW coming out on PS4 meant happier PS4 owners, who are your biggest pool of prospective PS5 owners.
AMD used HBM on the Radeon VII.
I still don't think you guys are reading the article and how interposer-free HBM can benefit a unified memory architecture.
I've been reading for over a decade about how HBM will become mainstream (i.e. organic vs silicon interposer), but so far it's been a nothingburger.
HBM is high-bandwidth and low-latency, which benefits both the CPU and GPU without adding a large amount of L3 cache.
The interposer is the reason HBM is expensive. Removing it now makes HBM suitable for a console unified memory architecture.
@Xyphie
What is Sony's alternative, other than major EE and software innovation, for the PS6 to remain better bang for buck than any generic competitor with money, without overspending on BoM or underdelivering on a launch PS6 spec, putting daylight between itself and the PS5 Pro to repeat the successes of the PS1, PS2, PS4 and PS5?
Why would it cost £1000? Many of the PS5 innovations, like the I/O complex and the use of SSD modules with multiple channels to increase bandwidth, have already been amortised. The Pro GPU x3 at a smaller node would be amortised too, so the 3D cache to interface it all would surely be the biggest BoM cost, especially if downclocking the GPU units on newer RDNA and not going for the newest lithography.
A $1000 console is DOA. Consoles need to be mass market products, otherwise game devs won't bother.
PSVR2 has probably taught Sony a lesson or two...
If it costs £1000 with GDDR7, you can be 100% sure it's going to cost even more with GDDR6 (more chips)...
The cost per transistor is not going down significantly with newer nodes, so 3x the transistors could require well over 2x the silicon cost. It makes more sense to minimise total area and maximise clocks within your power budget, which is exactly what Sony did with the PS5.
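A hedged illustration of that cost-per-transistor argument; the area, shrink, and wafer-cost numbers are invented placeholders, only the shape matters:

```python
# If cost per transistor is roughly flat, tripling the logic roughly
# triples silicon cost even after a shrink. All numbers are placeholders.

PRO_DIE_MM2 = 280     # assumed PS5 Pro-class die area
SHRINK = 0.7          # assumed area scaling on the next node
COST_INFLATION = 1.4  # assumed cost-per-mm^2 increase on the next node

same_node = 3 * PRO_DIE_MM2                       # 3x silicon, current node
next_node = 3 * PRO_DIE_MM2 * SHRINK * COST_INFLATION

print(f"3x Pro-class dies, current node: {same_node:.0f} relative cost units")
print(f"3x Pro-class dies, shrunk node:  {next_node:.0f} relative cost units")
# The shrink saves area, the new node costs more per mm^2, and the total
# barely moves: "well over 2x" the cost of a single chip either way.
```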
Where are you seeing the cost exploding on an APU with 3 GPU units and a Zen processor?
We've seen Sony build the most predictable SoC with a mid-range GPU with a 256-bit GDDRx bus for 4 consoles in a row, why do you expect this to change? If you bet that the PS6 is just going to be some Zen 6-7 CPU paired with a mid-range RX 10060-10070-derived 256-bit GDDR7 UDNA GPU, you'll more than likely be right. And with rumors abound that Sony will do a 2-SoC strategy, my expectation is that 256-bit is the high end.

Not 4 gens, and even just looking at those 2 gens you are downplaying what was delivered. The innovation started with the PS4 using an advanced, EE-innovated hUMA setup for an APU when the generic option was split RAM with DDR and ESRAM, like the competition. The PS4 set a new expectation of what was possible, hence the X1X copied it.
Look at what the feature set nVidia is launching with Blackwell does, that's the baseline Sony and AMD will catch up to in 3-4 years.
But that's only if chasing the latest lithography, when mixing and matching is now very much the norm for custom parts, and PlayStation fabs at least 10M PS5 Pro chips minimum, which gets beyond such direct costing.
Noob here, but based on the specs, is it backwards compatible with PS5/PS4 games?

We will never see a console launch now that isn't BC.
PS6 is design complete and in pre-si validation already, with A0 tapeout scheduled for late this year.
Sony's usual cadence is 2 years from A0 tapeout to console release.

Hey! Considering that 2025 only just started, wouldn't we be looking at a late 2026 release here? Are we even sure that this is the PS6 we're talking about?
They’ve been at 7 years almost on the dot for three generations. I think they’ll continue. Plus, November is right on time for the holidays.

Don't forget:
November 2006: PS3
November 2013: PS4
November 2020: PS5
November 2027: PS6
It applies to the 7nm to 5nm transition, which is why Sony already had to increase the price of the Pro vs. the base model. Now if you want to ship the equivalent of three of those Pro chips, that's 3x the cost on 5nm, and possibly slightly less on a newer process.
Higher clocks offer less and fight power efficiency, power limits, and stability; the further you clock above 1.4GHz, the less performance per watt you get in parallel processing. That's a theoretical calculation that isn't going to move much with real-world improvements or tweaks.
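A toy model of that clocks-versus-efficiency claim. Dynamic power scales roughly with C·V²·f, and higher clocks need higher voltage; the linear V(f) curve below is a generic stand-in, not a real RDNA voltage table:

```python
# Why perf/W falls as clocks rise: power ~ V^2 * f, and f pushes V up.
# The voltage curve is an illustrative assumption, not real silicon data.

def relative_power(f_ghz: float) -> float:
    v = 0.7 + 0.3 * f_ghz  # assumed voltage required at this clock
    return v * v * f_ghz   # dynamic power, up to a constant factor

for f in (1.4, 1.8, 2.23, 2.6):  # 2.23 GHz is the PS5 GPU's peak clock
    print(f"{f:.2f} GHz -> perf/W = {f / relative_power(f):.2f} (relative)")
```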
So how exactly do they do a console with 2-3x the performance of a PS5 Pro below a £750 BoM?
Well, we know the Pro is massively overpriced and not intended to be a mass-market, mainstream-priced product.
Yes, you lose power efficiency by clocking higher, but the price of the console is limited by the size of the APU, so it makes sense to sacrifice power efficiency to sell at a cheaper price. Which again, is exactly what Sony did with the PS5.
As for my prediction, I think Sony has a choice between building the smallest GPU that can fit a 256-bit bus with GDDR7 for ~1TB/s, which on 3nm or 2nm is going to be really expensive, or building a smaller chip (e.g. 150 mm²) with a 128-bit bus and relying on cache. The GCD for the 7900 XTX is 300 mm², so on 2nm it could be half that, and perhaps there is still some room to increase clock speeds.
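Rough peak numbers for those two options (the 32 Gbps GDDR7 bin is an assumption about what will be affordable by then):

```python
# The two memory options above, in peak-bandwidth terms. Assumed speed bin.
def bw_gbs(bus_bits: int, gbps: float) -> float:
    return bus_bits * gbps / 8

print(f"256-bit GDDR7 @ 32 Gbps: {bw_gbs(256, 32):.0f} GB/s")  # ~1 TB/s, bigger die
print(f"128-bit GDDR7 @ 32 Gbps: {bw_gbs(128, 32):.0f} GB/s")  # 512 GB/s, small die
# The 128-bit option leans on a large on-die cache to recover effective
# bandwidth, trading DRAM PHY area for SRAM.
```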