
Next-Gen PS5 & XSX |OT| Console tEch threaD

Status
Not open for further replies.

Racer!

Member
Fwiw
Since 7nm+ is only sprinkling in EUV on a few critical layers rather than using it whole cloth, it sounds like AMD is downplaying expectations for it. So for a price-sensitive console, probably don't expect it. Even if yields are the same, it wouldn't make sense for TSMC to charge the same for two differently performing nodes.

Papermaster quote is 2 years old.

Sony with PS5 is a long term customer over several nodes, and would probably be treated favourably.
 

SonGoku

Member
LordOfChaos LordOfChaos
Papermaster expects foundries will begin to use extreme ultraviolet (EUV) lithography starting in 2019 to reduce the need for quad patterning. EUV “could bring a substantial reduction in total masks and thus lower costs and shorten cycle time for new designs,” he said.
“Foundries will introduce [EUV] at different rates but…I urge them all to go as fast as they can,” he said.
 

Lort

Banned
Wanted to post something I've calculated. With Zen 2 clocked at 3.2 GHz, supporting AVX-256, and 7 cores dedicated to games, you still can't emulate PS3 games reliably. The Cell was effectively an APU, with SPUs doing floating-point calculation like a GPU but at speeds far greater than the fastest GPU ever made. To emulate PS3 games you need a 3.2 GHz GPU or 8 cores dedicated to gaming. The Cell was truly a beast in rendering, considering that 15 years on (when the PS5 releases) we won't be able to reliably recreate its capabilities. That is bananas.

That is simply not true... PS3 remasters ran with upgrades on PS4.
 

stetiger

Member
That is simply not true... PS3 remasters ran with upgrades on PS4.
Bruh, I don't know if you are serious or kidding. Emulation is how you would do BC; a remaster means the game is rebuilt and its functions repurposed. I am talking specifically about emulation.
 

stetiger

Member
Cell only had 6 SPEs available for games
7 Zen 2 cores at 3.2 GHz have superior floating-point performance; the GPU could assist Cell emulation too.
You could use the ACEs, but their speed is too slow not to run into some issues, unless you could split those instructions, which would not be easy. I thought the Cell had 8 SPEs with one disabled for yield, but 6 sounds right. PS3 emulation requires 6 units capable of running double-precision ops at 3.2 GHz. The PPE will be very simple to emulate; the SPEs are a different beast. You still need to convert RISC to x86, which is up to Sony and relatively simple.
 
Last edited:

SonGoku

Member
You could use the ACEs, but their speed is too slow not to run into some issues, unless you could split those instructions, which would not be easy. I thought the Cell had 8 SPEs with one disabled for yield, but 6 sounds right. PS3 emulation requires 6 units capable of running double-precision ops at 3.2 GHz. The PPE will be very simple to emulate; the SPEs are a different beast. You still need to convert RISC to x86, which is up to Sony and relatively simple.
1 disabled, 1 dedicated to OS (hypervisor) 6 remaining for games
Zen2 peak floating point perf was posted a few pages back:
32 x 8 x 3.2 = 819 GFLOPS
PlayStation 3's Cell CPU achieves a theoretical maximum of 230.4 GFLOPS in single precision floating point operations and up to 15 GFLOPS double precision
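The figures above follow from per-cycle throughput x cores x clock. A quick sketch of the arithmetic (the 8 SP flops/cycle per SPE and the 25.6 GFLOPS PPE contribution are the commonly quoted numbers; treat them as assumptions):

```python
# Theoretical peak: FP ops per core per cycle x cores x clock (GHz).
def peak_gflops(flops_per_cycle, cores, clock_ghz):
    return flops_per_cycle * cores * clock_ghz

zen2_sp = peak_gflops(32, 8, 3.2)  # 819.2 -> the "819 GFLOPS" figure above
spe_sp  = peak_gflops(8, 8, 3.2)   # 204.8 GFLOPS for all 8 SPEs
cell_sp = spe_sp + 25.6            # + PPE's VMX unit ~= 230.4 GFLOPS total
```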
 

stetiger

Member
1 disabled, 1 dedicated to OS (hypervisor) 6 remaining for games
Zen2 peak floating point perf was posted a few pages back:
32 x 8 x 3.2 = 819 GFLOPS
Had to verify, but that is right on the money. So we are looking at better than an i7-8700K (base clock) on PS5, with enough power to emulate the PS3 CPU-wise. GPU emulation is also a bit of a concern, as the RSX used separate vertex and pixel shader pipelines, but that should be easy as well on GCN/RDNA. Looks like we should get excited if Sony is serious, even though PS3 back-compat is really not that much of a system seller.
CPU Family | DP IPC | SP IPC
AMD Jaguar and Puma | 4 | 8
AMD Ryzen 3 (7 nm) | ? | ?
AMD Ryzen and AMD Ryzen 2 | 16 | 32
IBM PowerPC A2 (Blue Gene/Q), per core | 8 | SP elements are extended to DP and processed on the same units
IBM PowerPC A2 (Blue Gene/Q), per thread | 4 | SP elements are extended to DP and processed on the same units
Intel Haswell - Intel Coffee Lake (and Devil's Canyon?) | 16 | 32
Intel Ice Lake | ? | ?
 

stetiger

Member
Also note that Zen 2, 8 cores at 3.2 GHz, is at the very least 2 (clock speed) x 4 (instructions per cycle) = 8x the speed of Jaguar. Man, I am happy. The PS4 to PS5 jump is more impressive than PS3 to PS4, despite the slowing of technology. On the CPU side at the very least.

EDIT: also on the storage side we are going from 50 MB/s to around 4,000 to 5,000 MB/s. Memory speed goes from 176 GB/s to 400 GB/s+ with GDDR6.

EDIT2: even with a 10 TF RDNA GPU we will have a great visual and physics leap.
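The generational multipliers in the post above, worked through (the 1.6 GHz Jaguar base clock and the ~4x per-clock throughput figure are the post's assumptions, not confirmed specs):

```python
# Rough generational gains implied by the figures above.
cpu_gain = (3.2 / 1.6) * 4   # 2x clock * 4x per-cycle throughput = 8x
io_gain  = 4000 / 50         # HDD -> SSD, low end of the estimate = 80x
mem_gain = 400 / 176         # GDDR5 -> GDDR6 bandwidth ~= 2.27x
```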
 
Last edited:

SonGoku

Member
Also note that Zen 2, 8 cores at 3.2 GHz, is at the very least 2 (clock speed) x 4 (instructions per cycle) = 8x the speed of Jaguar. Man, I am happy. The PS4 to PS5 jump is more impressive than PS3 to PS4, despite the slowing of technology. On the CPU side at the very least.
A leaked synthetic bench of Zen 2 was posted in this thread, and it was "only" 5x the Jaguar score.
Of course, there's more to a CPU than raw performance.
 

stetiger

Member
A leaked synthetic bench of Zen 2 was posted in this thread, and it was "only" 5x the Jaguar score.
Of course, there's more to a CPU than raw performance.
Yes and no: all games on Jaguar are probably aiming at 8-bit instructions, which is more than enough at 1080p; the difference in performance will show at higher resolutions. Jaguar did have a custom cache with low latency, so assuming the same is applied to PS5 you can expect an increase there as well.

EDIT: Remember that the Jaguar in the PS4 had some serious work done on it for latency, which greatly affects real-world performance.
 
Last edited:

SonGoku

Member
Yes and no: all games on Jaguar are probably aiming at 8-bit instructions, which is more than enough at 1080p; the difference in performance will show at higher resolutions. Jaguar did have a custom cache with low latency, so assuming the same is applied to PS5 you can expect an increase there as well.
Here is the post I mentioned, btw
 

stetiger

Member
Here is the post I mentioned, btw
Yeah, I wouldn't read too much into it. Interesting nonetheless. I wonder if PS5 might move to 16-bit instructions to save on space, given that you don't really need 32-bit for gaming. With that you can reduce power consumption and free up space for bigger GPUs.
 

TLZ

Banned
Finally we'll be able to play all MGS games on one machine.

 

xool

Member
Since 7nm+ is only sprinkling in EUV on a few critical layers rather than using it whole cloth, it sounds like AMD is downplaying expectations for it. So for a price-sensitive console, probably don't expect it. Even if yields are the same, it wouldn't make sense for TSMC to charge the same for two differently performing nodes.

AFAIK TSMC's "7FF+" is now called "6nm" (both add additional layers made using EUV). There's an increase in density, and possibly a small decrease in power consumption, but zero increase in frequency/performance. It's cheaper, though.
 

xool

Member
No, they are different:
6nm is beginning risk production this year, while 7nm+ is entering volume production already.

[beat me to it] - yes, TSMC are claiming 6nm builds upon 7nm+.


I'm not sure there's a huge difference - the numbers I found said 7nm+ is 17% better on area (than N7), and 6nm is 18% better (than N7).
 
Last edited:

SonGoku

Member
[beat me to it] - yes, TSMC are claiming 6nm builds upon 7nm+.
I'm not sure there's a huge difference - the numbers I found said 7nm+ is 17% better on area (than N7), and 6nm is 18% better (than N7).
The main reason for 6nm is to transition 7nm designs with minimal reworking.
If you build your chip around 7nm+, it offers greater benefits than 6nm.

Crazy/not-so-crazy theory time:
[image: gpuroadmap.png]

Navi 20 is next gen (aka pure RDNA). The PS5/SNEK APU will be built on the 7nm+ EUV node.
The continuous perf/watt gains bit lines up pretty well for a console chip.
TLZ TLZ 's dream redeemed?
 
Last edited:

Lort

Banned
Remasters and BC are different cases, in the same way native BC and software emulation are.

So even Naughty Dog's best coders couldn't tap the potential of the PS3... so there was all this untapped potential... apparently.

Alternatively, in the real world: even maxed out, the PS3 with all its SPU power was exactly 0 years ahead performance-wise.
 

Ar¢tos

Member
I have the solution to both problems.
Shrink the CELL to 7nm and use a chiplet design: CPU + GPU + Cell, with a more powerful Cell than the PS3's.
The Cell would do PS3 emulation, and for PS5 games it would be used to help with RT, since it can do SIMD faster than the GPU (I assume that, like every other math calculation, RT calculations can be broken down into many simpler ones).
Prophecy: The CELL will return!
 

stetiger

Member
So even Naughty Dog's best coders couldn't tap the potential of the PS3... so there was all this untapped potential... apparently.

Alternatively, in the real world: even maxed out, the PS3 with all its SPU power was exactly 0 years ahead performance-wise.
I think you don't fully grasp what bespoke hardware does. It's not that it was powerful beyond its time; it's more that it was unique and different. It used RISC instead of x86. Its design was experimental: there were other chips with one main core and assisting cores, but mostly not on consoles. Back then, before out-of-order execution won out, people thought that was the future of multi-core CPUs. It had multiple severe limitations: no strong branch prediction, no support for out-of-order execution. In many ways, no chip below 3.2 GHz and 8 cores can match its function. Naughty Dog did tap the potential of the Cell, and that kind of design would not work today. Being ahead or behind performance-wise has more to do with what task you want to do. We did not have proper wide float support in x86 CPUs until AVX around 2011. Another way to look at it: many PowerPC CPUs today support 4 threads per core; would you say that is ahead of Zen 2? Maybe in some tasks. Hyperthreading has been around since the early 2000s, so was that ahead of its time? Even though modern chips trump chips from the early 2000s, that fact remains absolute.
 

Panajev2001a

GAF's Pleasant Genius
Yeah, I wouldn't read too much into it. Interesting nonetheless. I wonder if PS5 might move to 16-bit instructions to save on space, given that you don't really need 32-bit for gaming. With that you can reduce power consumption and free up space for bigger GPUs.

With unified memory over 4GB you need 64-bit addressing to allow a single software process to address it all: definitely at 8 GB, let alone 24 GB ;). Pointer-wise you are at 64-bit, and I do not see the processor being redesigned to remove 64-bit data type support either (a super costly effort, not even semi-custom anymore...).

CPU instruction-width wise, the ISA is what it is, and I do not see it changing either. It would undo all the work that went into designing optimising compilers for Ryzen and Ryzen 2 so far...
 

stetiger

Member
With unified memory over 4GB you need 64-bit addressing to allow a single software process to address it all: definitely at 8 GB, let alone 24 GB ;). Pointer-wise you are at 64-bit, and I do not see the processor being redesigned to remove 64-bit data type support either (a super costly effort, not even semi-custom anymore...).

CPU instruction-width wise, the ISA is what it is, and I do not see it changing either. It would undo all the work that went into designing optimising compilers for Ryzen and Ryzen 2 so far...
Yes and no. I should have said 16-byte ops there. Jaguar is 8 bytes = 64-bit; 16 B = 128-bit is more than enough. But yeah, the rework would be extensive and you would not save much space on your ALUs.
 

nowhat

Member
With unified memory over 4GB you need 64-bit addressing to allow a single software process to address it all: definitely at 8 GB, let alone 24 GB ;)
Weren't there those extensions to x86 (I can't recall the name right now) that allowed addressing memory ranges over 4GB?

So technically, it could be done. It would be incredibly stupid, and it will not happen. But hey, as far as implausible rumours go: PS5 will go 32-bit!
 

Ar¢tos

Member
Ideally, we would need a brand-new CPU architecture built just with gaming in mind; all existing ones have drawbacks. This would work for consoles, but it would make PC ports harder, since PCs need to do other things and would keep x86.
 

Panajev2001a

GAF's Pleasant Genius
Weren't there those extensions to x86 (I can't recall the name right now) that allowed addressing memory ranges over 4GB?

So technically, it could be done. It would be incredibly stupid, and it will not happen. But hey, as far as implausible rumours go: PS5 will go 32-bit!

You are referring to Physical Address Extensions: https://docs.microsoft.com/en-us/windows/desktop/memory/physical-address-extension

Virtual address space is still limited for each process and it is not something you would find in modern x86-64 cores since they can natively access more than 4 GB of memory.
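The distinction matters because PAE widens the physical address width (to 36 bits on classic x86), while each 32-bit process still sees at most a 4 GiB virtual space. A quick back-of-envelope (the 48-bit figure is what current x86-64 implementations expose):

```python
# PAE extends *physical* addressing; a 32-bit process's *virtual*
# address space stays capped at 2^32 bytes regardless.
virtual_limit_gib = 2**32 / 2**30   # 4.0 GiB per 32-bit process, PAE or not
pae_physical_gib  = 2**36 / 2**30   # 64.0 GiB of physical RAM with 36-bit PAE
x64_virtual_tib   = 2**48 / 2**40   # 256 TiB virtual space on 48-bit x86-64
```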
 

TeamGhobad

Banned
I just noticed something: when it comes to theoretical power, Sony always had the superior system.
(Yes, the original Xbox was more powerful, but it came out 1.5 years after the PS2.)
 

Panajev2001a

GAF's Pleasant Genius
Yes and no. I should have said 16-byte ops there. Jaguar is 8 bytes = 64-bit; 16 B = 128-bit is more than enough. But yeah, the rework would be extensive and you would not save much space on your ALUs.

Were you talking about the FP HW bandwidth in particular (the move to 32-byte ops to allow single-cycle AVX for 256-bit ops)?
If you have code you can vectorise and enough items to process, wider CPU vectors can still make some sense (if you can work your data layout comfortably into SoA form).
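A minimal sketch of the SoA point, with NumPy's array ops standing in for wide vector units (the particle update itself is a hypothetical example):

```python
import numpy as np

# Hypothetical particle update shown both ways. In SoA form each field is
# contiguous, so the update maps cleanly onto wide vector units; in AoS
# form the same fields are interleaved, forcing strided loads/stores.
n, dt = 1024, 0.016

# SoA: one contiguous array per component
px, vx = np.zeros(n), np.ones(n)
px += vx * dt                      # one contiguous, vectorizable pass

# AoS: (position, velocity) records interleaved in a single buffer
aos = np.zeros((n, 2))
aos[:, 1] = 1.0
aos[:, 0] += aos[:, 1] * dt        # same math, strided access pattern
```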
 

ethomaz

Banned
Had to verify, but that is right on the money. So we are looking at better than an i7-8700K (base clock) on PS5, with enough power to emulate the PS3 CPU-wise. GPU emulation is also a bit of a concern, as the RSX used separate vertex and pixel shader pipelines, but that should be easy as well on GCN/RDNA. Looks like we should get excited if Sony is serious, even though PS3 back-compat is really not that much of a system seller.
CPU Family | DP IPC | SP IPC
AMD Jaguar and Puma | 4 | 8
AMD Ryzen 3 (7 nm) | ? | ?
AMD Ryzen and AMD Ryzen 2 | 16 | 32
IBM PowerPC A2 (Blue Gene/Q), per core | 8 | SP elements are extended to DP and processed on the same units
IBM PowerPC A2 (Blue Gene/Q), per thread | 4 | SP elements are extended to DP and processed on the same units
Intel Haswell - Intel Coffee Lake (and Devil's Canyon?) | 16 | 32
Intel Ice Lake | ? | ?
That table has some wrong info, no?

Bobcat
SP IPC: 4 flops
DP IPC: 2 flops

K10/Jaguar
SP IPC: 8 flops
DP IPC: 4 flops

Bulldozer/Piledriver/Steamroller/Excavator/Zen/Zen+
SP IPC: 16 flops
DP IPC: 8 flops

Zen 2
SP IPC: 32 flops
DP IPC: 16 flops

Core/Penryn/Nehalem
SP IPC: 8 flops
DP IPC: 4 flops

Sandy Bridge/Ivy Bridge
SP IPC: 16 flops
DP IPC: 8 flops

Haswell/Broadwell/Skylake/Kaby Lake/Coffee Lake/Whiskey Lake/Amber Lake
SP IPC: 32 flops
DP IPC: 16 flops
 
Last edited:

T-Cake

Member
Wanted to post something I've calculated. With Zen 2 clocked at 3.2 GHz, supporting AVX-256, and 7 cores dedicated to games, you still can't emulate PS3 games reliably. The Cell was effectively an APU, with SPUs doing floating-point calculation like a GPU but at speeds far greater than the fastest GPU ever made. To emulate PS3 games you need a 3.2 GHz GPU or 8 cores dedicated to gaming. The Cell was truly a beast in rendering, considering that 15 years on (when the PS5 releases) we won't be able to reliably recreate its capabilities. That is bananas.

You make it sound like some kind of enigma, a super-wonder chip. Yet it could barely manage 720p/30 in a lot of games. If the current Xbox One X can elevate Xbox 360 games to 4K, then I can't see why the next-gen PS5 couldn't do the same with PowerPC code from the PS3.
 

farmerboy

Member
I have the solution to both problems.
Shrink the CELL to 7nm and use a chiplet design: CPU + GPU + Cell, with a more powerful Cell than the PS3's.
The Cell would do PS3 emulation, and for PS5 games it would be used to help with RT, since it can do SIMD faster than the GPU (I assume that, like every other math calculation, RT calculations can be broken down into many simpler ones).
Prophecy: The CELL will return!

I FUCKING WISH!

This Cell talk is stirring something in my loins.

Imagine for a moment, Cerny explaining how they shoe-horned the Cell somewhere into the ps5.

Secret fucking sauce motherfuckerrrr.

BOOM!
 

Ar¢tos

Member
You make it sound like some kind of enigma, a super-wonder chip. Yet it could barely manage 720p/30 in a lot of games. If the current Xbox One X can elevate Xbox 360 games to 4K, then I can't see why the next-gen PS5 couldn't do the same with PowerPC code from the PS3.
Only the PPE was PowerPC, and it had an extended instruction set.
The SPUs were RISC-based with a custom instruction set.
The PPE can be emulated by a toaster; it's the SPUs that are the issue, and not even 2 X1Xs taped together would be able to match the SPUs' SIMD abilities.
 

T-Cake

Member
Only the PPE was PowerPC, and it had an extended instruction set.
The SPUs were RISC-based with a custom instruction set.
The PPE can be emulated by a toaster; it's the SPUs that are the issue, and not even 2 X1Xs taped together would be able to match the SPUs' SIMD abilities.

Thanks for the info. So why were multiplatform games, in general, performing worse on PS3? Is it because the code wasn't using the SPUs to their full advantage? If so, wouldn't that be a benefit when running on PS5?
 

xool

Member
Thanks for the info. So why were multiplatform games, in general, performing worse on PS3? Is it because the code wasn't using the SPUs to their full advantage? If so, wouldn't that be a benefit when running on PS5?

The SPEs had 256 KB (0.25 MB) of local memory for both data and program; if they needed to work on different blocks of memory, they had to load them from main memory. And if two SPEs tried to do that at the same time, there would be an additional wait. There was no L2 or L3 cache.

IIRC, SPE utilization graphs did show a lot of 'downtime' for this reason, even on optimised code (though I can't find one).

In practice, if you were coding for the Cell's SPEs you would have known what you were doing and gotten the vectorisation right, so the actual code optimisation shouldn't have been a huge issue. It's just that the system had a huge flaw/drawback if the data set + program exceeded the small amount of memory available.
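The constraint described above can be sketched abstractly (pure illustration: the chunk size, element size, and doubling workload are made up, and each slice stands in for a DMA transfer into the local store):

```python
# Illustration of the local-store constraint: a unit with a 256 KB local
# store must stream large data sets through in fixed-size chunks, and it
# stalls on each load unless transfers are double-buffered.
LOCAL_STORE_BYTES = 256 * 1024

def process_in_chunks(data, chunk_bytes, elem_bytes=4):
    assert chunk_bytes <= LOCAL_STORE_BYTES   # working set must fit
    per_chunk = chunk_bytes // elem_bytes
    out = []
    for i in range(0, len(data), per_chunk):  # each iteration = one DMA-in
        chunk = data[i:i + per_chunk]         # "load into local store"
        out.extend(x * 2 for x in chunk)      # compute on the resident chunk
    return out
```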
 
Last edited:

Ar¢tos

Member
Thanks for the info. So why were, in general, multiplatform games performing worse on PS3? Is it because the code wasn't using the SPUs to their full advantage? If so, wouldn't that be a benefit for running on PS5?
To get the most out of the CPU, you had to code for the SPUs directly, and that was very time-consuming. Then there was the PS3 GPU, which was crap and forced devs to use the Cell's SPUs to help. By the end, the latest 1st-party games were almost manually assigning threads per SPU!

Emulating the PS3 is hard because the SPUs were SIMD-focused and very good at it! x86 CPUs aren't made with SIMD in mind (we have GPUs for that now), so it requires a very powerful CPU to brute-force SPU emulation.
 

FranXico

Member
This Cell talk is stirring something in my loins.

Imagine for a moment, Cerny explaining how they shoe-horned the Cell somewhere into the ps5.

Secret fucking sauce motherfuckerrrr.

BOOM!
If "BOOM" is referring to the price, I agree. I wish that were true, but unless CELL2 was going to cost pennies, it's not happening.
 

Lort

Banned
I think you don't fully grasp what bespoke hardware does. It's not that it was powerful beyond its time; it's more that it was unique and different. It used RISC instead of x86. Its design was experimental: there were other chips with one main core and assisting cores, but mostly not on consoles. Back then, before out-of-order execution won out, people thought that was the future of multi-core CPUs. It had multiple severe limitations: no strong branch prediction, no support for out-of-order execution. In many ways, no chip below 3.2 GHz and 8 cores can match its function. Naughty Dog did tap the potential of the Cell, and that kind of design would not work today. Being ahead or behind performance-wise has more to do with what task you want to do. We did not have proper wide float support in x86 CPUs until AVX around 2011. Another way to look at it: many PowerPC CPUs today support 4 threads per core; would you say that is ahead of Zen 2? Maybe in some tasks. Hyperthreading has been around since the early 2000s, so was that ahead of its time? Even though modern chips trump chips from the early 2000s, that fact remains absolute.
Actually, I do; I've studied CPU design. AMD and Intel x86 CPUs have a microarchitecture that decodes x86 instructions into RISC-style micro-ops, and out-of-order execution has been around since the 1990s. The DEC Alpha engineers joined AMD. Did you know any of that?

The design of the Cell sounded good in theory until you understand its crippling memory bandwidth issues: you can't get data in and out of the SPUs efficiently. There's a reason every other processor prioritizes cache and memory I/O as part of the CPU and GPU design; without it you end up with Bayonetta on PS3.

The Cell failed at everything it was designed for. The custom PPC in the Xbox 360 had a hardware dot-product instruction and prefetch commands, the GPU could write back into the CPU, and the GPU hosted the memory controller (not the CPU, as in every other computer ever made).

The 360 was extremely smartly designed, but lacked a legion of fans prepared to flood the internet with ill-informed tech propaganda.
 
Last edited: