
Next-Gen PS5 & XSX |OT| Console tEch threaD

Status
Not open for further replies.

ethomaz

Banned
Some comments from ERA, because I'm interested in understanding that change... so maybe by posting here somebody can help me :)

Miniature Kaiju said:
Oh yeah, just noticed that. I guess that means each SIMD-VU can finish a wavefront in 2 cycles rather than 4. Fewer wavefronts per cycle, but faster, so lower latency. I guess that's a way of making more efficient use of any speed gains.
dgrdsv said:
Wider SIMDs usually mean less efficiency, not more. You're running the same operation on more data but you're also stalling more when there's not enough data for processing. But with next gen going to 4K basis this might be the right move, especially if they'll do something to increase utilization in compute.
Miniature Kaiju said:
I imagine that such a move would come accompanied by deeper wavefront queues and smarter scheduling. As you said, 4K and more mixed compute might be enough to keep it fed.

I also imagine that faster cycles, coupled with something like the variable wavefront sizing and a deep, self-feeding queue might do quite a bit to enhance compute RT.

Indeed, being wider (SIMD-32 instead of SIMD-16) means less efficiency for normal tasks... but it does help scheduling for compute tasks.

So is that good for 4K?

I'm very curious about benchmarks now :D
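A back-of-the-envelope model of the issue-latency change being discussed. All the widths here (wave64, 16-lane vs 32-lane SIMDs, 4 vs 2 SIMDs per CU) are the rumored/assumed numbers from the thread, not confirmed specs:

```python
# Rough model of the rumored change: 2x SIMD-32 replacing 4x SIMD-16 per CU.
# All figures are the rumored ones from the thread, not confirmed specs.

def cycles_to_issue(wavefront_size: int, simd_width: int) -> int:
    """Cycles one SIMD needs to issue a wavefront, one simd_width chunk per cycle."""
    return -(-wavefront_size // simd_width)  # ceiling division

# GCN today: a wave64 wavefront on a 16-lane SIMD takes 4 cycles to issue.
print(cycles_to_issue(64, 16))  # 4

# Rumored change: wave64 on a 32-lane SIMD takes 2 cycles, i.e. lower latency.
print(cycles_to_issue(64, 32))  # 2

# Total lanes per CU are unchanged (4*16 == 2*32), so peak throughput is the
# same; the win, if any, is in latency and scheduling, not raw ALU count.
assert 4 * 16 == 2 * 32
```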
 

SonGoku

Member
If the game is highly optimised for the original hardware, and if the change to HBM doesn't bring any great improvement in bandwidth, they might end up breaking the original optimisation - leading to frame rate drops etc ..

One possible example I can think of is if the original software is optimised to fetch data in the exact size of (and aligned to) the original GDDR6 data width .. the change to the wider HBM data bus width might result in much reduced utilization of the wider bus - could even be a disaster for performance .. but I need to think about it more.

Working extreme example - the original game uses scattered 32-bit-wide aligned data fetches .. when shifting to a 128-bit data bus width (HBM), those 32-bit fetches now utilise only 25% of the available bandwidth.

It's an extreme example.. maybe/maybe not [the old adage - if it can break it will]
Even using cheaper binned HBM3 chips, 24GB worth of HBM3 stacks will provide bandwidth in excess of 1TB/s. They can brute-force past any incompatibility, plus lower latency to boot!
Just look at the Xbone: low-latency SRAM and DDR3 + move engines didn't have any trouble being emulated with GDDR5.

If the X didn't have any issues, I fail to see how this theorized switch would.
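The quoted worst case can be checked with quick arithmetic. The bus and fetch widths below are the hypothetical ones from the post, not real console specs:

```python
# Quick arithmetic for the quoted worst case: scattered 32-bit fetches replayed
# on a wider bus. All widths are the hypothetical ones from the post.

def bus_utilization(fetch_bits: int, bus_bits: int) -> float:
    """Fraction of each bus transaction actually carrying requested data."""
    return min(fetch_bits, bus_bits) / bus_bits

# 32-bit fetches sized/aligned for a 32-bit-granularity GDDR6 setup: fully used.
print(bus_utilization(32, 32))   # 1.0

# The same fetches against a 128-bit HBM bus: only a quarter of each
# transaction is useful data, matching the "25%" figure in the post.
print(bus_utilization(32, 128))  # 0.25
```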
 

Aceofspades

Banned
PS5 having a major advantage in loading speed and data streaming will be huge and is a smart engineering decision. It will also make DF remove the words "loading times" and "asset pop-in" from their dictionary for an entire gen, unless MS catches up with a hardware revision somehow 🤣
 

Bogroll

Likes moldy games
PS5 having a major advantage in loading speed and data streaming will be huge and is a smart engineering decision. It will also make DF remove the words "loading times" and "asset pop-in" from their dictionary for an entire gen, unless MS catches up with a hardware revision somehow 🤣
Dream on.
 

FranXico

Member
PS5 having a major advantage in loading speed and data streaming will be huge and is a smart engineering decision. It will also make DF remove the words "loading times" and "asset pop-in" from their dictionary for an entire gen, unless MS catches up with a hardware revision somehow 🤣
Nah, loading isn't going away yet.
 

SonGoku

Member
[image: 334x00.png]

Gives credence to Sony taking a loss on PS5, exciting times ahead.
 

ethomaz

Banned
I'm confused now, wouldn't more SEs mean more efficient use of CUs?
I will have to study more... but in theory the increase in SEs decreases efficiency (AMD said that in their own tests)... and they changed the CUs to SIMD-32, which in theory decreases efficiency too.

But you can cover that decrease in efficiency by adding more schedulers and fixed-function units to each SE (Nvidia does that).

Let's see how it will turn out.
 

xool

Member
Wait a minute, where's this SIMD-16 stuff from? I thought all we had was KOMACHI_ENSAKA's tweet, which just said 2 SIMD32 per CU.

ok I see it now - a change from four 16-slot SIMDs (old) to two 32-slot SIMDs (new)

[ignore me]
 
Maybe it's just for devkits; one of the first pastebin leaks claimed 32GB devkits and 24GB retail.


Excuse the old man's question, but what's pastebin and why should we trust it?




If 512-bit really was an option, the old 16GB GDDR6 + 8GB GDDR4 rumor might be true after all.
 

SonGoku

Member
Excuse the old man's question, but what's pastebin and why should we trust it?
Atm I'm treating all rumors as speculation, not giving more credibility to any one in specific; the pastebin I mentioned is in the OP.
If 512-bit really was an option, the old 16GB GDDR6 + 8GB GDDR4 rumor might be true after all.
heh... idk, feels super wasteful to use a 512-bit bus for a mere 16GB
Is there a full slide deck up somewhere?
 

Ar¢tos

Member
Which regions get PS5 first .. or simultaneous worldwide launch etc ..


Anyone know what the slide "The 99, Not the 1" actually means?
I have no idea, can't figure it out either.
I'm dumb, I missed the grey text above country rollout.


Finally using Sony music in 1st party titles! I have been wanting this for ages.
Games with great music, without having to worry about limited music licenses.
 
I'm just theorizing that if they were able to make such a drastic arch change, then 1.8GHz Gonzalo speeds suddenly don't sound so crazy. Navi is vastly different from Vega.

I will have to study more... but in theory the increase in SEs decreases efficiency (AMD said that in their own tests)... and they changed the CUs to SIMD-32, which in theory decreases efficiency too.

But you can cover that decrease in efficiency by adding more schedulers and fixed-function units to each SE (Nvidia does that).

Let's see how it will turn out.


it might increase power draw at full load somewhat, but this will be more than made up for by the increase in real-world perf per FLOP.

the geometry bottleneck is probably the biggest contributor to idle shader cycles in GCN. idle shaders still have to be kept powered without doing work. this uses up power - much more power than the additional silicon would draw. that's (a big part of) why Nvidia is more energy efficient.
 

Tripolygon

Banned
Which regions get PS5 first .. or simultaneous worldwide launch etc ..


Anyone know what the slide "The 99, Not the 1" actually means?
It just means focus on the customers you already have, not the ones you don't. This is especially important as the console is nearing its 100 millionth unit sold. The 99 generate more money for the company, so they need to focus on them rather than worrying about how to get the 1 to buy a new PS4.
 

LordOfChaos

Member
Fafalada, LordOfChaos - any comment on this?
[image: mv2GllF.png]


Seems like a huge arch change that nobody is discussing.


I think this will largely change the "overhead" per ALU rather than impacting the maximum ALU counts.
So twice as many shader engines, each holding half as many CUs (SIMD-32 groupings rather than SIMD-16, so now 8 CUs per SE instead of 16).

So each front end will be responsible for fewer shaders, so each one will be better used, is what I think. Fewer stalled shaders = more power utilized, more performance per paper GFLOP.


Just to visualize the architecture
[image: small_slide-3.jpg]




Edit: It may also change (double) the geometry & rasterization ratios, as I think those were old GCN bottlenecks.
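The ratios being described can be checked with quick arithmetic. The shader-engine counts are the rumored ones, and the total CU count is just a hypothetical figure to make the ratios concrete:

```python
# Per-shader-engine ratios implied by the rumored doubling of front ends.
# All counts are thread speculation; total_cus is a hypothetical example.

total_cus = 64           # hypothetical total, just to make the ratios concrete

old_shader_engines = 4   # classic GCN limit of 4 SEs
new_shader_engines = 8   # the rumored doubling

# Old layout: each front end has to keep 16 CUs fed.
print(total_cus // old_shader_engines)  # 16

# Rumored layout: each front end feeds only 8 CUs, so fewer stalled shaders
# per front end - and geometry/raster blocks per CU double along with the SEs.
print(total_cus // new_shader_engines)  # 8
```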
 

SonGoku

Member
LordOfChaos It would translate into more efficient use of resources? Maybe Navi flops > Vega flops.
This change really got my attention, because it was said to be impossible to add more than 4 SEs without a new post-GCN arch.
 

LordOfChaos

Member
LordOfChaos It would translate into more efficient use of resources? Maybe Navi flops > Vega flops.
This change really got my attention, because it was said to be impossible to add more than 4 SEs without a new post-GCN arch.

Ayup, I think it's getting more Nvidia-ey - an Nvidia warp is a group of 32 threads. Not that it would completely bridge the gap between paper GFLOPs and real performance, but every change seems to move in that direction.

Right now AMD pays a 4-clock-cycle price for any execution, which is some of the difference in real-world shader occupancy; I think this reduces it to 2.
 

Aceofspades

Banned
I hate feeding into this garbage, but I'll bite.

It seems that after the initial waves of leaks last month, we are just getting useless conflicting reports made of fanboys' wet dreams or attention-seeking internet users having fun.
 
Is that related? Seems more like why AMD made GCN like it is.

well, it has quite a good visualization of what a SIMD is and what it does. probably why Komachi posted it.

that one is particularly good:

[image: 02mrkk5.jpg]


but to be honest, i don't really see what the benefit of two 32-lane instead of four 16-lane SIMDs per CU should be.

the main story is probably the doubling of the front ends.
 

CyberPanda

Banned
Ninja Theory prepares for the arrival of Ray Tracing with Anaconda

At the beginning of the current generation, the current consoles - and especially the original Xbox One - were accused of not representing a technological break with the previous generation. Maybe that's why this was the first generation of modern consoles to get more powerful mid-generation models. However, ahead of the new consoles arriving in 2020, the industry does seem to be anticipating new techniques and more advanced technologies.

One of these techniques, already emerging as a protagonist of the next generation - as 1080p and image reconstruction techniques have been in this one - is the much-discussed ray tracing: a rendering technique that requires dedicated hardware, or a great deal of raw power to achieve through emulation. It has recently been rumored that one or both models of the next Xbox will support it, and today we can tell you that Ninja Theory is beginning to gear up to use it in its upcoming titles.

Ninja Theory bets on real-time ray tracing

As you can see on Ninja Theory's website, the newly published job listings ask for experience with UE4 (fairly normal), real-time ray tracing (typical of the more powerful new Nvidia cards), and DirectML, a new graphics API presented by Microsoft at GDC this year and incorporated into DirectX, which allows developers to write code compatible with virtually any type of GPU. So Ninja Theory is preparing not only for the arrival of Anaconda and Scarlett, but also for a greater focus on PC.

Surely we will very soon see similar moves from other members of Xbox Game Studios, but for now it is a very good sign that Ninja Theory is already preparing for the future.​
 

ethomaz

Banned
well, it has quite a good visualization of what a SIMD is and what it does. probably why Komachi posted it.

that one is particularly good:

[image: 02mrkk5.jpg]


but to be honest, i don't really see what the benefit of two 32-lane instead of four 16-lane SIMDs per CU should be.

the main story is probably the doubling of the front ends.
That is what I thought... this link just shows how GCN worked up to 5.1 and why AMD chose that design over the previous arch.

It does nothing to tell us how increasing the SIMD lane width will affect performance or other points.
 

ethomaz

Banned
PS5 SSD Leak:
Dev from a AAA studio here! Tested a next-gen build of our own game, and when we try to harness the true power of the PS5, the loading times hold up compared to the Spider-Man demo. We don't know how Cerny did it, but there is some magic sauce here.
 

Aintitcool

Banned
So Sony is going more innovative and MS is going more brute force, possibly a 2TF difference. Very interesting; if MS is really on Vega they might have a ray-tracing disadvantage. That troll rumor about PS5 having Cell 2 gave me hope.
 

FranXico

Member
So Sony is going more innovative and MS is going more brute force, possibly a 2TF difference. Very interesting; if MS is really on Vega they might have a ray-tracing disadvantage. That troll rumor about PS5 having Cell 2 gave me hope.
LOL, yeah, I wish too. But we both know Cell 2 isn't going to happen.
 

TeamGhobad

Banned
So Sony is going more innovative and MS is going more brute force, possibly a 2TF difference. Very interesting; if MS is really on Vega they might have a ray-tracing disadvantage. That troll rumor about PS5 having Cell 2 gave me hope.

This is all rumors and speculation, mostly BS, and unfortunately I don't think E3 will answer any questions either. But according to the leaks, MS will use more than 2 threads per core for ray tracing.
 

Aceofspades

Banned
So Sony is going more innovative and MS is going more brute force, possibly a 2TF difference. Very interesting; if MS is really on Vega they might have a ray-tracing disadvantage. That troll rumor about PS5 having Cell 2 gave me hope.

If that ends up true, a 2TF GPU power difference is unnoticeable compared to better asset streaming and loading.

I would say Sony engineers have outsmarted MS yet again.
 