Interesting 'new' public PS3 GPU facts emerge

Izzy

Banned
The RSX is a 90nm GPU weighing in at over 300 million transistors and fabbed by Sony at two plants, their Nagasaki plant and their joint fab with Toshiba.

Given the transistor count and 90nm process, you can definitely expect the RSX to feature more than the 16 pipes of the present day GeForce 6800 Ultra.

NVIDIA confirmed that the RSX features full FP32 support, like the current generation GeForce 6 as well as ATI's Xbox 360 GPU. NVIDIA did announce that the RSX would be able to execute 136 shader operations per cycle, a number that is greater than ATI's announced 96 shader ops per cycle.


1) NVIDIA stated that they had never had as powerful a CPU as Cell, and thus the RSX GPU has to be able to swallow a much larger command stream than any of the PC GPUs as current generation CPUs are pretty bad at keeping the GPU fed.

2) The RSX GPU has a 35GB/s link to the CPU, much greater than any desktop GPU, and thus the turbo cache architecture needs to be reworked quite a bit for the console GPU to take better advantage of the plethora of bandwidth. Functional unit latencies must be adjusted, buffer sizes have to be changed, etc...
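To put the 35GB/s figure in perspective, here is a rough per-frame budget. Only the 35GB/s number comes from the quoted text; the 60 fps target and the 1280x720, 32-bit framebuffer are assumptions picked for illustration, so treat the result as a ballpark and not a spec.

```python
# Back-of-the-envelope budget for the 35 GB/s CPU<->GPU link quoted above.
# ASSUMPTIONS (not from the article): 60 fps target, 1280x720 frame, 4 bytes/pixel.
LINK_BYTES_PER_SEC = 35e9            # 35 GB/s, from the quoted text
FPS = 60                             # assumed frame rate
WIDTH, HEIGHT, BPP = 1280, 720, 4    # assumed 720p, 32-bit colour

bytes_per_frame = LINK_BYTES_PER_SEC / FPS
framebuffer_bytes = WIDTH * HEIGHT * BPP
transfers_per_frame = bytes_per_frame / framebuffer_bytes

print(f"Link budget per frame: {bytes_per_frame / 1e6:.0f} MB")
print(f"One 720p framebuffer:  {framebuffer_bytes / 1e6:.2f} MB")
print(f"Framebuffer-sized transfers per frame: {transfers_per_frame:.0f}")
```

Real traffic would never hit the theoretical peak, but it shows why moving whole frames between the two chips looks plausible.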


Link




Since that interview with David Kirk conducted by Zenji Nishikawa contains so few words from Kirk that it's almost entirely technical illustration for readers and speculation by Nishikawa, I have little to add to Mikage's post. Kirk says nothing specific about RSX, only about how this RSX-CELL system works. Anyway, here's a slightly more descriptive version of the latter half of the article -

David Kirk: SPE and RSX can work together. SPE can preprocess graphics data in the main memory or postprocess rendering results sent from RSX.

Nishikawa's speculation: For example, when you have to create a lake scene by multi-pass rendering with multiple render targets, SPE can render a reflection map while RSX does other things. Since a reflection map requires less precision, it's not much overhead even though you have to load related data in both the main RAM and VRAM. It works like SLI between SPE and RSX.

David Kirk: Post-effects such as motion blur, depth-of-field simulation, and the bloom effect in HDR rendering can be done by SPE processing RSX-rendered results.

Nishikawa's speculation: RSX renders a scene in the main RAM, then SPEs add effects to the frames in it. Or, you can composite SPE-created frames with an RSX-rendered frame.

David Kirk: Let SPEs do vertex processing, then let RSX render it.

Nishikawa's speculation: You can implement a collision-aware tessellator and dynamic LOD on the SPEs.

David Kirk: SPE and GPU work together, which allows physics simulation to interact with graphics.

Nishikawa's speculation: To render water wavelets, a normal map can be generated by pulse physics simulation with a height map texture. This job is done on SPE and RSX in parallel.
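For anyone wondering what "a normal map generated from a height map" looks like in practice, here's a minimal sketch of just that step. It is plain Python running on an ordinary CPU, not SPE code, and it skips the physics simulation that would produce the heights. The function name `normal_map_from_height` and the `strength` parameter are made up for this example.

```python
import math

def normal_map_from_height(height, strength=1.0):
    """Return per-texel unit normals (x, y, z) from a 2D height map (list of rows).

    Hypothetical helper for illustration; uses central differences with
    edge clamping, which is one common way to derive normals from heights.
    """
    rows, cols = len(height), len(height[0])
    normals = []
    for y in range(rows):
        row = []
        for x in range(cols):
            # Neighbouring heights, clamped at the borders of the map.
            left = height[y][max(x - 1, 0)]
            right = height[y][min(x + 1, cols - 1)]
            up = height[max(y - 1, 0)][x]
            down = height[min(y + 1, rows - 1)][x]
            # Slopes along x and y.
            dx = (right - left) * 0.5 * strength
            dy = (down - up) * 0.5 * strength
            # The normal tilts away from the uphill direction; normalise it.
            length = math.sqrt(dx * dx + dy * dy + 1.0)
            row.append((-dx / length, -dy / length, 1.0 / length))
        normals.append(row)
    return normals

flat = normal_map_from_height([[0.0] * 3 for _ in range(3)])
ramp = normal_map_from_height([[0.0, 1.0, 2.0] for _ in range(3)])
print(flat[1][1])   # flat water: normal points straight up
print(ramp[1][1])   # ramp rising along +x: normal leans toward -x
```

In the scenario described, a simulation would update the height map every frame and something like this would run on each texel, which is the kind of regular, data-parallel work the SPEs are supposed to be good at.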


Sounds amazing.
 
Most of these are no-brainers, really.
More pipes than a 6800, eh? Wow...
NV have been doing FP32 precision since the FX fiasco.
And a more efficient design can prevail over a high transistor count, as proven by the 9700 cards.
 
Nvidia also announced RSX can process 51 billion dot products per second (in comparison, Xenos does ~30 billion per second)


NVIDIA did announce that the RSX would be able to execute 136 shader operations per cycle, a number that is greater than ATI's announced 96 shader ops per cycle.

Most impressive. (considering the higher core clock) I do wonder about the potential ALU config, though.
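To see what the clock difference does to those per-cycle numbers, here's a quick calculation. The 136 and 96 ops/cycle figures are from the quote; the clock speeds (550 MHz for RSX, 500 MHz for Xenos) are the ones announced around E3 2005 and are an assumption here, since this thread doesn't state them.

```python
# Shader ops per second = ops per clock * clock rate.
# ASSUMPTION: clocks as announced at E3 2005 (not stated in this thread).
RSX_OPS_PER_CLOCK = 136      # from the quote above
XENOS_OPS_PER_CLOCK = 96     # from the quote above
RSX_CLOCK_HZ = 550e6         # assumed
XENOS_CLOCK_HZ = 500e6       # assumed

rsx_ops_per_sec = RSX_OPS_PER_CLOCK * RSX_CLOCK_HZ
xenos_ops_per_sec = XENOS_OPS_PER_CLOCK * XENOS_CLOCK_HZ
ratio = rsx_ops_per_sec / xenos_ops_per_sec

print(f"RSX:   {rsx_ops_per_sec / 1e9:.1f} billion shader ops/s")
print(f"Xenos: {xenos_ops_per_sec / 1e9:.1f} billion shader ops/s")
print(f"Ratio: {ratio:.2f}x")
```

Keep in mind that "shader op" is a marketing unit and the two vendors may not count it the same way, so the ratio is only as meaningful as the inputs.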
 
Izzy said:
Nvidia also announced RSX can process 51 billion dot products per second (in comparison, Xenos does ~30 billion per second)

Are you sure this wasn't for the whole system, the 51bn dot products figure?

Also, I figure Xenos can do ~24bn dot-products a second. Well, just dividing the vector4 flops figure by 8 gives you that anyway..
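Spelling out that division, in case the arithmetic helps. The 48 ALUs and 500 MHz clock are the publicly announced Xenos figures and are assumptions as far as this thread goes, as is counting one 4-wide multiply-add (8 flops) as one dot product.

```python
# Rough check of the "divide the vector4 flops figure by 8" estimate.
# ASSUMPTIONS (not stated in this thread): Xenos has 48 shader ALUs at 500 MHz,
# and each ALU issues one 4-wide multiply-add (8 flops) per clock.
XENOS_ALUS = 48
XENOS_CLOCK_HZ = 500e6
FLOPS_PER_VEC4_MADD = 8      # 4 multiplies + 4 adds

xenos_vec4_flops = XENOS_ALUS * FLOPS_PER_VEC4_MADD * XENOS_CLOCK_HZ
# Counting one vec4 multiply-add as one dot product gives the divide-by-8 rule.
xenos_dots_per_sec = xenos_vec4_flops / FLOPS_PER_VEC4_MADD

RSX_DOTS_PER_SEC = 51e9      # figure quoted in this thread
rsx_vs_xenos = RSX_DOTS_PER_SEC / xenos_dots_per_sec

print(f"Xenos vec4 flops:   {xenos_vec4_flops / 1e9:.0f} GFLOPS")
print(f"Xenos dot products: {xenos_dots_per_sec / 1e9:.0f} billion/s")
print(f"RSX / Xenos:        {rsx_vs_xenos:.3f}x")
```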
 
gofreak said:
Are you sure this wasn't for the whole system, the 51bn dot products figure?

Also, I figure Xenos can do ~24bn dot-products a second. Well, just dividing the vector4 flops figure by 8 gives you that anyway..

That's why I wrote ~30 - it looks better for Xenos. 51 bn figure was nVidia's GPU only figure.
 
The RSX is a 90nm GPU weighing in at over 300 million transistors and fabbed by Sony at two plants, their Nagasaki plant and their joint fab with Toshiba.

This irks me. They're using the same process (CMOS4) for RSX in 2006 that the PSP and [EE+GS] have been using, the latter since 2H2003. And the latter two designs have eDRAM to boot... CELL I can understand being at 90nm, as it's a highly-custom sSOI design, but WTF happened here. CMOS5 (65nm) has been sampling since 2004, somewhere, some group fucked up.

And does anyone know the NV40's die size? It has to be huge, having 220M transistors at 130nm. [130nm] => [90nm] is roughly a 2X increase in density, and RSX is just over 300M... I'm guessing NV40 is huge and nVidia|Sony are banking on absolute performance.
 
Oh, BTW - NV40 is approximately 18 mm in width and height. This makes the die size approximately 324 mm squared.
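Running those numbers through ideal scaling gives a very rough idea of how big a 300M-transistor chip at 90nm might be. The inputs are the figures posted in this thread; the assumption that density scales with the square of the feature size is a textbook idealisation that real chips rarely achieve, so this is an estimate and not an actual RSX die size.

```python
# Ideal-scaling estimate from the figures posted in this thread.
# ASSUMPTION: transistor density scales with (old_node / new_node) ** 2.
NV40_TRANSISTORS = 220e6
NV40_SIDE_MM = 18.0
NV40_NODE_NM = 130.0
RSX_TRANSISTORS = 300e6      # "just over 300M"
RSX_NODE_NM = 90.0

nv40_area_mm2 = NV40_SIDE_MM * NV40_SIDE_MM
nv40_density = NV40_TRANSISTORS / nv40_area_mm2          # transistors per mm^2
density_gain = (NV40_NODE_NM / RSX_NODE_NM) ** 2          # ideal shrink factor
rsx_density = nv40_density * density_gain
rsx_area_mm2 = RSX_TRANSISTORS / rsx_density

print(f"NV40 area:          {nv40_area_mm2:.0f} mm^2")
print(f"Ideal density gain: {density_gain:.2f}x")
print(f"Estimated RSX area: {rsx_area_mm2:.0f} mm^2")
```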
 
From one @ B3d:

Since that interview with David Kirk conducted by Zenji Nishikawa contains so few words from Kirk that it's almost entirely technical illustration for readers and speculation by Nishikawa, I have little to add to Mikage's post. Kirk says nothing specific about RSX, only about how this RSX-CELL system works. Anyway, here's a slightly more descriptive version of the latter half of the article -

David Kirk: SPE and RSX can work together. SPE can preprocess graphics data in the main memory or postprocess rendering results sent from RSX.

Nishikawa's speculation: For example, when you have to create a lake scene by multi-pass rendering with multiple render targets, SPE can render a reflection map while RSX does other things. Since a reflection map requires less precision, it's not much overhead even though you have to load related data in both the main RAM and VRAM. It works like SLI between SPE and RSX.

David Kirk: Post-effects such as motion blur, depth-of-field simulation, and the bloom effect in HDR rendering can be done by SPE processing RSX-rendered results.

Nishikawa's speculation: RSX renders a scene in the main RAM, then SPEs add effects to the frames in it. Or, you can composite SPE-created frames with an RSX-rendered frame.

David Kirk: Let SPEs do vertex processing, then let RSX render it.

Nishikawa's speculation: You can implement a collision-aware tessellator and dynamic LOD on the SPEs.

David Kirk: SPE and GPU work together, which allows physics simulation to interact with graphics.

Nishikawa's speculation: To render water wavelets, a normal map can be generated by pulse physics simulation with a height map texture. This job is done on SPE and RSX in parallel.

Talking about this article: http://www.watch.impress.co.jp/game/docs/20050519/ps3_r.htm

The RSX already is a beast, but by using Cell's SPEs it will be a real monster.

Fredi
 
McFly said:
From one @ B3d:



Talking about this article: http://www.watch.impress.co.jp/game/docs/20050519/ps3_r.htm

The RSX already is a beast, but by using Cell's SPEs it will be a real monster.

Fredi

Was just about to post this. This sounds good :) If you can store the framebuffer in XDR, this is good on a number of levels.

:)

Phil Harrison alluded to some of this in his gamesindustry.biz interview - for example, using Cell to take on some heavy lighting calculations in the Doc Oc demo, the results of which the GPU fed from to render the image. Dynamic calculation of data which is usually precomputed and static could be exciting :)

This is perhaps where PS3's CPU advantage can make a "noticeable" difference, aside from the more "invisible" gains in physics, AI etc.
 
Izzy said:
Thanks, Fredi. That does sound amazing. I'll edit my first post in this thread to include what you just posted.

Yes, it really does. :)

I was the whole time hoping for something like this and just waited for an article to show up. There will be many (if not most) games that just don't need the full power of Cell for physics, AI and other stuff, so those games can use Cell to further improve the graphics.

Fredi
 
There seems to be some pretty tight integration between Cell and RSX.

Maybe the RSX being so "PC" is not going to let it down as much as the XBots hope.
 
McFly said:
Yes, it really does. :)

I was the whole time hoping for something like this and just waited for an article to show up. There will be many (if not most) games that just don't need the full power of Cell for physics, AI and other stuff, so those games can use Cell to further improve the graphics.

Fredi

Indeed - RSX is already a beast - 51bn dot ops a second, 136 shader ops/clock, high fill-rate, but this could really take it to another level. :)
 
McFly said:
Yes, it really does. :)

I was the whole time hoping for something like this and just waited for an article to show up. There will be many (if not most) games that just don't need the full power of Cell for physics, AI and other stuff, so those games can use Cell to further improve the graphics.

Fredi

Indeed, I was very much hoping for this kind of flexibility/collaboration.

I hope one of the big (english-speaking) tech sites gets an interview with him to discuss some of this in more detail.
 
The possibilities offered by this strong interaction between RSX and SPEs are really impressive. The Cell becomes an extension of the GPU: you can do vertex processing or post-processing effects, sharing the workload and some rendering steps between the two as needed.
Also, the fact that XDR and GDDR memory can both be used actively as video memory is great. I really can't wait to see what developers will make of this hardware.
 
fortified_concept said:
So how do the RSX's and Cell's combined graphics capabilities compare to 360's?


At the moment, since all the key GPU details for PS3 are missing, I don't think it's possible to make definitive comparisons. This information also suggests that the PS3 hardware should be considered as a whole.
But I think there's no doubt PS3 has the edge.
 
fortified_concept said:
So how do the RSX's and Cell's combined graphics capabilities compare to 360's?

RSX alone sounds like a beast. Higher core clock than Xbox 360 GPU, higher number of shading ops/clock (136 vs. 96), more pipes (16+ vs. 8), significantly better dot product calculation (51 billion dot products per second vs. 24 billion). It's a beast.

However, in combination with a SPE (or two), it could become a monster.

David Kirk: Post-effects such as motion blur, depth-of-field simulation, and the bloom effect in HDR rendering can be done by SPE processing RSX-rendered results.

Nishikawa's speculation: RSX renders a scene in the main RAM, then SPEs add effects to the frames in it. Or, you can composite SPE-created frames with an RSX-rendered frame.

David Kirk: Let SPEs do vertex processing, then let RSX render it.

Nishikawa's speculation: You can implement a collision-aware tessellator and dynamic LOD on the SPEs.

David Kirk: SPE and GPU work together, which allows physics simulation to interact with graphics.

Nishikawa's speculation: To render water wavelets, a normal map can be generated by pulse physics simulation with a height map texture. This job is done on SPE and RSX in parallel.

Now that's just jaw dropping.
 
So how do the RSX's and Cell's combined graphics capabilities compare to 360's?
Unless nVidia and Sony are hiding some crippling design flaw, the PS3 should have the edge. The full specs of the RSX will be a major determining factor in just how big that edge is, as will any final hardware adjustments Sony makes.
 
More from this article.

David Kirk (nVIDIA) said:

1) RSX can use XDR-RAM(256MB) as VRAM too.

2) 7 SPEs and RSX can work together as a total GPU. SPE as vertex shader, post processing a rendering result from RSX etc...

BTW - nVidia's RSX 128-bit HDR implementation is rumoured to be exceptional. :)
 
Vince said:
This irks me. They're using the same process (CMOS4) for RSX in 2006 that the PSP and [EE+GS] have been using, the latter since 2H2003. And the latter two designs have eDRAM to boot... CELL I can understand being at 90nm, as it's a highly-custom sSOI design, but WTF happened here. CMOS5 (65nm) has been sampling since 2004, somewhere, some group fucked up.

And does anyone know the NV40's die size? It has to be huge, having 220M transistors at 130nm. [130nm] => [90nm] is roughly a 2X increase in density, and RSX is just over 300M... I'm guessing NV40 is huge and nVidia|Sony are banking on absolute performance.


What are the yields @ 65nm?
 
so what does this mean for the RSX, especially re: the unified shader architecture of the ATi chip? Will offloading some of the workload onto CELL help them make up for the deficiencies of a rigid hardware design? I see where it mentions the SPEs can act as vertex shaders, can they act as pixel shaders too? Sounds really cool, Sony's marketing should really focus on the synergy of all the chips working together.
 
This is why I was excited about the potential of the PS3. If the Cell really is capable of taking on the load of number crunching for the GPU, the graphic effects accomplished later on its life cycle could be simply amazing.
 
Nerevar said:
so what does this mean for the RSX, especially re: the unified shader architecture of the ATi chip? Will offloading some of the workload onto CELL help them make up for the deficiencies of a rigid hardware design? I see where it mentions the SPEs can act as vertex shaders, can they act as pixel shaders too? Sounds really cool, Sony's marketing should really focus on the synergy of all the chips working together.

Read this:

Since that interview with David Kirk conducted by Zenji Nishikawa contains so few words from Kirk that it's almost entirely technical illustration for readers and speculation by Nishikawa, I have little to add to Mikage's post. Kirk says nothing specific about RSX, only about how this RSX-CELL system works. Anyway, here's a slightly more descriptive version of the latter half of the article -

David Kirk: SPE and RSX can work together. SPE can preprocess graphics data in the main memory or postprocess rendering results sent from RSX.

Nishikawa's speculation: For example, when you have to create a lake scene by multi-pass rendering with multiple render targets, SPE can render a reflection map while RSX does other things. Since a reflection map requires less precision, it's not much overhead even though you have to load related data in both the main RAM and VRAM. It works like SLI between SPE and RSX.

David Kirk: Post-effects such as motion blur, depth-of-field simulation, and the bloom effect in HDR rendering can be done by SPE processing RSX-rendered results.

Nishikawa's speculation: RSX renders a scene in the main RAM, then SPEs add effects to the frames in it. Or, you can composite SPE-created frames with an RSX-rendered frame.

David Kirk: Let SPEs do vertex processing, then let RSX render it.

Nishikawa's speculation: You can implement a collision-aware tessellator and dynamic LOD on the SPEs.

David Kirk: SPE and GPU work together, which allows physics simulation to interact with graphics.

Nishikawa's speculation: To render water wavelets, a normal map can be generated by pulse physics simulation with a height map texture. This job is done on SPE and RSX in parallel.

It's very flexible, there's just so many possibilities - it's all up to dev to exploit.
 
Nerevar said:
so what does this mean for the RSX, especially re: the unified shader architecture of the ATi chip? Will offloading some of the workload onto CELL help them make up for the deficiencies of a rigid hardware design?

If your workload targets the hardware, as it should in a closed box, there should be no relative deficiencies. Maybe you mean less flexibility?

The GPU in PS3 alone is probably more powerful than X360's on balance, from the looks of things - though we could do with a bit more detail ;)

Nerevar said:
I see where it mentions the SPEs can act as vertex shaders, can they act as pixel shaders too?

They can pixel shade, but not as well as they can do other things...

I think the big potential would be for simulation of things that you usually couldn't do dynamically..perhaps. I'd love to hear Pana's thoughts :)
 
Nerevar said:
so what does this mean for the RSX, especially re: the unified shader architecture of the ATi chip? Will offloading some of the workload onto CELL help them make up for the deficiencies of a rigid hardware design? I see where it mentions the SPEs can act as vertex shaders, can they act as pixel shaders too? Sounds really cool, Sony's marketing should really focus on the synergy of all the chips working together.

Having the possibility to assign parts of the graphical 3D pipeline to CPU or GPU as programmers want isn't really a hallmark of a rigid hardware design :lol :lol
 
gofreak said:
I think the big potential would be for simulation of things that you usually couldn't do dynamically..perhaps.

Indeed - think PS3 Getaway WORLD DEMO - it was mostly done on Cell. (with RSX being idle most of the time)
 
Izzy said:
Read this:

Nishikawa's speculation: You can implement a collision-aware tessellator and dynamic LOD on the SPEs.

What's that mean? Does it mean we'll get stuff like pieces of cloth calculating for collision against each other, and thus not intersecting each other?
 
Zaptruder said:
What's that mean? Does it mean we'll get stuff like pieces of cloth calculating for collision against each other, and thus not intersecting each other?

In that particular case, an SPE would perform both the physics and the tessellating part, leaving RSX free to do something else.
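As a very rough idea of what the "dynamic LOD" half of that could mean, here's a sketch that picks a subdivision level from camera distance. Everything in it (the function names, the near/far distances, the logarithmic falloff, the maximum level) is a hypothetical choice for illustration. It doesn't attempt the collision-aware part, and it is ordinary Python, not SPE code.

```python
import math

def tessellation_level(distance, near=5.0, far=100.0, max_level=6):
    """Map camera distance to a subdivision level (hypothetical heuristic).

    Returns max_level at or inside `near`, 0 at or beyond `far`, and a
    logarithmic falloff in between so each doubling of distance drops
    roughly the same amount of detail.
    """
    if distance <= near:
        return max_level
    if distance >= far:
        return 0
    t = math.log(distance / near) / math.log(far / near)   # 0..1
    return round(max_level * (1.0 - t))

def triangles_after_subdivision(base_triangles, level):
    """Each uniform subdivision step splits every triangle into four."""
    return base_triangles * 4 ** level

for d in (1.0, 10.0, 40.0, 500.0):
    level = tessellation_level(d)
    print(d, level, triangles_after_subdivision(2, level))
```

The point is that the CPU side decides per object (or per patch) how much geometry to generate each frame and only hands the GPU as many triangles as the view actually needs.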
 
gofreak said:
If your workload targets the hardware, as it should in a closed box, there should be no relative deficiencies. Maybe you mean less flexibility?

The GPU in PS3 alone is probably more powerful than X360's on balance, from the looks of things - though we could do with a bit more detail ;)

sorry, I meant less flexible (as you pointed out). I agree that RSX is "more powerful", and I never doubted that, but it seemed to sacrifice some flexibility for greater raw power. Allowing it to work in concert with the CELL CPU / SPE's will alleviate this, which is why it sounds like a lot of cool "synergy" in there. I'm no EE, but it does sound like a pretty elegant design (much moreso than they let on to).
 
Nerevar said:
sorry, I meant less flexible (as you pointed out). I agree that RSX is "more powerful", and I never doubted that, but it seemed to sacrifice some flexibility for greater raw power. Allowing it to work in concert with the CELL CPU / SPE's will alleviate this, which is why it sounds like a lot of cool "synergy" in there. I'm no EE, but it does sound like a pretty elegant design (much moreso than they let on to).

From another article:

1) NVIDIA stated that they had never had as powerful a CPU as Cell, and thus the RSX GPU has to be able to swallow a much larger command stream than any of the PC GPUs as current generation CPUs are pretty bad at keeping the GPU fed.

2) The RSX GPU has a 35GB/s link to the CPU, much greater than any desktop GPU, and thus the turbo cache architecture needs to be reworked quite a bit for the console GPU to take better advantage of the plethora of bandwidth. Functional unit latencies must be adjusted, buffer sizes have to be changed, etc...

David Kirk (nVIDIA) said:

1) RSX can use XDR-RAM(256MB) as VRAM too.

2) 7 SPEs and RSX can work together as a total GPU. SPE as vertex shader, post processing a rendering result from RSX etc...

It seems to be really well integrated.
 
We knew about this interaction from last year. One of the NVidia guys had that presentation showing how the next-gen of cards would try to focus on 2-way interaction between CPU and GPU. And how they're now lumped together in one big block, essentially.

And as for the 360, it has the memory controller on Xenos, which has access to UMA. I wouldn't doubt that you could do the same sort of thing with it. They have ~20GB between them. But Cell's FLOPS advantage is gonna pay dividends in this regard. PEACE.
 
The Getaway and that satellite data landscape demo were done with Cell, not with the GPU.

Pimpwerx said:
And as for the 360, it has the memory controller on Xenos, which has access to UMA. I wouldn't doubt that you could do the same sort of thing with it. They have ~20GB between them. But Cell's FLOPS advantage is gonna pay dividends in this regard. PEACE.

Not only does Cell have more flops, the architecture is much better suited for graphics processing. Hey, initially it was planned to use Cell for the GPU as well, that alone should tell you how good Cell is for this kind of thing.

Fredi
 
fortified_concept said:


Phil Harrison (SCEE):
Was most of what we saw really just showing off the graphics capabilities - stretching the RSX graphics part rather than the Cell chip? The assumption is that Cell is there for complex physics and AI...

You're right; obviously Cell allows you to do complex collisions, physics, dynamics, simulations, all of those things. Though, the Getaway demo was a good example of how you can have a living city brought to life as a result. Although it was pretty graphics, most of that power was actually Cell-based.

There.
 
Pimpwerx said:
We knew about this interaction from last year. One of the NVidia guys had that presentation showing how the next-gen of cards would try to focus on 2-way interaction between CPU and GPU. And how they're now lumped together in one big block, essentially.

I think you're talking about this one:

hofstee45ti.jpg


:)

Fredi
 
Well, I just hope the X360 GPU can keep up.

To be honest, since these consoles are using more PC tech, I'm actually starting to understand more about these consoles at launch. I hope the X360 GPU can hold up against the PS3 GPU, but you gotta be crazy to think it will be more powerful. Forget about the CPUs, 6 months is a long time in the world of video cards. 6 months can be a generational difference. If Sony/Nvidia hit a target higher than MS/ATI, the Nvidia GPU could not only be more powerful, but a generation ahead of the X360 GPU. Thus, the PS3 demos we saw at the Sony conference.
 
McFly said:
The Getaway and that satellite data landscape demo were done with Cell, not with the GPU.





Fredi

WOW! The Getaway looked amazing and it was only running on Cell. I would have been happy with just those quality graphics next gen, but it seems Sony set the bar even higher. :D
 