Xbox 360's "1 TFLOP+" semi-explained

xexex

Banned
http://www.beyond3d.com/forum/viewtopic.php?t=25399

CPU:
- 12 flops per clock cycle per core
- 12 x 3.2 GHz x 3 cores = 115.2 GFlops

GPU:
- 10 flops per clock cycle per ALU
- 10 x 500 MHz x 48 ALUs = 240.0 GFlops

Non-programmable GFlops = 697

Total = 1052.2 GFlops ≈ 1 TFlop.
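
The sums in the breakdown above can be checked with a few lines of Python. This is only a sketch of the arithmetic as posted: the variable names are made up for illustration, and the 697 figure is taken as given rather than derived from anything.

```python
# Peak-flops sums as posted above. All names are illustrative only.
cpu_gflops = 12 * 3.2 * 3        # flops/cycle/core x clock in GHz x cores
gpu_gflops = 10 * 0.5 * 48       # flops/cycle/ALU x clock in GHz x ALUs
non_programmable_gflops = 697    # quoted figure, not derived here

programmable_gflops = cpu_gflops + gpu_gflops
total_gflops = programmable_gflops + non_programmable_gflops
programmable_share = programmable_gflops / total_gflops

print(f"CPU:          {cpu_gflops:.1f} GFlops")
print(f"GPU:          {gpu_gflops:.1f} GFlops")
print(f"Programmable: {programmable_gflops:.1f} GFlops")
print(f"Total:        {total_gflops:.1f} GFlops")
print(f"Programmable share of total: {programmable_share:.0%}")
```

The programmable share works out to roughly a third of the headline total, and these are theoretical peaks, not measured throughput.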

x34.jpg



so while this does not make Xbox 360 a true 1 TFLOP system as far as general purpose, programmable flops go, it does show how, when everything is combined together, you get over 1 teraflop of combined performance. about 1/3 of that (355 Gflops) is actually programmable (CPU plus programmable portion of GPU).


Xbox 360 would really spank a 1980s (maybe even late 80s) supercomputer :D

.......and so would PS3
 
xexex said:
so while this does not make Xbox 360 a true 1 TFLOP system as far as general purpose, programmable flops go, it does show how, when everything is combined together, you get over 1 teraflop of combined performance.

It shows us nothing we didn't know already ;) It doesn't really explain anything. We've always known that the undefined "non programmable flops" were 697 Gflops, since we've always been able to work out the programmable flops (and it's a simple matter of subtraction after that..). I don't think we'll ever get an explanation of the non-programmable stuff..I wouldn't expect one. It's just "gpu flops".
 
Kleegamefan said:
Still cant see shit @ the B3D console forums and it wont allow me to make a new account either :(

Still? I'll PM a mod again. They said they'd look into it..
 
BirdySky said:
Isn't the PS3's CPU 2.1 teraflops alone?

XOwned. If true. Probably not.


PS3 CPU is 218 Gflops (less than 1/4th of a teraflop) alone.


-PS3 has more flops performance than Xbox 360

-Xbox 360 has more general purpose computing power than PS3

-Xbox 360 has more inter-chip graphics memory bandwidth than PS3.
 
sangreal said:

Eh well I just searched to confirm I wasn't imagining things and..

Processor Core Spec: 1 Core, 7 x SPE 3.2GHz (256KB SRAM per SPE), 7 x 128b 128 SIMD GPRs
Marketing Performance Measurement: 2.18 TFLOPs
Processor Clock Speed: 3.2GHz
L2 Cache: 512KB L2 cache, 256KB per SPE
Processor: Cell processor

From GameSpot. "Marketing performance"...whatever that means..but still, a whole teraflop for Sony's marketing men to play about with.

-Xbox 360 has more general purpose computing power than PS3

-Xbox 360 has more inter-chip graphics memory bandwidth than PS3.

You're suggesting Xbox 360's 3 CPU cores are more powerful than Sony's 7-core Cell?
That's the first time I've heard that. Surprising.
And going on the strength of the demos..well..
 
BirdySky said:
Eh well I just searched to confirm I wasn't imagining things and..



From GameSpot. "Marketing performance"...whatever that means..but still, a whole teraflop for Sony's marketing men to play about with.

That is for CPU + GPU

Cell is rated at 218 Gflops.

You didn't provide a link so I can't see the context, but if GameSpot is talking only about the Cell, they are mistaken.
 
sly said:
Non Programables GFlops = 697

What does this mean and how is it calculated?


probably in a similar way that Nvidia got the 80 Gflops from Xbox's NV2A GPU: all the things that the GPU does to render graphics that the programmer does not have total control of, even though the developer uses them. it's hard to explain.
 
midnightguy said:
Xbox 360 has more inter-chip graphics memory bandwidth than PS3.

That is only true within the eDRAM daughter board in Xenos/C1......in that case, can I count the memory BW of SRAM(times seven, because they can be parallel) in each SPU too :)


According to the Goto translated article, the latest estimates put RSX at around 400 Gflops of programmable shader power

http://forum.gaming-age.com/showthread.php?p=1631856#post1631856

If that is true, that would put PS3 at ~618 Gflops of programmable math (218+400) vs X360, which has 355 Gflops of programmable math (115+240)...

So if all this is true, this puts PS3 at about double the power of X360:)

Put another way, CELL alone is about 2/3rds as powerful as the entire X360...
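
For anyone who wants to see the ratios behind that comparison, here is a rough sketch in Python. It uses only the programmable figures quoted in this thread; the 400 Gflops RSX number is an unconfirmed estimate, and the variable names are invented for illustration.

```python
# Programmable Gflops figures quoted in this thread (theoretical peaks).
cell_gflops = 218
rsx_gflops_estimate = 400   # unconfirmed estimate from the translated article
xecpu_gflops = 115.2
xenos_gflops = 240.0

ps3_programmable = cell_gflops + rsx_gflops_estimate
x360_programmable = xecpu_gflops + xenos_gflops

ps3_vs_x360 = ps3_programmable / x360_programmable
cell_vs_x360 = cell_gflops / x360_programmable

print(f"PS3 programmable:  {ps3_programmable} Gflops")
print(f"X360 programmable: {x360_programmable:.1f} Gflops")
print(f"PS3 / X360:  {ps3_vs_x360:.2f}x")
print(f"Cell / X360: {cell_vs_x360:.2f}x")
```

On these inputs the ratios land somewhat under the round numbers above (closer to 1.7x than 2x, and nearer 0.6 than 2/3), and they compare programmable flops only.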
 
BirdySky said:
Eh well I just searched to confirm I wasn't imagining things and..



From GameSpot. "Marketing performance"...whatever that means..but still, a whole teraflop for Sony's marketing men to play about with.



You're suggesting Xbox 360's 3 CPU cores are more powerful than Sony's 7-core Cell?
That's the first time I've heard that. Surprising.
And going on the strength of the demos..well..


in general purpose applications that need integer performance, yes. Xbox 360 CPU is more powerful. when floating point performance is needed, the PS3 Cell CPU is more powerful.

unless the 7 SPEs can be coded for integer, general purpose needs, in which case the Cell CPU would beat XeCPU.

of course, we are still talking about theoretical performance peaks. some developers have said that in the real world, in games, both CPUs will be much closer in performance and that Cell's advantage over XeCPU is much much smaller than the FLOPs rating would suggest.
 
And we all know next-gen games will need floating point power, which is why CELL and XeCPU are both geared towards floating point power...


BTW, what is the integer ops performance of XeCPU?

We know that the SPUs have the same number of integer ops as floating point ops (218)

unless the 7 SPEs can be coded for integer, general purpose needs, in which case the Cell CPU would beat XeCPU.

As I said, an SPU can do the same number of integer ops as floating ops:

http://www.blachford.info/computer/Cell/Cell1_v2.html

Synergistic Processor Elements (SPEs)


Each Cell contains 8 SPEs.

An SPE is a self contained vector processor which acts as an independent processor. They each contain 128 x 128 bit registers, there are also 4 (single precision) floating point units capable of 32 GigaFLOPS* and 4 Integer units capable of 32 GOPS (Billions of integer Operations per Second) at 4GHz. The SPEs also include a small 256 Kilobyte local store instead of a cache. According to IBM a single SPE (which is just 15 square millimetres and consumes less than 5 Watts at 4GHz) can perform as well as a top end (single core) desktop CPU given the right task.
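
As a rough sanity check, the quoted 4GHz per-SPE figure can be rescaled to the PS3's 3.2GHz clock and 7 SPEs. This is a sketch under assumptions: the only inputs are the numbers quoted in this thread, the names are invented for illustration, and the 8 flops per cycle is simply 32 GigaFLOPS divided by 4GHz (which would fit 4 units each doing a multiply-add, though the quoted text doesn't spell that out).

```python
# Rescale the quoted per-SPE peak from 4 GHz to the PS3's clock.
quoted_gflops_per_spe = 32.0   # from the quoted article, at 4 GHz
quoted_clock_ghz = 4.0

# Implied per-cycle rate: 32 / 4 = 8 flops per cycle per SPE.
flops_per_cycle_per_spe = quoted_gflops_per_spe / quoted_clock_ghz

ps3_clock_ghz = 3.2            # clock quoted elsewhere in this thread
ps3_spe_count = 7              # SPE count quoted elsewhere in this thread

gflops_per_spe_ps3 = flops_per_cycle_per_spe * ps3_clock_ghz
spe_total_gflops = gflops_per_spe_ps3 * ps3_spe_count

# Difference from the 218 Gflops rating quoted for the whole chip.
remainder_gflops = 218 - spe_total_gflops

print(f"Per SPE at 3.2 GHz: {gflops_per_spe_ps3:.1f} Gflops")
print(f"7 SPEs:             {spe_total_gflops:.1f} Gflops")
print(f"Not covered by SPEs: {remainder_gflops:.1f} Gflops")
```

On these assumptions the SPEs alone don't account for the full 218 figure, so the rest would have to come from elsewhere on the chip; the quoted text gives no breakdown of that. The same scaling applies to the integer figure, since it is quoted at the same 32 per SPE at 4GHz.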
 
Put another way, CELL alone is about 2/3rds as powerful as the entire X360...

main.jpg

Damn!.

Must suck to be Bill Gates right now. Second place, second best machine, second best face.(Ken's cooler...) nothing but billions of dollars to eas..n..never mind.


in general purpose applications that need integer performance, yes. Xbox 360 CPU is more powerful. when floating point performance is needed, the PS3 Cell CPU is more powerful.

unless the 7 SPEs can be coded for integer, general purpose needs, in which case the Cell CPU would beat XeCPU.

of course, we are still talking about theoretical performance peaks. some developers have said that in the real world, in games, both CPUs will be much closer in performance and that Cell's advantage over XeCPU is much much smaller than the FLOPs rating would suggest.

I shouldn't imagine integers would be a problem. I use floats in place of integers all the time mainly due to it being easier than casting(Or quicker...)...but the trade-off of having seven cores to run your code on or the native speed of ints should be worth it.
 
I shouldn't imagine integers would be a problem. I use floats in place of integers all the time mainly due to it being easier than casting...but the trade-off of having seven cores to run your code on or the native speed of ints...bring it on..

As you can see in the link I provided, CELL SPU has the same number of integer ops as floating point ops (218 billion integer operations on the PS3 version of CELL)....

For some unknown reason, Microsoft only included the PPE in their whole "we have 3 times the general purpose" spiel since they classify the PPE as a processor and the 7 SPUs as "something else" :lol
 
Kleegamefan said:
That is only true within the eDRAM daughter board in Xenos/C1......
Considering the entire purpose of the daughter die is to move bandwidth intensive tasks to a chip with extreme bandwidth, its contribution should not be discounted. (Nor should it necessarily be directly compared to the RSX)

According to the Goto translated article, the latest estimates put RSX at around 400 Gflops of programmable shader power

http://forum.gaming-age.com/showthread.php?p=1631856#post1631856

If that is true, that would put PS3 at ~618 Gflops of programmable math (218+400) vs X360, which has 355 Gflops of programmable math (115+240)...

So if all this is true, this puts PS3 at about double the power of X360:)

Put another way, CELL alone is about 2/3rds as powerful as the entire X360...

This comparison seems dubious at best
 
We should be careful to clarify what we mean by "integer" operations. Do we mean purely mathematical integer operations, or branching as well etc.?

I think it's questionable if XeCPU would be more powerful than Cell for mathematical integer, despite integer ops apparently being quite "second class" on the SPEs.

It's very difficult to quantify "general purpose" performance, and to come up with a metric to compare the two systems based on that (unlike for flops). It's not a matter of counting the cores on x360 and the PPE on Cell, and saying "it's a 3x difference". The SPEs don't do branch prediction, but you can branch, and they don't have hardware cache mechanisms, but you can implement a software cache with some penalty for tasks requiring random access to memory and large datasets. SPEs can accommodate some tasks very naturally, and the rest are a very open question. If you start (successfully) casting more and more "general purpose" tasks to SPEs..do they suddenly become "general purpose"? My point is, the SPEs' contribution to "general purpose" processing should be non-zero.

It's also interesting to ask where the boundary is between general purpose cores and specialised cores, because relative to the desktop CPUs, neither of the two consoles' "general purpose" cores are particularly "general purpose".

Furthermore, it's interesting to ask how relevant "general purpose" performance is. I don't think it's as relevant as MS's PR suggests. If it truly was, why would they themselves have designed a chip so focussed on floating point and streaming performance? I'm not sure if the greater "balance" between that and "general purpose" performance was by design rather than a consequence of a simpler and less exotic architecture.

It seems to me that Cell is biased toward optimisation for the most computationally intensive tasks..many of the things that eat most CPU time would appear to be a good fit for the SPEs, for example. Does it matter if some things don't run well on them, if they're handled rather trivially on the PPE? (And that seems to be the suggestion from some devs that have spoken, specifically Epic).
 
What would be the average fall-off rate of wasted performance with RSX's non-unified shader architecture? (as opposed to the supposed 100% efficient ATi unified shaders?)

And where does the non-programmable come from, and what is the 'non-programmable' on RSX?
 
sangreal said:
Considering the entire purpose of the daughter die is to move bandwidth intensive tasks to a chip with extreme bandwidth, its contribution should not be discounted. (Nor should it necessarily be directly compared to the RSX)


I am not saying it should be discounted, I am saying if we are counting it, lets count SPU SRAM too :)
 
Kobold said:
What would be the average fall-off rate of wasted performance with RSX's non-unified shader architecture? (as opposed to the supposed 100% efficient ATi unified shaders?)

Unknown, since it's a closed architecture.

On PCs, ATi (gu)estimates that average utilisation is between 50% and 70%. Two things to remember though:

1) Much of a Unified architecture's attraction hinges on the notion that there is a utilisation problem with current architectures, so don't expect them to downplay that ;)

2) PC games are targeting multiple different configurations and ratios of vertex:pixel shaders. One might expect a game designed for a closed system to fare better since the average instruction mix could be optimised to make best use of the ratio enshrined in the hardware.

One can ask about utilisation, but one can also query per-shader efficiency. Do you lose efficiency with a unified shader vs a dedicated shader for specific tasks? That's also another variable to consider.

Assuming a minimal loss of efficiency on the shader level, assuming good load balancing implementation etc. and crucially assuming all else were equal, I'd easily take the unified architecture over a dedicated one, especially in an open environment like the PC where you can't optimise for one set of hardware. But the situation we find ourselves comparing here is not so clear.

RSX's Gflops figures are a matter of debate right now. There's a couple of ways to count them.

Vertex Shaders are fairly clearly contributing 44Gflops.

The pixel shaders contribute anything from 211Gflops to 356Gflops depending on what you count or don't count. My figure of choice is 264Gflops (main ALUs + mini ALUs contribution, doesn't count "free" 16-bit normalisation), for a total of 308Gflops. There are a couple of other caveats to note if comparing to Xenos.
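
To make the range concrete, here is a small Python sketch that adds the vertex shader figure to each of the pixel shader counts mentioned above. The labels ("low", "preferred", "high") are my own shorthand, not terms from any spec.

```python
# RSX programmable Gflops under the different pixel-shader counting choices.
vertex_gflops = 44

# Labels are illustrative shorthand for the three figures given above.
pixel_gflops = {"low": 211, "preferred": 264, "high": 356}

# Total programmable figure for each counting choice.
rsx_totals = {label: vertex_gflops + px for label, px in pixel_gflops.items()}

for label, total in rsx_totals.items():
    print(f"{label:>9}: {total} Gflops")
```

The high end of that range comes to 400, which lines up with the ~400 Gflops estimate mentioned earlier in the thread, so that estimate presumably uses the most generous counting.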

Kobold said:
And where does the non-programmable come from, and what is the 'non-programmable' on RSX?

Non programmable = claimed figure minus programmable figure. So for RSX that'd be 1.8Tflops minus your choice of programmable figures ;)
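
That subtraction is easy to sketch in Python. The helper name below is hypothetical, the claimed totals are the marketing figures quoted in this thread, and the RSX programmable choices are the vertex-plus-pixel sums from the counting options above.

```python
def non_programmable_gflops(claimed_total, programmable):
    """Whatever is left of a claimed peak once programmable flops are removed."""
    return claimed_total - programmable

# Xbox 360: claimed total and programmable figure from the first post.
x360_non_programmable = non_programmable_gflops(1052.2, 355.2)

# RSX: 1.8 Tflops claimed, minus each choice of programmable figure.
rsx_claimed_gflops = 1800
rsx_programmable_choices = [255, 308, 400]
rsx_non_programmable = [
    non_programmable_gflops(rsx_claimed_gflops, p)
    for p in rsx_programmable_choices
]

print(f"X360 non-programmable: {x360_non_programmable:.1f} Gflops")
print(f"RSX non-programmable:  {rsx_non_programmable} Gflops")
```

Either way the "non-programmable" number is just a remainder, so it moves with whichever programmable figure you pick.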

sangreal said:
Why? I don't see how they alleviate GPU bandwidth constraints at all?

Theoretically it could. You could do some ops on Cell using the LS bandwidth that ordinarily would eat external memory bandwidth.
 
dorio said:
Anandtech says they both suck and can't equal the performance of a 386, so there! :)
Rock on! :D I loved the 386, thusly, I love next-gen!

Thanks, Gofreak. Once more it's 'let's wait and see'. :)
 
sangreal said:
I guess I should pay more attention. I'm pretty sure that nonsense just comes from Major Nelson's article

Hey, no problem.....and I hope you don't think I was attacking or belittling your opinion...that was not my intent...

Just friendly banter :)


This thread isn't about True Fantasy Live Online Plus. Damn.

:lol


That was clever.....can't believe no one else in the thread caught that :D
 
BirdySky said:
Eh well I just searched to confirm I wasn't imagining things and..



From GameSpot. "Marketing performance"...whatever that means..but still, a whole teraflop for Sony's marketing men to play about with.



You're suggesting Xbox 360's 3 CPU cores are more powerful than Sony's 7-core Cell?
That's the first time I've heard that. Surprising.
And going on the strength of the demos..well..

Applied in the context of gaming, no..It isn't.
 
Kleegamefan said:
Hey, no problem.....and I hope you don't think I was attacking or belittling your opinion...that was not my intent...

Just friendly banter :)

Of course, it's why we have forums! ;)
 