Intel Conroe vs Cell

Count Chocula said:
I would venture to say that Cell is more efficient than any X86 architecture. Especially at gaming applications.

I would venture to say that you're both correct and incorrect. It absolutely toasts current architecture in some ways, while falling extremely short in others (branch prediction comes to mind).
 
I'll take a Kentsfield (two Conroe dies in one package, quad core baby :P) over the Cell really. The Kentsfield would probably offer superior general performance, the Cell really isn't a jack of all trades from what I've read.
 
Marathon said:
Everything you just wrote is complete bullshit.
Bullshit because Sony fanboys don't want to hear it, or bullshit because you think John Carmack doesn't know what he's talking about?

Read Carmack's QuakeCon 2005 Keynote speech here. Of particular interest are the Console Development and Physics and AI sections, but I think you'll find this quote is sufficient:

"But if you look at the current platforms, in many ways, it’s not quite as powerful as it sounds if you add up all the numbers and flops and things like that. If you just take code designed for an x86 that’s running on a Pentium or Athlon or something, and you run it on either of the PowerPCs from these new consoles, it’ll run at about half the speed of a modern state of the art system, and that’s because they’re in-order processors, they’re not out-of-order execution or speculative, any of the things that go on in modern high-end PC processors. And while the gigahertz looks really good on there, you have to take it with this kind of “divide by two” effect going on there."

So I guess I was BSing when I said 25% off on real-world performance... I should have said 50% off.

I also suggest you read this entire article at Ars Technica. Even just the last page will do...particularly this paragraph:

"At any rate, Playstation 3 fanboys shouldn't get all flush over the idea that the Xenon will struggle on non-graphics code [NOTE: this means AI and Physics code]. However bad off Xenon will be in that department, the PS3's Cell will probably be worse. The Cell has only one PPE to the Xenon's three, which means that developers will have to cram all their game control, AI, and physics code into at most two threads that are sharing a very narrow execution core with no instruction window. (Don't bother suggesting that the PS3 can use its SPEs for branch-intensive code, because the SPEs lack branch prediction entirely.) Furthermore, the PS3's L2 is only 512K, which is half the size of the Xenon's L2. So the PS3 doesn't get much help with branches in the cache department. In short, the PS3 may fare a bit worse than the Xenon on non-graphics code, but on the upside it will probably fare a bit better on graphics code because of the seven SPEs."

Note that the article is an inside look at the 360's CPU - Xenon - but the author (who isn't a fanboy) felt it necessary to point out that the PS3 is an even worse victim of the same pitfalls.
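
To make the branch point concrete, here is a minimal sketch of the usual workaround on a core with no branch predictor: replace a data-dependent branch with a mask-and-select. Python is used purely to show the logic. The function names are made up for illustration, and on real hardware this would typically become a compare followed by a select instruction rather than Python integer ops.

```python
def select_max_branchy(a: int, b: int) -> int:
    """Pick the larger value with an ordinary data-dependent branch."""
    if a > b:
        return a
    return b


def select_max_branchless(a: int, b: int) -> int:
    """Pick the larger value without branching on the data.

    The comparison is turned into a mask: -1 (all bits set) when a > b,
    otherwise 0. The mask then selects a or b with bitwise ops only.
    """
    mask = -(a > b)  # True -> -1, False -> 0
    return (a & mask) | (b & ~mask)
```

Both functions return the same result for any pair of integers. The second one does the same amount of work every time, which is the trade a predictor-less core wants: a couple of extra ALU operations instead of a possible pipeline stall.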

Marathon said:
There wouldn't be a company selling a Cell like add-on board for x86 PCs if anything you wrote was remotely true.
If you're talking about the AGEIA PhysX processing unit, you should know that it is NOTHING like the CELL. In fact, the PhysX chip is more like a GPU than anything else (which is why ATi have announced that their GPUs can be used as standalone Physics processors and nVidia has signed a deal with Havok to make middleware that supports their GPUs). Plus the PhysX chip has 128 MB of DEDICATED (as in, onboard) RAM -- as such the amount of cache on the die is moot.

... did you think I was just making this stuff up or something? Do you actually think next-gen consoles are substantially more powerful than High-End PCs? What about you Chiggs/Lynux3?
 
-Rogue5- said:
If you just take code designed for an x86 that’s running on a Pentium or Athlon or something, and you run it on either of the PowerPCs from these new consoles, it’ll run at about half the speed of a modern state of the art system ...

Notice that part I bolded?

If you're talking about the AGEIA PhysX processing unit, you should know that it is NOTHING like the CELL.

Not according to Ageia IIRC.
 
Onix said:
Notice that part I bolded?
Did you read/watch the entire thing? He goes on to say something like you can get a little more performance than half, but your code will have to be system specific... Any code that's multiplatform will have this limitation. Developers will need to program until their fingers bleed, and the returns won't be substantial. There is no magic compiler that will make In-Order code run better than Out-of-Order code, especially if you're used to writing code in an Out-of-Order fashion (like the ENTIRE industry is).

Onix said:
Not according to Ageia IIRC.
Any differences AGEIA claims are probably marketing to try and make their products look vastly different from a GPU. Why would they do this? In order to differentiate themselves enough so that they aren't competing with ATi and nVidia in the PPU department (because they'd get annihilated).
 
-Rogue5- said:
Did you read/watch the entire thing? He goes on to say something like you can get a little more performance than half, but your code will have to be system specific... Any code that's multiplatform will have this limitation.

of COURSE your ****ing code is going to be tailored to the goddamn architecture you're targeting.
 
_leech_ said:
Carmack's a smart guy, but when it comes to consoles he's as brain dead as any other PC developer.

Why don't we hold off on saying stuff like that until we see the original IP he's working on--with the Xbox 360 as the lead platform. We should at least give the man that. Sheesh.

blackadde said:
of COURSE your ****ing code is going to be tailored to the goddamn architecture you're targeting.

Well, that's how it should be, but... Say, did you ever play the MGS2 port for the Xbox?
 
blackadde said:
of COURSE your ****ing code is going to be tailored to the goddamn architecture you're targeting.

I seem to have struck a nerve here. I didn't mean to upset everyone.

Only the first party developers will be tailoring their code specifically for one architecture only... But that's nothing new. It still doesn't get around the fact that the Cell isn't the "be all and end all" of physics, AI, or non-graphics processing. It's the opposite, it's good at graphics, but there are huge limitations with the design that cause it to suffer in other places... so much so that I would say the PS3 version of it (7SPEs and 1 PPE) isn't as fast/efficient/good as a high end Core2Duo in real-world performance (which is the only performance that matters).

Is it far better than PS2? YES, even with these limitations it will be way better at ALL areas. Is it better than a Core2Duo? Not so much.

As far as I'm concerned, I'll take Carmack's/Hannibal's word over most GAFers's any day.
 
_leech_ said:
Carmack's a smart guy, but when it comes to consoles he's as brain dead as any other PC developer.
No.
And he's not a PC-only developer. He's making a 360/PS3/PC title atm with the 360 as lead platform if I'm not mistaken.
 
-Rogue5- said:
Bullshit because Sony fanboys don't want to hear it, or bullshit because you think John Carmack doesn't know what he's talking about?

Read Carmack's QuakeCon 2005 Keynote speech here. Of particular interest are the Console Development and Physics and AI sections, but I think you'll find this quote is sufficient enough:

"But if you look at the current platforms, in many ways, it’s not quite as powerful as it sounds if you add up all the numbers and flops and things like that. If you just take code designed for an x86 that’s running on a Pentium or Athlon or something, and you run it on either of the PowerPCs from these new consoles, it’ll run at about half the speed of a modern state of the art system, and that’s because they’re in-order processors, they’re not out-of-order execution or speculative, any of the things that go on in modern high-end PC processors. And while the gigahertz looks really good on there, you have to take it with this kind of “divide by two” effect going on there."

So I guess I was BSing when I said 25% off on real-world performance... I should have said 50% off.

I also suggest you read this entire article at Ars Technica. Even just the last page will do...particularly this paragraph:

"At any rate, Playstation 3 fanboys shouldn't get all flush over the idea that the Xenon will struggle on non-graphics code [NOTE: this means AI and Physics code]. However bad off Xenon will be in that department, the PS3's Cell will probably be worse. The Cell has only one PPE to the Xenon's three, which means that developers will have to cram all their game control, AI, and physics code into at most two threads that are sharing a very narrow execution core with no instruction window. (Don't bother suggesting that the PS3 can use its SPEs for branch-intensive code, because the SPEs lack branch prediction entirely.) Furthermore, the PS3's L2 is only 512K, which is half the size of the Xenon's L2. So the PS3 doesn't get much help with branches in the cache department. In short, the PS3 may fare a bit worse than the Xenon on non-graphics code, but on the upside it will probably fare a bit better on graphics code because of the seven SPEs."

Note that the article is an inside look at the 360's CPU - Xenon - but the author (who isn't a fanboy) felt it necessary to point out that the PS3 is an even worse victim of the same pitfalls.


If you're talking about the AGEIA PhysX processing unit, you should know that it is NOTHING like the CELL. In fact, the PhysX chip is more like a GPU than anything else (which is why ATi have announced that their GPUs can be used as standalone Physics processors and nVidia has signed a deal with Havok to make middleware that supports their GPUs). Plus the PhysX chip has 128 MB of DEDICATED (as in, onboard) RAM -- as such the amount of cache on the die is moot.

... did you think I was just making this stuff up or something? Do you actually think next-gen consoles are substantially more powerful than High-End PCs? What about you Chiggs/Lynux3?

Your argument is misguided. Yes, Carmack approves of pc architecture more than he does of in-order console cpus which are designed with bang for the buck in mind. You must remember that Carmack's home is the pc environment. He is probably used to writing branchy, stringy code into his games because pc cpus afford him that luxury. Just because you have to code differently for consoles does not imply that consoles are underpowered. Carmack's bias figures nicely into his point-of-view on this matter. It is in his best interest to keep the architecture in-line with how he likes to code.

Yes, these consoles (X360 and PS3) will not run branchy, stringy code as well as the pc architecture. They never had that goal in mind in the first place. They had console developers in mind who like to push an architecture to the breaking point with down-to-the-metal code. PC = open-box environment. Console = closed-box. The transistor budget is configured for maximum oomph in a very specific closed-box environment--not maximum efficiency for code that needs to be flexible and scalable in an open-box environment.

Most anything that can be done with loops can be done without loops, as well. It just takes a lot more code and a different mode of thinking than what most programmers are used to. This code can be synthesized with certain types of compilers/decompilers. Programmers are going to need to learn new stuff this generation to deal with parallelism because it's not going away. It is just going to get more ridiculous as time goes on.
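
As a rough illustration of doing loop work without a loop, the sketch below writes a fixed-size dot product both ways. The function names are hypothetical and Python only shows the shape of the transformation. The payoff (no loop counter, no loop-closing branch) matters on the in-order cores being discussed, not in an interpreter.

```python
def dot4_loop(a, b):
    """Four-element dot product written as an ordinary loop."""
    total = 0
    for i in range(4):
        total += a[i] * b[i]
    return total


def dot4_unrolled(a, b):
    """Same computation with the loop fully unrolled.

    There is no counter and no branch back to the top of the loop,
    at the cost of more code for every extra element.
    """
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]
```

This only works as written because the trip count is known ahead of time, which is the kind of fixed-size, repetitive work being described here. Generating the unrolled form automatically is a standard compiler transformation (loop unrolling).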
 
Wollan said:
Would be nice if anyone could try to give a comparison, theory or thoughts between the two cpu's for game-applications.

OMG what... have... you... done? o_0

In that direction lies nothing but madness and the howling cries of fanboys.
 
Chiggs said:
Why don't we hold off on saying stuff like that until we see the original IP he's working on--with the Xbox 360 as the lead platform. We should at least give the man that. Sheesh.

Well it'd be a lot easier to focus on his console work if he would talk about that instead of bitching and crying about everything. He's almost as annoying as Gabe Newell.
 
Moderation Unlimited said:
Your argument is misguided. Yes, Carmack approves of pc architecture more than he does of in-order console cpus which are designed with bang for the buck in mind. You must remember that Carmack's home is the pc environment. He is probably used to writing branchy, stringy code into his games because pc cpus afford him that luxury. Just because you have to code differently for consoles does not imply that consoles are underpowered. Carmack's bias figures nicely into his point-of-view on this matter. It is in his best interest to keep the architecture in-line with how he likes to code.

Yes, these consoles (X360 and PS3) will not run branchy, stringy code as well as the pc architecture. They never had that goal in mind in the first place. They had console developers in mind who like to push an architecture to the breaking point with down-to-the-metal code. PC = open-box environment. Console = closed-box. The transistor budget is configured for maximum oomph in a very specific closed-box environment--not maximum efficiency for code that needs to be flexible and scalable in an open-box environment.

Most anything that can be done with loops can be done without loops, as well. It just takes a lot more code and a different mode of thinking than what most programmers are used to. This code can be synthesized with certain types of compilers/decompilers. Programmers are going to need to learn new stuff this generation to deal with parallelism because it's not going away. It is just going to get more ridiculous as time goes on.

Understandably so, but that doesn't detract from the hardware (transistor) shortcomings mentioned in the Ars Technica article. You can't stream that amount of information and there isn't enough storage to hold massive amounts of it. You can't compare 4megs of L2 cache shared between only two cores and 512k cache shared between 7SPEs and 1 PPE...even with massive amounts of programming wizardry.

And all of this is beside the point -- the fact of the matter is, I think Core2Duo will perform better. By the time developers get used to the Cell to harness even 80% efficiency, there will be Quad-Core Conroes available... maybe even with SMT (which means eight threads). As an architecture, Conroe is superior.
 
aaaaa0 said:
OMG what... have... you... done? o_0

In that direction lies nothing but madness and the howling cries of fanboys.

_leech_ said:
Well it'd be a lot easier to focus on his console work if he would talk about that instead of bitching and crying about everything. He's almost as annoying as Gabe Newell.

See what I did there?
 
Most programmers for any platform will agree that there is a lot of power in each of the next-gen consoles (PS3 and X360). Tapping it with their current mode of thinking is going to be a pain in the ass. They are going to have to spend more time doing research and development and relearning how to do things different ways. No one wants to do this, but we've almost hit a wall with the traditional architecture for processing power, as physics is giving us a hard time going faster and putting more transistors in the same amount of space.
 
Moderation Unlimited said:
It's quite similar in many regards, but quite different in many, too (how bandwidth is distributed, memory architecture, in-order vs. out-of-order).

Yeah and Carmack addresses these things in his speech.
 
Moderation Unlimited said:
Most programmers for any platform will agree that there is a lot of power in each of the next-gen consoles (PS3 and X360). Tapping it with their current mode of thinking is going to be a pain in the ass. They are going to have to spend more time doing research and development and relearning how to do things different ways. No one wants to do this, but we've almost hit a wall with the traditional architecture for processing power, as physics is giving us a hard time going faster and putting more transistors in the same amount of space.

Conroe proves that bolded statement wrong. You're getting architecture and multiprocessor/multicore coding confused. Multicore x86s (like Conroe) are difficult to code for because they're SMP, but their architecture is not, because they're x86 -- Xenon's and Cell's architecture AND multicore designs are both hard(er) to program for... particularly the CELL's, because it uses the SPEs rather than three symmetric cores like the Xenon's PPEs. Not only that, but Microsoft is far ahead of Sony when it comes to development suites.

Not once did I say the 360 or PS3 were weak/crappy... Both are going to be awesome and do some pretty amazing things. I'm saying that, as an architecture, Conroe is better than Cell. Even if you learn to take full advantage of the CELL, I think Conroe is a better architecture... Both are scalable, both are multicore, Cell may be more difficult to program for, but all of that is beside the point.

As for the whole physics and AI performance of the CELL (or lack thereof), that was just a side note to prove I wasn't talking out my ass and that everything I said was substantiated.
 
-Rogue5- said:
As an architecture, Conroe is superior for a general purpose pc.

Fixed.

There seems to be this misconception that game applications are heavily general purpose, and they are not. There is a relatively smaller amount of data being worked on in a repetitive fashion, let's say, as compared with a program like Microsoft Word. MS Word is highly reliant on stringy, branchy code because of all of the random possibilities that can occur. Things are not so random in games. What appears as randomness is either scripted, canned, or simulated. Developers can do a lot more than you think with what is available on the CELL and Xenon both.

Another thing, threads have nothing to do with capacity. Rather, they have to do with parallelism. Having more hardware threads means that more programs can be run independently of one another. The amount of power each thread possesses is something altogether different. In fact, not all threads are created equal. Two single threaded processors would be much more effective than a single dual-threaded processor at running the same code. The single dual-threaded processor effectively splits its resources in half for each thread, and thus its output.

I did not mistake the difference between physical architecture and coding architecture. They are related on many levels. If you want to get more power out of a certain number of transistors, you configure them to do a specific kind of work very fast (in almost a DSP-like fashion). You sacrifice general purpose extensions, long-word instruction sets, and branch prediction for short, repetitive reduced instruction set code. You, the programmer, work around the physical architecture in a closed-box system, not the other way around. If maximum computing power for the least amount of money is the ultimate goal of your architecture (consoles), making some sacrifices is a must. Will the maximum potential of the PS3's Cell (218 Gflops) and X360's Xenon (115 Gflops) be greater for game playing as compared with Conroe's Quad Core? I bet you the Conroe Quad Core might hit 50 Gflops on a good day. Once devs get good at in-order programming, Conroe won't be able to compare.
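
For anyone who wants to sanity-check Gflops figures like these, theoretical peak is just cores x clock x floating-point operations per core per cycle. The sketch below plugs in assumed hardware numbers that are not from this thread: 3.2 GHz and 8 single-precision FLOPs per cycle per SPE (a 4-wide fused multiply-add), and a hypothetical 2.66 GHz quad-core Conroe-class part doing 8 single-precision FLOPs per cycle per core (one 4-wide SSE add plus one 4-wide SSE multiply). Treat every number as an assumption.

```python
def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in GFLOPS: cores x clock (GHz) x FLOPs per core per cycle."""
    return cores * clock_ghz * flops_per_cycle


# Assumption: 7 usable SPEs at 3.2 GHz, 8 single-precision FLOPs per cycle each.
cell_spes = peak_gflops(cores=7, clock_ghz=3.2, flops_per_cycle=8)

# Assumption: hypothetical quad-core Conroe-class chip at 2.66 GHz,
# 8 single-precision FLOPs per cycle per core.
quad_conroe = peak_gflops(cores=4, clock_ghz=2.66, flops_per_cycle=8)

print(f"Cell SPEs only: {cell_spes:.1f} GFLOPS")
print(f"Quad Conroe:    {quad_conroe:.1f} GFLOPS")
```

The SPE-only figure lands a bit under the 218 quoted above, presumably because that number also counts the PPE. Either way these are paper peaks: they say nothing about how much of that a real game sustains, which is what the rest of this argument is about.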
 
-Rogue5- said:
Read Carmack's QuakeCon 2005 Keynote speech here.
The quote is not particularly relevant. The 2 console PPCs are designed around using hw-threads to improve execution scheduling.
The notion that you take single-threaded code to them and complain about how it runs is similar to disabling OOOe on your x86 and wondering what happened to the performance.

That said, in-order execution is actually the least of your worries on those PPC cores - there are bigger and far more concerning things that affect performance, but that's a whole other can of worms.

I'm saying that, as an architecture, Conroe is better than Cell.
Would help if you explained what you mean by architecture.
I sincerely doubt Conroe would be a cost competitive solution in a game console (for current gen anyway) for instance, but I'm not gonna claim this says anything about architectural superiority (or lack thereof) - it's just part of respective design targets.
 
-Rogue5- said:
Conroe proves that bolded statement wrong. You're getting architecture and multiprocessor/multicore coding confused. Multicore x86s (like Conroe) are difficult to code for because they're SMP, but their architecture is not, because they're x86 -- Xenon's and Cell's architecture AND multicore designs are both hard(er) to program for... particularly the CELL's, because it uses the SPEs rather than three symmetric cores like the Xenon's PPEs. Not only that, but Microsoft is far ahead of Sony when it comes to development suites.

Symmetrical, OOOe, CISC processors will never output as much power as a configuration like CELL (Asymmetrical, In-order, RISC) will output because of the transistor overhead. It is not possible given the constraints of physics. On a transistor to transistor basis, an Asymmetrical configuration (like above) will physically allow for more power.

Harnessing the power mentioned above is a different story. It has to do with factors such as relearning to code differently, which is based on developers' willingness to do so. If people are unwilling to do something because they are unwilling to invest the time and money, then that is not a function of available power afforded by the physical architecture.
 
Fafalada said:
Would help if you explained what you mean by architecture.
I sincerely doubt Conroe would be a cost competitive solution in a game console (for current gen anyway) for instance, but I'm not gonna claim this says anything about architectural superiority (or lack thereof) - it's just part of respective design targets.

Thank you, Mr. Developer, for backing my argument. Bang for the buck (raw fp power-wise), Xenon and CELL clean house compared to a Conroe. A general purpose pc chip cannot be compared with a closed-box console chip. The pc chip is going to be better for what it was designed for and the console chip is going to be better for what it is designed for. For games, the console chip is going to be better.
 
jett said:
I really thought Cell wasn't going to turn out to be the Emotion Engine 2. I think even IBM has no plans to use the thing in the future. :P

Better luck next time Kutaragi, if there is a next time for you.

WHAT?!?! So nobody else will be using the CELL processor? Okay.
 
Count Chocula said:
oh man that 4D joke never gets old. I'm still dying from the first time someone made that joke in this thread!

:lol

that is no joke son.

the ps3 will take console gaming out of the console ghetto and BLAST it into the 4d world



I BELIEVE
 
mckmas8808 said:
WHAT?!?! So nobody else will be using the CELL processor? Okay.

I called this a long time ago. Just as no one used the ps2 vector units even though everyone wanted to rave about the ****ing things forever ago.
 
the.tutelary said:
I called this a long time ago. Just as no one used the ps2 vector units even though everyone wanted to rave about the ****ing things forever ago.


You called what exactly? Explain please.
 
jett said:
I think even IBM has no plans to use the thing in the future.

jett said:
I bolded it for you so you take notice.

This isn't quite correct. Aside from announced plans which will continue to be rolled out in the future, IBM (and others) seem to have their eye on it for HPC (high performance computing, aka supercomputers). IBM has said it will be used "later" in that area, it was reported a while ago that Stanford is building a Cell-based supercomputer, and Berkeley has a glowing evaluation of its potential for HPC, even in its current non-double-precision-optimised state.

Added on top of its existing foray into other markets, it seems nonsense to cast its success beyond Playstation as being as limited as EE's. It already enjoys much greater success than EE ever did, and it seems there's much more in its future.
 
the.tutelary said:
I called this a long time ago. Just as no one used the ps2 vector units even though everyone wanted to rave about the ****ing things forever ago.

So let me get this straight, you actually think any modern PS2 game actually has the vector units sitting idle the whole time?






:lol
 
gofreak said:
This isn't quite correct. Aside from announced plans which will continue to be rolled out in the future, IBM (and others) seem to have their eye on it for HPC (high performance computing, aka supercomputers). IBM has said it will be used "later" in that area, it was reported a while ago that Stanford is building a Cell-based supercomputer, and Berkeley has a glowing evaluation of its potential for HPC, even in its current non-double-precision-optimised state.

Added on top of its existing foray into other markets, it seems nonsense to cast its success beyond Playstation as being as limited as EE's. It already enjoys much greater success than EE ever did, and it seems there's much more in its future.

I guess we'll have to wait and see just how much cell will actually be used on future applications. :P
 
jett said:
I guess we'll have to wait and see just how much cell will actually be used on future applications. :P


Are you going to be realistic about your expectations or are you going to be wildly crazy about it?
 
Count Chocula said:
I would venture to say that Cell is more efficient than any X86 architecture. Especially at gaming applications.
Efficiency is how much you get out for what you put in. That pretty much makes cell the least efficient chip ever and conroe the most efficient chip ever.
 
elostyle said:
Efficiency is how much you get out for what you put in. That pretty much makes cell the least efficient chip ever

If you're talking about programmer effort, well, not really..

Efficiency can be, and is, measured against a number of requirements also, like power (wattage), cost etc. etc.
 
elostyle said:
Efficiency is how much you get out for what you put in. That pretty much makes cell the least efficient chip ever and conroe the most efficient chip ever.

I was thinking more along the lines of automatic vs. manual transmission.
In most cases manual is more efficient.

But yeah if you're talking about someone who doesn't know how to drive then automatic would probably give better output.
 
Count Chocula said:
I was thinking more along the lines of automatic vs. manual transmission.
In most cases manual is more efficient.

But yeah if you're talking about someone who doesn't know how to drive then automatic would probably give better output.
Right. Wouldn't the right word be "effective"? :)

It's a matter of what you look at. If it is possible computing power per transistor then yes, cell is more efficient. If it is programmer time/compiler complexity for achieved power then conroe is more efficient.
 
elostyle said:
If it is programmer time/compiler complexity for achieved power then conroe is more efficient.

I'd say that'd vary with the application domain, programmer experience etc. If that's measured as a ratio between effort and performance achieved, then Cell could offer a lot here. I mean, look at those benchmarks we had where Cell was in some cases dozens of times faster than the PPC and P4s it was compared against - did the Cell implementation require dozens of times the effort? (And it would necessitate more effort still before that ratio would look bad for Cell). Not to mention instances where a certain level of performance is a requirement in and of itself...it might be doable on Cell with a lot of effort, but on other chips it might not be doable at all. Or instances where performance is so important that it does not matter so much if the effort required is out of proportion with the gain.

So I think it would very much depend, that's not as black and white as you might think.
 
elostyle said:
Right. Wouldn't the right word be "effective"? :)

It's a matter of what you look at. If it is possible computing power per transistor then yes, cell is more efficient. If it is programmer time/compiler complexity for achieved power then conroe is more efficient.

No, the word is still efficient. Manual transmissions are more efficient than automatic. More fuel efficient and more power efficient (when you know how to drive it properly).
 
Count Chocula said:
No, the word is still efficient. Manual transmissions are more efficient than automatic. More fuel efficient and more power efficient (when you know how to drive it properly).
Well then conroe is still more programmer efficient while cell is more transistor efficient (in theory).
 