CELL PROCESSOR AND PS3 details

Shapingo

Member
http://www.eet.com/semi/news/showArticle.jhtml?articleId=54200580

SAN FRANCISCO — The eagerly anticipated Cell processor from IBM, Toshiba and Sony leverages a multicore 64-bit Power architecture with an embedded streaming processor, high-speed I/O, SRAM and dynamic multiplier in an effort, the partners hope, to revolutionize distributed computing architectures.

Although the technical aspects of the design, which has been in the works for nearly four years, are tightly held, details are emerging in excerpts from papers to be released today for the 2005 International Solid-State Circuits Conference (see story, page 94), as well as in patent filings.

The highly integrated Cell device has been billed as a beefy engine for Sony's Playstation 3, due to be demonstrated in May. But the architecture also addresses many other applications, including set-top boxes and mobile communications. Workstations fitted with the Cell architecture — a $2 billion endeavor — are already in the hands of game developers.




Five ISSCC papers from members of the 400-strong Cell processor team (see related story, "Best Development Teams," page 64) open peepholes onto a highly modular and hierarchical first-generation device implemented in 90-nanometer silicon-on-insulator (SOI) technology.

At root, the Cell architecture rests on two concepts: the "apulet," a bundle comprising a data object and the code necessary to perform an action upon it; and the "processing element," a hierarchical bundle of control and streaming processor resources that can execute any apulet at any time.

The apulets appear to be completely portable among the processing elements in a system, so that tasks can be doled out dynamically by assigning a waiting apulet to an available processing element. Scalability can be achieved by adding processing elements.
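To make the apulet idea concrete, here is a minimal Python sketch of the dispatch model the article describes. Everything in it is an assumption for illustration: the `Apulet` and `ProcessingElement` classes, the `dispatch` function, and the round-robin choice of element (standing in for "whichever element is available") are hypothetical names and are not taken from any published Cell interface.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Apulet:
    """Hypothetical bundle: a data object plus the code that acts on it."""
    data: Any
    code: Callable[[Any], Any]


@dataclass
class ProcessingElement:
    """Hypothetical processing element that can execute any apulet."""
    name: str
    completed: int = 0

    def run(self, apulet: Apulet) -> Any:
        self.completed += 1
        return apulet.code(apulet.data)


def dispatch(apulets, elements):
    """Hand each waiting apulet to the next element in round-robin order."""
    waiting = deque(apulets)
    results = []
    turn = 0
    while waiting:
        element = elements[turn % len(elements)]
        results.append((element.name, element.run(waiting.popleft())))
        turn += 1
    return results


# Four elements share eight apulets; adding elements spreads the same work wider.
elements = [ProcessingElement(f"PE{i}") for i in range(4)]
apulets = [Apulet(data=n, code=lambda x: x * x) for n in range(8)]
results = dispatch(apulets, elements)
```

Because each apulet carries its own code and data, the dispatcher never needs to know what the work is, which is roughly the portability property the article attributes to the design.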

These ideas are not easily achieved. According to data from Paul Zimmons, a PhD graduate in computer science from the University of North Carolina at Chapel Hill, they require a highly intelligent way of dividing memory into protected regions called "bricks," careful attention to memory bandwidth and local storage, and massive bandwidth between processing elements — even those lying on separate chips.

At the top level, the architecture appears to be a pool of "cells," or clusters of perhaps four identical processing elements. All of the cells in a system — or for that matter, a network of systems — are apparently peers. According to one of the ISSCC papers on the Cell design, a single chip implements a single processing element. The initial chips are being built in 90-nm SOI technology, with 65-nm devices reportedly sampling.

Each processing element comprises a Power-architecture 64-bit RISC CPU, a highly sophisticated direct-memory access controller and up to eight identical streaming processors. The Power CPU, DMA engine and streaming processors all reside on a very fast local bus. And each processing element is connected to its neighbors in the cell by high-speed "highways." Designed by Rambus Inc. with a team from Stanford University, these highways — or parallel bundles of serial I/O links — operate at 6.4 GHz per link. One of the ISSCC papers describes the link characteristics, as well as the difficulties of developing high-speed analog transceiver circuits in SOI technology.

The streaming processors, described in another paper, are self-contained SIMD units that operate autonomously once they are launched.

They include a 128-kbyte local pipelined SRAM that sits between the stream processor and the local bus, a bank of one hundred twenty-eight 128-bit registers and a bank of four floating-point and four integer execution units, which appear to operate in single-instruction, multiple-data mode from one instruction stream. Software controls data and instruction flow through the processor.

Another ISSCC paper describes a dynamic Booth double-precision multiplier designed in 90-nm SOI technology.

Performance estimates


The processing element's DMA controller is so designed, it appears, that any chip in a system can access any bank of DRAM in the cell through a band-switching arrangement. This would make all the processing resources appear to be a single pool under control of the system software.

Giving scale to the performance targets for the project, one of the ISSCC papers puts the performance of the streaming-processor SRAM at 4.8 GHz. This suggests the data transfer rate for 128-bit words across the local bus within the processing element. When the Cell alliance was announced in 2001, Sony Computer Entertainment CEO Ken Kutaragi estimated the performance of each Cell processor (a collection of apparently four processing elements in the first implementation) at 1 teraflops.

But UNC's Zimmons has his doubts. "I believe that while theoretically having a large number of transistors enables teraflops-class performance, the PS3 [Playstation 3] will not be able to deliver this kind of power to the consumer," he wrote in response to an e-mail query from EE Times. "The PS3 memory is rumored to be able to transfer around 100 Gbytes/second, which would mean it could process new data at roughly 25 Gflops (at 32 bits) — far from the 1-Tflops number."
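Zimmons's figure appears to follow from a simple worst-case assumption: every floating-point operation needs one fresh 32-bit operand from memory. A small Python sketch of that arithmetic, where the function name and the one-operand-per-operation assumption are illustrative and not from the article:

```python
def bandwidth_bound_gflops(bandwidth_gbytes_per_s: float, operand_bits: int = 32) -> float:
    """Upper bound on operations per second if each one consumes a new operand from memory."""
    bytes_per_operand = operand_bits / 8
    return bandwidth_gbytes_per_s / bytes_per_operand


# Rumored 100 Gbytes/s of memory bandwidth, 32-bit operands.
estimate = bandwidth_bound_gflops(100.0)
print(estimate)  # 25.0
```

Code that reuses data held in registers or local storage would not be limited this way, so the 25 Gflops number is a floor for streaming new data rather than a ceiling on the chip.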

Sony's 300-mm fab at Nagasaki, Japan, will run the 65-nm process and IBM Corp.'s fab in East Fishkill, N.Y., the SOI line.
 
Details trickle out on Cell processor

By Brian Fuller Ron Wilson
EE Times
November 28, 2004 (10:01 PM EST)

Sorry I thought it was new
 
New tidbit?

The highly integrated Cell device has been billed as a beefy engine for Sony's Playstation 3, due to be demonstrated in May. But the architecture also addresses many other applications, including set-top boxes and mobile communications. Workstations fitted with the Cell architecture — a $2 billion endeavor — are already in the hands of game developers.
 
Yeah, that would only be true in the worst case, i.e. if every single instruction required a load or a store, and every single load/store caused a cache miss/flush.

Most CPUs don't operate like that.
 
http://www.nytimes.com/2004/11/29/t...n=cbba2839fd80203e&ei=5006&partner=ALTAVISTA1

Today, the Sony Corporation and its entertainment arm as well as I.B.M. and Toshiba will reveal some of the first details of the Cell chip, jointly developed by the three companies, that will form the basis of the next generation of PlayStation game consoles.

The chip, which is still being designed, has been one of the most guarded secrets in the entertainment, semiconductor and computing industries since the companies started work on it in 2001.

Industry analysts expect the new PlayStation to be released in late 2005 or early 2006. The advanced chip will include multiple processors versatile enough to provide richer video images, multiplayer gaming and the addition of still pictures, audio and other media, the companies and analysts said.

Sony plans to introduce high-definition televisions powered by Cell in 2006, while I.B.M. says the Cell chip has the potential to be included in other consumer electronics and computing products, the companies said.

I.B.M. and Sony also said they were testing a workstation driven by the Cell chip that will be used by video game makers and producers of special effects.

To handle all these functions, the Cell will have separate microprocessors that manage specific jobs simultaneously, as well as allow for data to be sent and received at high speeds over broadband lines. This is a step ahead of the current generation of chips that typically have one microprocessor and can manage more limited amounts of data and functions at a time.

"The Cell processor is probably the first major change in chip processors in 10 to 20 years because there are multiple brains" inside the chip, said Richard Doherty, the president and research director at the Envisioneering Group, a technology consultant in Seaford, N.Y. "With previous chip generations, communications was an afterthought. It was just about how fast it goes and never about the richness of the gasoline."

Though Sony has said little about the successor to the PlayStation 2, analysts expect it to range far beyond games. The current machine has a DVD player and offers access to the Internet to allow for multiplayer games online.

But with Cell inside, the new machine may also be able to download satellite television signals and connect to digital cable set-top boxes, Mr. Doherty and other analysts said. Because the machine is expected to include a hard drive, users may also be able to store photos and videos. The Cell-driven machine is expected to display video and pictures in high definition.

The workstations with Cell inside will allow video game developers and special effects producers to create products in a fraction of the time it takes now, the companies said. That could reduce the cost of making movies and video games, a potential boon for movie studios.
 
deadlifter said:
Zuh?

Have they said this before?
The Cell was always meant to be used as a cheap yet powerful generic processor for set-top boxes and other electronics. It's the whole reason Toshiba is one of the developers. I would not be surprised if higher end products such as big screen TVs start using Cell processors around the time the PS3 is released, and once some of the R&D cost is earned back it will start to be put into electronics with smaller profit margins.
 
Suranga3 said:
Now for the question we all want to know. Is nintendo and MS d00med as a result of sony using the "cell"?
The only thing these articles really give any specifics about is the memory bandwidth, which is at least 4 times greater than the bandwidth listed in the unconfirmed early XBOX 2 specs. Of course external memory bandwidth is a relatively small factor in determining the total power of many computer architectures.

In other words, we don't know anything more about the total power of the Cell than we knew yesterday.
 
PR is out. PDF download at http://www.gamefront.de

Sony: 'IBM, Sony, Sony Computer Entertainment Inc. and Toshiba Unveil Cell Processor Companies Released First Details of Multicore Chip Comprising Power Architecture and Synergistic Processor'

Sony: 'IBM, SONY AND SCEI POWER-ON CELL PROCESSOR-BASED WORKSTATION PROTOTYPE Workstation Provides Quantum Leap Advances in Creating Digital Entertainment Content'

Maybe a direct link works. Otherwise you have to visit their main page.

http://www.gfdata.de/gamefront-temp/sonycell1.pdf

http://www.gfdata.de/gamefront-temp/sonycell2.pdf
 
"Now for the question we all want to know. Is nintendo and MS d00med as a result of sony using the "cell"?"

If the CPU is easy to program with then yes. What are the odds of that?
 
February 6th to 10th, 2005, in San Francisco.

Alls i know is this is the date to expect specifications that pana and co. can translate for us.

edit: I just read the press release, and cell sounds impressive and all but i want to know how Sony's not going to sell PS3s at a massive loss.
 
deadlifter said:
i want to know how Sony's not going to sell PS3s at a massive loss.
Remember the PS2 and the Emotion Engine? It's the same scenario here, same technology and all. Price goes DOWN REALLY FAST, especially since Sony's manufacturing it themselves.
 
The Abominable Snowman said:
Remember the PS2 and the Emotion Engine? It's the same scenario here, same technology and all. Price goes DOWN REALLY FAST, especially since Sony's manufacturing it themselves.

So the Emotion Engine tech was finishing RIGHT as PS2 launched? Sorry, i wasn't caught up in the hoopla then.
 
The Abominable Snowman said:
Isn't the Cell just a bundle of R10000s like the Emotion Engine is a bundle of R5900?

Emotion Engine is 1 R5900 with 2 vector co-processors bolted on the side.

A Cell is probably a PowerPC with 8 vector co-processors bolted on the side.

A PS3 is probably 4 Cells. Or maybe more. Who knows.

Xenon is 3 dual-core PowerPCs glued together, and a GPU with 48 vector co-processors bolted onto it (as far as we know). Who knows for sure though.
 
In other words, we don't know anything more about the total power of the Cell than we knew yesterday.
There's the mention of 4.8GHz clock speed - which if true certainly tells me something new about the power of the chip that is to be used in PS3.
But I agree, still a whole lot of stuff that we need to learn. :P
 
Fafalada said:
There's the mention of 4.8GHz clock speed - which if true certainly tells me something new about the power of the chip that is to be used in PS3.
But I agree, still a whole lot of stuff that we need to learn. :P

New as in better or worse?
 
Gek54 said:
New as in better or worse?

4.8 ghz is pretty god-damn fast, unless it's just a small part of the chip.

(For example, aren't P4 ALUs double pumped, effectively 7.6 ghz (3.8ghz * 2)? Doesn't seem to help the rest of the CPU as much as you'd think though.)

Edit:

Giving scale to the performance targets for the project, one of the ISSCC papers puts the performance of the streaming-processor SRAM at 4.8 GHz. This suggests the data transfer rate for 128-bit words across the local bus within the processing element.

So it looks like the internal BUS for the scratchpad or cache memory runs at 4.8 ghz. If the bus is 128-bits, that means Cell's internal bandwidth is ~76.8 GB/s.

This is not outrageous. Pentium 4's bandwidth to its L2 cache is on the order of 120 GB/s in theory (256-bits @ 3.8 ghz). (Not that it reaches that in a realistic test.)
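For anyone wanting to check those two numbers, the calculation is just bus width in bytes times clock rate. A quick Python sketch, assuming one transfer per clock (the function name is made up for illustration):

```python
def bus_bandwidth_gbytes(width_bits: int, clock_ghz: float) -> float:
    """Theoretical bandwidth in GB/s for a bus moving width_bits once per clock."""
    return width_bits / 8 * clock_ghz


cell_local = bus_bandwidth_gbytes(128, 4.8)  # 128-bit path at 4.8 GHz
p4_l2 = bus_bandwidth_gbytes(256, 3.8)       # 256-bit path at 3.8 GHz

print(f"Cell local store: {cell_local:.1f} GB/s")
print(f"Pentium 4 L2:     {p4_l2:.1f} GB/s")
```

Both are theoretical peaks; neither accounts for stalls or contention.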
 
So it looks like both the Xbox2 and PS3 processors will be in 90nm after all, after all the 65nm hoopla.

Also, both are using SOI, and both using 300mm wafers. Sounds like they may even be manufactured at the same IBM plant (East Fishkill NY).
 
Rhindle said:
So it looks like both the Xbox2 and PS3 processors will be in 90nm after all, after all the 65nm hoopla.

Also, both are using SOI, and both using 300mm wafers. Sounds like they may even be manufactured at the same IBM plant (East Fishkill NY).

My gut feeling is Sony is going to Plan B in order to launch as soon after Xenon as possible.

Plan B is a cut down PS3 with a half or quarter tflops @ 90 nm.

But heck, this is just random ass speculation.
 
aaaa0 said:
4.8 ghz is pretty god-damn fast, unless it's just a small part of the chip.
Exactly - but while vague, the quote refers to local storage SRAM. Those are the parts that usually run at the same, or lower, clock than the rest of the chip; I don't recall seeing one where they'd actually be faster yet.
So if this is true, I would be inclined to assume the APUs at least will run at 4.8Ghz as well (considering previous projections were aiming for 4ghz chip, it's not completely outrageous if they managed to shoot a bit higher - it happened with EE before).

So it looks like the internal BUS for the scratchpad or cache memory runs at 4.8 ghz. If the bus is 128-bits, that means Cell's internal bandwidth is ~76.8 GB/s.
That's per APU - aggregate bandwidth is 76.8 * 8 ~ 614GB/s. Assuming single cycle read/write (analogously to VUs, which people very much like to compare to APUs :)), double that.

Now, 1200GB/s is a bit closer to realms of outrageous :P
And that'd be just over a quarter of a TFlops too, like you suggested.
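Spelled out in Python, the aggregate figure looks like this. It assumes 8 APUs per processing element, a 128-bit local path at 4.8 GHz for each, and (for the doubled number) a simultaneous read and write every cycle; none of these is confirmed, and the names are illustrative only.

```python
def aggregate_local_bandwidth_gbytes(width_bits: int, clock_ghz: float,
                                     apus: int, read_and_write: bool = False) -> float:
    """Sum of per-APU local-store bandwidth in GB/s, optionally counting read plus write."""
    per_apu = width_bits / 8 * clock_ghz
    total = per_apu * apus
    return total * 2 if read_and_write else total


one_way = aggregate_local_bandwidth_gbytes(128, 4.8, apus=8)
both_ways = aggregate_local_bandwidth_gbytes(128, 4.8, apus=8, read_and_write=True)

print(f"{one_way:.1f} GB/s one way, {both_ways:.1f} GB/s read plus write")
```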
 
Developers Working On PS3 WorkStations

26/11/2004
By: Chris Leyton


A US report alleges that PS3 kits have already been sent out to key developers...

In an extensive look at how chip manufacturers are working towards the future and next-generation of computer systems, the EE Times has alleged that a dozen or so “key videogame companies” are in possession of workstations based around the CELL processor.

If true, this is the clearest indication yet that developers are already beginning to plan and test software for the Playstation3 and have moved away from the high-spec PCs that many have begun preliminary work upon. Certainly the next-generation titles that TVG have been lucky enough to catch a glimpse of at this stage have all been developed on high-spec PCs, scaled towards a possible Playstation3 specification sent forward by Sony.



Tecmo must have been left out............
 
Considering each APU provides 8 flops/cycle...@ 4.8Ghz that would make 38.4 GFlops per APU...8 APUs=307.2 GFLOPS *4 PE=1.2 TFLOPS...And this still without considering the GPU part which I think won't be totally useless this time (in terms of post-processing). Sounds crazy...of course these are peak figures...but sounds crazy.


I would wait to see if this is feasible...It would mean a true revolution in how games are conceived for sure...
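The peak-figure chain above is easy to reproduce. A short Python sketch, where the 8 flops per cycle, 8 APUs per PE and 4 PEs are the thread's guesses rather than confirmed specs, and `peak_gflops` is a made-up helper name:

```python
def peak_gflops(flops_per_cycle: int, clock_ghz: float, apus_per_pe: int, pes: int) -> float:
    """Theoretical peak in GFLOPS: every unit issuing its maximum every cycle."""
    return flops_per_cycle * clock_ghz * apus_per_pe * pes


per_apu = peak_gflops(8, 4.8, apus_per_pe=1, pes=1)
per_pe = peak_gflops(8, 4.8, apus_per_pe=8, pes=1)
four_pes = peak_gflops(8, 4.8, apus_per_pe=8, pes=4)

print(f"per APU: {per_apu:.1f} GFLOPS")
print(f"per PE:  {per_pe:.1f} GFLOPS")
print(f"4 PEs:   {four_pes / 1000:.2f} TFLOPS")
```

Changing the clock argument to another guess, such as 4.6, shows how sensitive the headline number is to that one assumption.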
 
Sounds crazy...of course these are peak figures...but sounds crazy.
Well it would more or less match the throughput of the entire Xenon with 1PE-CPU alone - if true. Personally I'm not worried about getting good performance on graphics processing with this - but we're still rather in the dark about many aspects that pertain to general purpose code.
And of course, the pretty much complete blackout in terms of what the GPU will do or not do :|
 
Sounds really impressive.

If this architecture turns out really well, i hope Sony release and support a line of home and business computers based on it. Finally some true competition to Windows and Intel/AMD ?

Sony PCstation :)
 
All I want to know is if this Cell Processor is gonna be the equivalent of the Emotion Engine that powered the PS2 and was a year later bested in performance by a Nintendo hybrid PowerPC/Gekko chipset.
 
Fafalada said:
Well it would more or less match the throughput of the entire Xenon with 1PE-CPU alone - if true. Personally I'm not worried about getting good performance on graphics processing with this - but we're still rather in the dark about many aspects that pertain to general purpose code.
And of course, the pretty much complete blackout in terms of what the GPU will do or not do :|

If 1PE-CPU could match the whole of Xenon, then maybe Sony won't have a GPU - maybe it'll all be done in software?
 
huzkee said:
All I want to know is if this Cell Processor is gonna be the equivalent of the Emotion Engine that powered the PS2 and was a year later bested in performance by a Nintendo hybrid PowerPC/Gekko chipset.

Ehhrr...The EE is far better than the IBM PPC that the GC has. In fact, when it comes to floating-point performance, it beats it by a long margin. The EE was designed to deliver huge FP performance. Its main problems come from the fact that the MIPS core is crap (for several reasons people like Faf can elaborate on better than me).
 
Can you see that thing? It’s way, way off in the distance.. So far in fact, it might not even be real. So far, it will take years before it’s put in a little black box, made ready for games and is available in stores.

PS3 – the whole thing is so far behind it’s becoming something of a joke. Every developer you speak to just shrugs and mutters something about Xbox 2. In answer to Milhouse, we should be seeing phase 2 technical demonstrations right now. Instead we get blah, blah, chip thinness, blah.

Instead of trying to perform magic, SCE should do what Microsoft is doing: Get a computer and a graphics card and put it in a box that goes into the telly and is connected to t’Interweb.
 
Folder said:
Instead of trying to perform magic, SCE should do what Microsoft is doing: Get a computer and a graphics card and put it in a box that goes into the telly and is connected to t’Interweb.

Or, they could try and push things forward in ways better equipped for playing 3D games and dealing with the way 3D data needs to be handled efficiently.

I don't know if they will succeed, but you have to admire their focus.
 
Folder said:
Can you see that thing? It’s way, way off in the distance.. So far in fact, it might not even be real. So far, it will take years before it’s put in a little black box, made ready for games and is available in stores.

PS3 – the whole thing is so far behind it’s becoming something of a joke. Every developer you speak to just shrugs and mutters something about Xbox 2. In answer to Milhouse, we should be seeing phase 2 technical demonstrations right now. Instead we get blah, blah, chip thinness, blah.

Instead of trying to perform magic, SCE should do what Microsoft is doing: Get a computer and a graphics card and put it in a box that goes into the telly and is connected to t’Interweb.

Sony are going by their own schedule - it's always been a 2006 machine. Microsoft is pushing things earlier. We certainly shouldn't be seeing tech demos by now. May seems right if they're aiming for 2006.

And you invest too much in vague reports based on comments from just a handful of developers re. Xbox2 vs PS3. Of course developers are going to know more about Xbox2 right now - the thing will have been released this time next year! They'll all know about PS3 soon enough.

edit - also, if they can do it on 90nm, great, but if they're sampling 65nm now, it's possible PS3 chips could be manufactured on that process.
 
Am I the only one reading the press release and dropping my jaw at the Cell Workstation performance?

16TFLOP!

Now that is fricking ridiculous.

I honestly don't see the PS3 as being 1/32 of the power of the Workstation. 1/16 sounds reasonable enough, no?
 
Wasn't Sonys GSCube 'workstation' roughly 16x the performance of the PS2?

If so wouldn't that indicate a similar ratio for workstation:console performance?

So is a 1Tflop PS3 looking likely after all?
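As a rough check on that reasoning, here is the ratio arithmetic in Python. The 16 TFLOPS input is the press release's expected one-rack figure, and the 16x and 32x ratios are only the guesses floated in this thread; the helper name is made up.

```python
def console_estimate_tflops(workstation_tflops: float, ratio: float) -> float:
    """Console performance implied by a given workstation-to-console ratio."""
    return workstation_tflops / ratio


at_16x = console_estimate_tflops(16.0, 16)  # GSCube-style 16:1 ratio
at_32x = console_estimate_tflops(16.0, 32)  # more pessimistic 32:1 ratio

print(at_16x)  # 1.0
print(at_32x)  # 0.5
```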
 
the original GSCube was 16x the PS2 because it was basically 16 PS2s glued together :p there was a later version with 64 GS-chips that was even more powerful.

However, the GSCube came after the PS2, and while it was designed to do the kind of stuff Sony is intending the Cell WS to do, they're not really comparable.

But still, it seems odd that you could cram 16TFLOP into one rack-mount and then not be able to get 1TFLOP into a console 18 months later.
 
Freeburn said:
Wasn't Sonys GSCube 'workstation' roughly 16x the performance of the PS2?

If so wouldn't that indicate a similar ratio for workstation:console performance?

So is a 1Tflop PS3 looking likely after all?

Some people are pulling 1TFlop and 1.2TFlop estimates out of their calculations, but I'm not sure how they're deriving those numbers. If someone can explain, please feel free!

Also, although the ISSCC won't be happening till Feb, the papers might be released earlier, so maybe we'll get more info sooner.

We need Pana :(
 
Shrike_Priest said:
But still, it seems odd that you could cram 16TFLOP into one rack-mount and then not be able to get 1TFLOP into a console 18 months later.

I don't think they have a 16TFlop one rack workstation right now:

"The companies expect that a one rack Cell processor-based workstation will reach a performance of 16 teraflops or trillions of floating point calculations per second."

That could mean anything - could mean they expect second or third generation one-rack solutions to reach that number. I wonder what kind of workstations are actually out there right now..

edit - how safe is it to assume that the APUs will be clocked at least as fast as the SRAM?
 
011bl.jpg


http://pcweb.mycom.co.jp/news/2004/11/29/011.html << any translations?

So the APUs are clocked at 4.6Ghz?

So if there are 4 PEs with 8 APUs, that'd be 8 flops*4.6Ghz*8APUs*4PEs = 1.1776TFlops?!?

Also, doesn't 6.4GB/s for off-chip communication seem low? Of course, this is "only" the first generation cell, and perhaps in a different application (i.e. PS3), off-chip memory bandwidth may be more of a concern...(?)

An 85 degree Celsius operating temperature also seems *mightily* hot. Which makes me wonder about that clock speed and its relevance to current workstations, and possibly PS3 (although PS3 will be a more mature version of this, possibly even on 65nm, so the heat may no longer be a concern).

And again, we need Pana. And more Fafalada input. Please? :)
 
Hidden away somewhere, Deadmeat's heart just sank through the floor :lol
 
An 85 degree Celsius operating temperature also seems *mightily* hot. Which makes me wonder about that clock speed and its relevance to current workstations, and possibly PS3 (although PS3 will be a more mature version of this, possibly even on 65nm, so the heat may no longer be a concern).

Shrinking the manufacturing process should cool things...in theory. I say in theory because there is a well-known example where it doesn't happen.

P4 Northwood @ 130 nm vs P4 Prescott @ 90 nm.

Everybody (AMD, IBM, Sony, Intel) is having problems with the 90nm process...I doubt it's so easy to go to 65 nm without problems...
Anyway, I am not an expert in these kinds of things.
 