
First Cell demo (MPEG2 decoding)

gofreak

GAF's Bob Woodward
Well, might not be what we're hoping for - no game demos - but Toshiba provided a first demo of Cell at the Cool Chips conference that took place last week.

Basically, using an 8-SPE chip, they had 48 standard definition MPEG2 movies decoding and running simultaneously.

To quote one poster at Beyond3D:

"Toshiba's demo is
1. Load 48 SDTV-resolution MPEG2 streams from HDD simultaneously then decode them with 6 SPEs
2. Another SPE resizes them to thumbnails, then displays them tiled on a 1920x1080 screen. (The remaining 1 SPE is idle throughout the demo) "

Basically, each SPE decodes 8 streams simultaneously. What's perhaps even more interesting is that they did this using software that allowed the programmers to code this without worrying about which SPE does what: their software automatically split everything up for them (though this is perhaps easier to do with an application like this than in general).
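
As a rough illustration of the arithmetic in the quoted description, the sketch below deals 48 streams round-robin across 6 decoder workers. The function name and the round-robin policy are assumptions made for illustration only; Toshiba did not disclose how its software actually distributes work across SPEs.

```python
def assign_streams(num_streams, num_decoders):
    """Deal stream ids round-robin across decoder ids (hypothetical policy)."""
    assignment = {d: [] for d in range(num_decoders)}
    for stream in range(num_streams):
        assignment[stream % num_decoders].append(stream)
    return assignment


# Figures from the quoted demo description: 48 streams, 6 decoding SPEs.
assignment = assign_streams(48, 6)
streams_per_spe = {d: len(s) for d, s in assignment.items()}
print(streams_per_spe)
```

Any even split gives the same count per SPE; round-robin is just the simplest policy to write down.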

Pics:

up34421.jpg


up34422.jpg


They didn't disclose the clockspeed of the chip doing this. Anyone know how this might compare to a regular PC CPU?
 
Damn. Well, I don't know how PC CPUs/GPUs fare in a task like this, but I know I'd not even attempt doing something like this without Adobe After Effects, and I'd be prepared to wait a long time for rendering :P
 
Marconelly said:
Damn. Well, I don't know how PC CPUs/GPUs fare in a task like this, but I know I'd not even attempt doing something like this without Adobe After Effects, and I'd be prepared to wait a long time for rendering :P

I don't know about CPUs either, we need to ask Faf or nAo about that, but I know ATI has a custom IC solution for these things, Xilleon, which they pimp as cutting-edge. It can decode 2 streams concurrently AFAIK. And generally speaking, any IC with hardwired blocks like that will drastically outperform a CPU in such tasks.
 
Tech-heads, answer this, quick!

So this makes CELL how many times faster than say P4 3.8 GHz in this kind of task? Are there any MPEG2 benchmarks on Tom's Hardware/ Anandtech etc. for Intel and AMD systems?
 
I thought I saw at one point a longhorn demo that showed multiple movie streams on a desktop, all rotated and skewed in different ways (just to show the new GUI)..it had more than 2 on screen, though not 48 ;) Anyone else remember that? I can't remember if they were all the same stream, just replicated, or not though..
 
Um... it looks like Windows Media Player is just playing one single file, which is a file of 48 clips rendered together.
 
goodcow said:
Um... it looks like Windows Media Player is just playing one single file, which is a file of 48 clips rendered together.

I think that's just for presentation purposes..

edit - apply non-expert logic and in the aim of "making things sound great", I guess that might be the equivalent of being able to decode a 3840x3456 picture in realtime? I know my P4 chugs even with 1080i..

Of course, this is probably a simplification of the matter..
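
A quick check of that figure, as a sketch only: the 3840x3456 frame and the 48 streams come from the post, while the 8x6 grid and the resulting 480x576 tile size are inferred from the arithmetic and are not stated anywhere in the thread. Counting pixels also ignores bitrate, bitstream parsing and the per-stream overhead of running many decoders, which is part of why this is a simplification.

```python
# Figures from the post: 48 streams, compared to one 3840x3456 frame.
NUM_STREAMS = 48
BIG_WIDTH, BIG_HEIGHT = 3840, 3456

total_pixels = BIG_WIDTH * BIG_HEIGHT
pixels_per_stream = total_pixels // NUM_STREAMS

# One grid that yields exactly this frame: 8 columns x 6 rows of tiles.
# The grid shape is an assumption inferred from the numbers.
GRID_COLS, GRID_ROWS = 8, 6
tile_w, tile_h = BIG_WIDTH // GRID_COLS, BIG_HEIGHT // GRID_ROWS

print(total_pixels, pixels_per_stream, (tile_w, tile_h))
```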
 
I can do this right now on my Power Mac G5. In fact, the most convincing demo is showing someone 48 porn videos running. ;)
 
Doom_Bringer said:
I hope the Cell in PS3 has 8 SPEs! That would be so killer. Make it happen Sony


Blah. 8 SPEs is nothing compared to 32 SPEs (32 APU Broadband Engine)
or especially 64 SPEs (64 Auxiliary Processors) mentioned by another credible news website


on the other hand, even an 8 SPE Cell is beyond anything we've seen and used at home.
 
midnightguy said:
Blah. 8 SPEs is nothing compared to 32 SPEs (32 APU Broadband Engine)
or especially 64 SPEs (64 Auxiliary Processors) mentioned by another credible news website


on the other hand, even an 8 SPE Cell is beyond anything we've seen and used at home.


You want the PS3 to look like a giant toaster and cost $900??

32 sounds good though! But it won't happen.
 
neptunes said:
and I doubt the ps3 will get 8 SPE's either, so.....

True, it'll be a couple less in all likelihood. Jeesh, some people are never pleased! :)

Thanks for the info, The Faceless Master. I'm wondering if there are any benchmarks out there, google isn't being very helpful. Anyone know how "good" this is? For all we know, the whole point might not have been performance, but just to demonstrate the underlying software platform..
 
This is something that Cell will excel at - running multiple smaller apps simultaneously. Be interesting to see how that translates to games - since it's a feature much more useful in an OS. Future cell based PCs will be cool.

I can't think of an example of a game that needs to be two apps concurrently. I guess you could have like, a Battlefield 1942 split screen game where one guy was 3D shooting and another was RTS-ing in the same game...
 
Stinkles said:
I can't think of an example of a game that needs to be two apps concurrently. I guess you could have like, a Battlefield 1942 split screen game where one guy was 3D shooting and another was RTS-ing in the same game...


You're thinking about things on way too high a level. Forget about concurrency between two different programs for a second, and think about concurrency within a program. There's a lot of concurrency in games waiting to be unlocked. Just because most games currently are single-threaded (and for good reason - hardware threading is a very recent thing, and multiple cores even more recent), doesn't mean they can't be concurrent going forward. They'll have to be, at least if they want to take advantage of the performance on offer...as Dr. Dobb's said, the free lunch is over as far as computing performance is concerned.
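
To make "concurrency within a program" concrete, here is a minimal sketch in which independent per-entity updates inside one program are farmed out to a pool of workers. Everything in it (the entity representation, `update_entity`, the 0.016 s tick, the pool size) is a hypothetical stand-in and not taken from any actual game or from Cell's toolchain.

```python
from concurrent.futures import ThreadPoolExecutor

TICK_SECONDS = 0.016  # hypothetical frame time


def update_entity(entity):
    """Hypothetical per-entity update: advance position by velocity for one tick."""
    position, velocity = entity
    return (position + velocity * TICK_SECONDS, velocity)


# 1000 entities as (position, velocity) pairs.
entities = [(float(i), 1.0) for i in range(1000)]

# Each update depends only on its own entity, so the work can be split
# across workers without any coordination between them.
with ThreadPoolExecutor(max_workers=4) as pool:
    updated = list(pool.map(update_entity, entities))

print(len(updated))
```

In standard CPython, threads will not speed up CPU-bound work like this; the sketch only shows how work inside a single program can be partitioned into independent pieces.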
 
Blimblim said:
It looks like a P2 266 is the minimum requirement for a software MPEG 2 decoder : http://www.pioneeraus.com.au/computer/dvd-romdrives/HardwarevsSoftwareDVDDecoders19.html


So, that would mean a 5.3GHz system could do 20 MPEG2 streams vs. the 48 shown here? Of course, I would assume more efficient decoders have been written since then. Then again, we don't know how efficiently the 266 actually handled the MPEG2 decoding. That was a loooong time ago, and with progressive scan added to MPEG2 since then, we're still screwed as to what kind of processing power is needed.

Any newer articles out there?
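
The back-of-the-envelope scaling above can be written out as a sketch. It assumes, as the post does, that every stream costs one 266MHz Pentium II's worth of clock and that the cost scales linearly; the function name is made up for illustration, and the estimate ignores decoder efficiency, SIMD instructions and memory bandwidth.

```python
BASELINE_MHZ = 266  # Pentium II 266, the minimum cited in the linked article


def naive_clock_needed_ghz(num_streams, mhz_per_stream=BASELINE_MHZ):
    """Linear scaling: each stream is assumed to cost one baseline CPU of clock."""
    return num_streams * mhz_per_stream / 1000.0


print(naive_clock_needed_ghz(20))  # the post's 20-stream figure
print(naive_clock_needed_ghz(48))  # the 48 streams in the demo
```

By this naive measure, 20 streams need roughly 5.3GHz and the demo's 48 streams would need nearly 12.8GHz of Pentium II-class clock.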
 
sonycowboy said:
So, that would mean a 5.3GHz system could do 20 MPEG2 streams vs. the 48 shown here? Of course, I would assume more efficient decoders have been written since then. Then again, we don't know how efficiently the 266 actually handled the MPEG2 decoding. That was a loooong time ago, and with progressive scan added to MPEG2 since then, we're still screwed as to what kind of processing power is needed.

Any newer articles out there?
Don't forget that SSE instructions are also much more efficient than the old first version of MMX.
 
Blimblim said:
It looks like a P2 266 is the minimum requirement for a software MPEG 2 decoder : http://www.pioneeraus.com.au/computer/dvd-romdrives/HardwarevsSoftwareDVDDecoders19.html

I can't imagine how software decoding can be done on a P2 266MHz.
On my P3 800MHz with 256MB RAM and no MPEG2 decoding on the video board (so pure software decoding), playback is not even really fluid, and I have horrible slowdowns whenever I try to change some settings (e.g. language, chapter, etc.) with PowerDVD.
 
gofreak said:
True, it'll be a couple less in all likelihood. Jeesh, some people are never pleased! :)

Thanks for the info, The Faceless Master. I'm wondering if there are any benchmarks out there, google isn't being very helpful. Anyone know how "good" this is? For all we know, the whole point might not have been performance, but just to demonstrate the underlying software platform..

8 is a given, European dev sources say... ;) :)
 
Elios83 said:
I can't imagine how software decoding can be done on a P2 266MHz.
On my P3 800MHz with 256MB RAM and no MPEG2 decoding on the video board (so pure software decoding), playback is not even really fluid, and I have horrible slowdowns whenever I try to change some settings (e.g. language, chapter, etc.) with PowerDVD.
I could play DVDs perfectly on my work's Celeron 650 a few years ago. As long as your video card supports overlays correctly it really shouldn't be a problem. I'd say your DVD software is running in GDI mode or something.
 
Elios83 said:
I can't imagine how software decoding can be done on a P2 266MHz.
On my P3 800MHz with 256MB RAM and no MPEG2 decoding on the video board (so pure software decoding), playback is not even really fluid, and I have horrible slowdowns whenever I try to change some settings (e.g. language, chapter, etc.) with PowerDVD.

Of course your PC is also running A LOT of other crap, like a huge bloated OS.
 
I think people are forgetting some key points. I'm not completely well versed in the specifics of the Cell architecture, but from what I know, the peak rate only applies to Single Precision calculations. This aspect is similar to the Emotion Engine. The Cell does contain a Double Precision Unit, but doing calculations with the DPU comes at the price of a huge performance drop. I'm not sure if I remember correctly, but I think IBM said the result is about 1/10th of the speed of Single Precision. I'm not going to go into details about the disadvantages of doing everything in Single Precision, especially in the case of the Cell as it currently is, but the point is that the performance gain you see in Cell is definitely very restricted. So a statement saying "OMG CELL >>>>>>>>>>>> EVERYTHING ELSE" is just plain ignorant. Not to forget the fact that the demo is with an 8-SPE chip, something that's probably too expensive for the PS3.

The Cell will be good at doing what it does best, and companies like MS are obviously not going to put out something uncompetitive. From a practicality viewpoint, Xbox 2 will likely have an advantage over PS3 in terms of development at launch, since it uses an API familiar to developers. It'd be interesting to see if the PS3 gets delayed due to not enough games being available at launch. We'll see at E3.

(Added with edit)

Remember when the Emotion Engine came out and Sony was boasting about how it could do crazy calculations, comparing it to supercomputers? Look at PS2 now. Most of these things are simply marketing to make it appealing to potential clients. I'll believe it when a game developer gets his/her hands on the Cell and puts out some good-looking stuff.
 
Celeron 500Mhz is an absolute minimum for software DVD decoding from what I've seen... I'm sure the lower specced machine could decode simpler streams than 480P MPEG2.

I can do this right now on my Power Mac G5. In fact, the most convincing demo is showing someone 48 porn videos running. ;)
You can run 48 DVD VOB 480P files simultaneously and play them with no skipping?
 
Srider said:
The cell does contain a Double Precision Unit, but doing calculations with the DPU comes at the price of a huge performance drop

Anyway... give me the names of two processors delivering 27 GFLOPS at double precision, and if you can, in Cell's price range...
 
Srider said:
I think people are forgetting some key points. I'm not completely well versed in the specifics of the Cell architecture, but from what I know, the peak rate only applies to Single Precision calculations. This aspect is similar to the Emotion Engine. The Cell does contain a Double Precision Unit, but doing calculations with the DPU comes at the price of a huge performance drop. I'm not sure if I remember correctly, but I think IBM said the cost is about 1/10th of the speed of Single Precision. I'm not going to go into details about the disadvantages of doing everything in Single Precision, especially in the case of the Cell as it currently is, but the point is that the performance gain you see in Cell is definitely very restricted. So a statement saying "OMG CELL >>>>>>>>>>>> EVERYTHING ELSE" is just plain ignorant. Not to forget the fact that the demo is with an 8-SPE chip, something that's probably too expensive for the PS3.

The same is true of pretty much every chip. But single precision is pretty much all that's ever used in games. DP is really only there, in fact, to open up the door for its use in supercomputers. Have you looked at the typical double precision performance of CPUs?

I also like how you say "I'm not going to go into details about the disadvantages of doing everything in Single Precision" ;) Of course, more precision is desirable, but we live in the real world where DP processing is far too expensive to be used generally in games..

In other words, I think you're grasping at straws there.
 
Depends on the data rate. If it's low data rate then it's not that impressive.
 
Marconelly said:
Celeron 500Mhz is an absolute minimum for software DVD decoding from what I've seen... I'm sure the lower specced machine could decode simpler streams than 480P MPEG2.

I can do this right now on my Power Mac G5. In fact, the most convincing demo is showing someone 48 porn videos running. ;)

You can run 48 DVD VOB 480P files simultaneously and play them with no jerking?


fixed
 
Marconelly said:
Celeron 500Mhz is an absolute minimum for software DVD decoding from what I've seen... I'm sure the lower specced machine could decode simpler streams than 480P MPEG2.
I used to play DVDs on my POS K6-2 500 rig using PowerDVD. And I tell you, that piece of trash was slower than a Celeron.

God damn it, what a worthless computer that was =/
 
I think people are forgetting some key points. I'm not completely well versed in the specifics of the Cell architecture, but from what I know, the peak rate comes at only doing Single Precision calculations.
This is true as far as I know. However, there was a discussion about this at B3D, and from what I remember everyone said that for realtime rendering, single precision is all that is needed. They said double is mostly used for offline rendering, and even then not necessarily and not in all the software.

And I tell you, that piece of trash was slower than a Celeron.
Don't be so sure. Cel 500 is SLOOOOW. How do I know? I'm using a Celeron 633, and it's SLOOOW. (one O less)

That range of Mhz (500 or so) is about what I had in mind though.
 
koam said:
Is cell going to ever hit the pc/mac market?
I don't see that happening any time soon; maybe for specialists (artists), but almost definitely not mainstream.

On a side note, one thing no one seems to be addressing with regard to Cell is its lack* of integer processing power, as its PPE will be running the OS, which will also have to schedule work for the SPEs. Sure, a 4GHz PPE is much better than the ~300MHz and 733MHz main CPUs we have today, but is it enough for advanced AI and lots (and I mean lots) of enemies on the screen?

*well, as compared to rumoured X2 specs, even though I know full well it's not that bad, so all I can say is ............................

FLAME ON
 
I have had no problem playing dvds on my old 333mhz celeron fsb 100mhz, ram 256 megabytes on a win98se as long as I am not touching the computer.
 
One thing is for sure, these things don't scale well on PCs at least. Maybe one DVD decoding is possible on 333MHz celeron, but there's no chance I could run two VOB files on my 633Cel.

but is it enough for advanced AI and lots (and I mean lots ) of enemies on the screen.
Another thing that was discussed at B3D. That sort of stuff will in all likelihood also be done on the SPEs. Sure, AI routines are easier to write using heavy integer computations, but there is nothing that makes floating point based AI routines impossible (or so they said).
 
I just did a test with my 350 mhz P2. I ran three mpeg2 videos and one avi video in multiple instances of BSplayer. Here's what I concluded from the test:

1 video: perfect playback
2 videos: perfect playback
3 videos: The video of the movies already playing gets a bit sluggish when this video is opened, sound plays smoothly. Once this third video is playing the video is running smooth again.
4 videos: Video starts to chug a bit on the avi now, but the mpegs are still doing fine, sound still plays perfectly.

Well...I don't really know what this proves but uuh....well I'm pretty proud of my good old PC.
 
Marconelly said:
Don't be so sure. Cel 500 is SLOOOOW. How do I know? I'm using a Celeron 633, and it's SLOOOW. (one O less)
K6-2 micros were absolute shit in that regard. Windows ran fine, but lord, that thing sucked for anything other than that and old DX6 games with 3DNow instructions. And sweet Jesus, don't even get me started on floating point operations; Delta Force 1 was unplayable, unlike on my friend's silky smooth P2 333.

Crap crap crap. I was in *heaven* when my boss lent me his 733 laptop.
 
Blimblim said:
I could play DVDs perfectly on my work's Celeron 650 a few years ago. As long as your video card supports overlays correctly it really shouldn't be a problem. I'd say your DVD software is running in GDI mode or something.

No, I have extra features like CLEV enabled in PowerDVD that use CPU power to get better colors (without that the image is really dark and washed out compared to the vibrant output you can have with a dedicated hardware decoder), and playback is OK until you go through the software's menu; then slowdowns occur.
I think a P3 500MHz is really the minimum to perform software MPEG2 decoding properly, and even then it's low-quality playback.
 
I think people are forgetting some key points. I'm not completely well versed in the specifics of the Cell architecture, but from what I know, the peak rate only applies to Single Precision calculations. This aspect is similar to the Emotion Engine. The Cell does contain a Double Precision Unit, but doing calculations with the DPU comes at the price of a huge performance drop. I'm not sure if I remember correctly, but I think IBM said the result is about 1/10th of the speed of Single Precision. I'm not going to go into details about the disadvantages of doing everything in Single Precision, especially in the case of the Cell as it currently is, but the point is that the performance gain you see in Cell is definitely very restricted. So a statement saying "OMG CELL >>>>>>>>>>>> EVERYTHING ELSE" is just plain ignorant. Not to forget the fact that the demo is with an 8-SPE chip, something that's probably too expensive for the PS3.

The Cell will be good at doing what it does best, and companies like MS are obviously not going to put out something uncompetitive. From a practicality viewpoint, Xbox 2 will likely have an advantage over PS3 in terms of development at launch, since it uses an API familiar to developers. It'd be interesting to see if the PS3 gets delayed due to not enough games being available at launch. We'll see at E3.

(Added with edit)

Remember when the Emotion Engine came out and Sony was boasting about how it could do crazy calculations, comparing it to supercomputers? Look at PS2 now. Most of these things are simply marketing to make it appealing to potential clients. I'll believe it when a game developer gets his/her hands on the Cell and puts out some good-looking stuff.

So your points are (as I interpret them):

A)Cell has "restricted" performance because of its Double Precision performance (as in the lack thereof)

B) People who say Cell is more powerful than the competition are ignorant

C) 8 SPEs won't be used in PS3

D) Microsoft is "obviously" going to put out something competitive with CELL (will this MS chip include a lack of Double-precision performance too??)

E) PS3 is "impractical" development-wise because coders won't be familiar with its APIs (Gee, I can't wait until developers get a handle on something so exotic as Open GL!!!)

F) It will be "interesting" to see if the PS3 gets delayed (never mind the fact that no launch date has been announced) due to A LACK OF GAMES??? :lol :lol :lol (I'm not even going to go there on that one)

G) You will believe CELL will put out good looking stuff when you see it (which can be applied to all three next-gen systems as we have seen nothing from any one of them, let alone enough stuff to COMPARE them?)

Are all these "key points" correct? :)
 