VGleaks: Orbis Unveiled! [Updated]

Just a friendly reminder: GCN 2 has very minor differences from the original GCN. It offers virtually no savings in heat, and the architectural improvements are minimal at best. What GCN 2 will do, however, is cause headaches, because it hasn't hit mass production.
Wrong ..
 
Alright, I was just guessing, since the rebranding came out of left field (to me).
Anandtech said:
As we discussed yesterday with AMD’s latest round of GPU rebadges, both AMD and NVIDIA are locked into playing the OEM rebadge game in order to fulfill their OEM partner’s calendar driven schedules. OEMs want to do yearly updates (regardless of where the technical product cycle really is), so when the calendar doesn’t line up with the technology this is achieved through rebadges of existing products. In turn these OEMs put pressure on component suppliers to rebadge too, so that when consumers compare the specs of this year’s “new” model to last year’s model the former look newer. The end result is that both AMD and NVIDIA need to play this game or find themselves locked out of the OEM market.
http://www.anandtech.com/show/6579/...730m-and-geforce-710m-partial-specs-published
 
Ahh, I see. Then maybe they will use Sea Islands GPUs in the final dev kits. That makes it more unlikely, though, if the SoCs have Southern Islands parts.
 
I'm a bit annoyed that, so far, it looks like neither Microsoft nor Sony have gone with a GCN 2 GPU. I think that's the least they could have done, tbh. I mean, I appreciate the 4GB of GDDR5 Sony put in, but come on...

Yeah, I am surprised too. I really thought it would be beyond Southern Islands. When rumors dropped of 8GB for the Xbox, I knew Sony would realize they'd be dead meat with only 2GB and would go for 4GB. I think what surprised a lot of people was the DDR3. Someone mentioned it last summer, but it seemed really implausible at the time because of bandwidth.
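For scale, here's the gap those rumors implied. A minimal sketch of the usual width × rate arithmetic; the 256-bit buses and data rates below are the rumored configs, not confirmed specs:

Code:
#include <stdio.h>

/* Bandwidth in GB/s = (bus width in bits / 8) * per-pin data rate in Gbps */
static double bw_gbs(double bus_bits, double gbps_per_pin) {
    return bus_bits / 8.0 * gbps_per_pin;
}

int main(void) {
    /* Rumored (assumed) configs: Durango 256-bit DDR3-2133, Orbis 256-bit GDDR5 at 5.5 Gbps */
    printf("Durango DDR3: %.1f GB/s\n", bw_gbs(256.0, 2.133)); /* ~68.3 */
    printf("Orbis GDDR5:  %.1f GB/s\n", bw_gbs(256.0, 5.5));   /* 176.0 */
    return 0;
}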
 
I thought the CPUs of the systems were supposed to be fairly comparable?

But aegies is saying that Durango's CPU is substantially different.

Is it possible that Durango has a modified CPU that is much beefier than Orbis' and accounts for some added GPGPU performance that is currently allocated to the 1.84 TF number for Orbis?

In other words, Durango's CPU is 400 GFLOPs more powerful, but that's offset by Orbis' 400 GFLOPs allocated to GPGPU functions... ultimately resulting in systems that are comparable in power?

This is what some people are currently speculating about on B3D... is it realistic?
 
There's got to be some difference, or they wouldn't have released it at all, and wouldn't have called it GCN 2. What is it supposed to do differently? Any new features? I'm 100% sure it's more efficient to some degree; 15%?

I think it's even more HSA-adapted. It might not have been as big of a deal in a closed console with an APU. Just a hunch. But as others have said, unless it was going to be a huge leap, I think they wanted to stay conservative with something they knew wouldn't give them potential problems with a delay. I guess that won't happen, but they had to plan just in case.
 
Yep. Exclusives play to the strengths while carefully hiding the weaknesses. It's not a 100% gauge of true console power.

I think it's a pretty big part of the console's power, actually, given that exclusives are a direct part of the console. Hardware is important, but I'd argue that it's even more important to the console that there is a stable of dedicated developers who know how to work well within the limitations of the system (or, as you say, play to the strengths while hiding the weaknesses). This generation it really made the difference for the PS3; hopefully MS can compete next generation.
 
I thought the CPUs of the systems were supposed to be fairly comparable?

But aegies is saying that Durango's CPU is substantially different.

Is it possible that Durango has a modified CPU that is much beefier than Orbis' and accounts for some added GPGPU performance that is currently allocated to the 1.84 TF number for Orbis?

In other words, Durango's CPU is 400 GFLOPs more powerful, but that's offset by Orbis' 400 GFLOPs allocated to GPGPU functions... ultimately resulting in systems that are comparable in power?

This is what some people are currently speculating about on B3D... is it realistic?


A CPU with GPGPU capabilities? Would it not make more sense to have a GPU with that ability? I guess we had Cell this gen, but that was a $400M investment on Sony and partners' part.

When I was doing a rough guesstimate benchmark for the 8-core Jaguar, based on expected performance levels, I found that there were Xeons very similar to the 8-core Jaguar's level, which is - performance-wise - similar to the A10 hardware that Sony went with.
The extra Xeon was probably DSP-related or, as aegies said, trying to partially mimic audio processing, etc...
 
I thought the CPUs of the systems were supposed to be fairly comparable?

But aegies is saying that Durango's CPU is substantially different.

Is it possible that Durango has a modified CPU that is much beefier than Orbis' and accounts for some added GPGPU performance that is currently allocated to the 1.84 TF number for Orbis?

In other words, Durango's CPU is 400 GFLOPs more powerful, but that's offset by Orbis' 400 GFLOPs allocated to GPGPU functions... ultimately resulting in systems that are comparable in power?

This is what some people are currently speculating about on B3D... is it realistic?

I don't think it's realistic for Durango to have a 400 GFLOPs CPU based on Jaguar.
That is crazy talk, but you never know.
 
I thought the CPUs of the systems were supposed to be fairly comparable?

But aegies is saying that Durango's CPU is substantially different.

Is it possible that Durango has a modified CPU that is much beefier than Orbis' and accounts for some added GPGPU performance that is currently allocated to the 1.84 TF number for Orbis?

In other words, Durango's CPU is 400 GFLOPs more powerful, but that's offset by Orbis' 400 GFLOPs allocated to GPGPU functions... ultimately resulting in systems that are comparable in power?

This is what some people are currently speculating about on B3D... is it realistic?



Hmmm... I wonder if that's the reason why vgleaks specifically calls the Orbis CPU Jaguar, whereas they make no such distinction for Durango. Also (not sure if I'm reading it wrong), for Orbis they say the two CPU clusters have a "shared 2MB L2 cache", whereas for Durango they say it has a "total of 4 MB of L2 cache".
 
I can see some kind of special vector helpers, like Xenon's. I don't have any clue how many GFLOPs they could be. I do know the 8-core Jaguar itself is supposed to be a little over 100. So it seems really weird to have a CPU co-processor with 4x the power of the main one. Very Cell-like. Some early insiders did say Durango was weird.
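For reference, that "little over 100" figure falls straight out of the peak-FLOPS formula. A minimal sketch, assuming the rumored 1.6 GHz clock and Jaguar's 128-bit FPU issuing one 4-wide SP add and one 4-wide SP multiply per cycle:

Code:
#include <stdio.h>

int main(void) {
    const double cores = 8.0;            /* rumored 8-core Jaguar */
    const double clock_ghz = 1.6;        /* assumed clock, not confirmed */
    const double flops_per_cycle = 8.0;  /* 4-wide SP ADD + 4-wide SP MUL per core per cycle */

    printf("Peak: %.1f GFLOPS\n", cores * clock_ghz * flops_per_cycle); /* 102.4 */
    return 0;
}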
 
As much as I would love for backwards compatibility to be handled by hardware/software, I suspect it will be handled by GAIKAI.
 
I thought the CPUs of the systems were supposed to be fairly comparable?

But aegies is saying that Durango's CPU is substantially different.

Is it possible that Durango has a modified CPU that is much beefier than Orbis' and accounts for some added GPGPU performance that is currently allocated to the 1.84 TF number for Orbis?

In other words, Durango's CPU is 400 GFLOPs more powerful, but that's offset by Orbis' 400 GFLOPs allocated to GPGPU functions... ultimately resulting in systems that are comparable in power?

This is what some people are currently speculating about on B3D... is it realistic?

400 GFLOPs for a CPU?? A Jaguar-based CPU? That's even crazier. I doubt that Durango's CPU would be 4x as powerful as Orbis'. That'd be nuts.
 
Hmmm... I wonder if that's the reason why vgleaks specifically calls the Orbis CPU Jaguar, whereas they make no such distinction for Durango. Also (not sure if I'm reading it wrong), for Orbis they say the two CPU clusters have a "shared 2MB L2 cache", whereas for Durango they say it has a "total of 4 MB of L2 cache".

They'll both have 4MB of L2 cache. That's basically two quad-core Jaguars put together. I think you are looking too deep if you are taking apart sentences that essentially mean the same thing.
 
Hmmm... I wonder if that's the reason why vgleaks specifically calls the Orbis CPU Jaguar, whereas they make no such distinction for Durango. Also (not sure if I'm reading it wrong), for Orbis they say the two CPU clusters have a "shared 2MB L2 cache", whereas for Durango they say it has a "total of 4 MB of L2 cache".

No, I think Orbis has a total of 4MB of L2 cache as well. They say there are two clusters. Then on the next line below, they say that EACH cluster contains 4 cores and a shared 2MB L2 cache.

Once again, just like the extra ALU on each of the 4 CUs, I'm pretty sure that's the correct way to read it.
 
I thought the CPUs of the systems were supposed to be fairly comparable?

But aegies is saying that Durango's CPU is substantially different.

Is it possible that Durango has a modified CPU that is much beefier than Orbis' and accounts for some added GPGPU performance that is currently allocated to the 1.84 TF number for Orbis?

In other words, Durango's CPU is 400 GFLOPs more powerful, but that's offset by Orbis' 400 GFLOPs allocated to GPGPU functions... ultimately resulting in systems that are comparable in power?

This is what some people are currently speculating about on B3D... is it realistic?


For that, Durango's CPU would also need to match those 4 Orbis CUs in power. From the rumors right now, both CPUs are the same, and both around 100 GFLOPs.
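For context on the numbers being thrown around: the 1.84 TF and ~400 GFLOPs figures both follow from the rumored 18-CU GCN part at 800 MHz (64 ALUs per CU, 2 FLOPs per ALU per cycle via FMA). A quick check under those assumed figures:

Code:
#include <stdio.h>

int main(void) {
    const double clock_ghz = 0.8;            /* rumored 800 MHz */
    const double flops_per_cu = 64.0 * 2.0;  /* 64 ALUs x FMA (2 FLOPs) per cycle */

    printf("18 CUs: %.2f TFLOPS\n", 18.0 * flops_per_cu * clock_ghz / 1000.0); /* 1.84 */
    printf("4 CUs:  %.1f GFLOPS\n",  4.0 * flops_per_cu * clock_ghz);          /* 409.6 */
    return 0;
}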
 
No, I think Orbis has a total of 4MB of L2 cache as well. They say there are two clusters. Then on the next line below, they say that EACH cluster contains 4 cores and a shared 2MB L2 cache.

Once again, just like the extra ALU on each of the 4 CUs, I'm pretty sure that's the correct way to read it.


Ok, makes sense. That's why I said maybe I was reading it wrong. Lol.
 
I thought the CPUs of the systems were supposed to be fairly comparable?

But aegies is saying that Durango's CPU is substantially different.

Is it possible that Durango has a modified CPU that is much beefier than Orbis' and accounts for some added GPGPU performance that is currently allocated to the 1.84 TF number for Orbis?

In other words, Durango's CPU is 400 GFLOPs more powerful, but that's offset by Orbis' 400 GFLOPs allocated to GPGPU functions... ultimately resulting in systems that are comparable in power?

This is what some people are currently speculating about on B3D... is it realistic?

It'd need 40 Jaguar cores total to have an extra 400 GFlops over the 8-core model. A modified CPU that has GPGPU performance... you think MS is putting a couple of Cells in the Durango?
 
It'd need 40 Jaguar cores total to have an extra 400 GFlops over the 8-core model. A modified CPU that has GPGPU performance... you think MS is putting a couple of Cells in the Durango?

It can. If they have a 256-bit wide AVX2 unit per core, the Durango CPU would sport 410 GFLOPs. In case you have forgotten, MS did the same thing with the Xenon CPU on the 360, where they increased the VMX128 count.
 
And that wouldn't change its targeted power limits?
What scently suggests wouldn't change it much, I suspect.
It can. If they have a 256-bit wide AVX2 unit per core, the Durango CPU would sport 410 GFLOPs. In case you have forgotten, MS did the same thing with the Xenon CPU on the 360, where they increased the VMX128 count.
Would they really get quadruple the theoretical FLOPs doing that? And if they did, would they see much of a real-world performance boost? It'd be a worthwhile upgrade to the Jags, as the double-pumped 128-bit AVX2 unit seems odd for a gaming console, but I don't think it'd match having an extra four CUs on the GPU.
 
What scently suggests wouldn't change it much, I suspect.

Would they really get quadruple the theoretical FLOPs doing that? And if they did, would they see much of a real-world performance boost? It'd be a worthwhile upgrade to the Jags, as the double-pumped 128-bit AVX2 unit seems odd for a gaming console, but I don't think it'd match having an extra four CUs on the GPU.

It will get the performance, and it would actually be better to have that amount of FLOPs on the CPU, as compute jobs, even with the improvements for running them on GPUs, are still predominantly a CPU-side duty.

Mind you, I am not saying that they are adding it to their CPU, but as for whether it is possible for them to do it? Yes it is, and it would be more efficient at it. That is why you hear the suggestion that the 4 CUs that seem to have been reserved for compute jobs would perform better if the PS4 CPU could access them directly.
 
Can't you at least show proof of what you stated, or is there none?
http://www.simmtester.com/page/news/images/image33.gif
DDR1 vs. RAMBUS: a 10% difference. With DDR2, the difference will shrink to next to nothing.


No shit Sherlock!
But which parts will be removed and how many transistors do you think 8 Jaguar cores require?



Ok, take a look at the top XDRAM specs on this page
http://www.elpida.com/en/products/xdr.html

and then post a link to some comparable DDR3 specs.

If you can't do that, please write a comprehensive apology to this thread for spreading dumb shit.
No shit? Jaguar cores are 3mm^2 at 28nm, which means they are like 24M transistors each. The 4MB cache is 24M.

Please find a relevant benchmark of XDR for performance reference. If you can't, please print out your post and eat it. GDDR5 is a DDR3 derivative. Google GDDR5 vs XDR.


XDR was faster than GDDR3...
XDR was about the same speed as GDDR5. But memory speed on a CPU hasn't really given any performance gain since DDR2.
[attached image: Quake 4 memory-scaling benchmark]
 
Quick question for aegies: What kind of PC hardware (CPU/GPU) would be needed in order to run next-gen games on PC at a similar level of quality/resolution? No need to be specific, just a ballpark figure would be fine. Also fine if you don't have that information or you don't want to reveal it.
 
Code:
Display ScanOut Engine (DCE)


With Sony's love for 3D, and knowing 3D is going to be a big part of PS4, I'm wondering: could this chip be for more than just upscaling? Like, could it be used to convert all games into 3D, even a 30 FPS game, by up-converting the FPS or by splitting the images up in a way that the TV will see as 3D?
 
It can. If they have a 256-bit wide AVX2 unit per core, the Durango CPU would sport 410 GFLOPs. In case you have forgotten, MS did the same thing with the Xenon CPU on the 360, where they increased the VMX128 count.

Wouldn't 256-bit AVX2 only double what a standard Jaguar can do? Aren't they already 128-bit AVX units? So double that and you'd get up to 205 GFlops.
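Both figures are internally consistent; they just assume different pipes per core. A sketch of the three scenarios at the rumored 1.6 GHz (the widened configurations are purely hypothetical):

Code:
#include <stdio.h>

/* Peak SP GFLOPS for 8 cores at an assumed 1.6 GHz, given FLOPs per core per cycle */
static double peak_gflops(double flops_per_cycle) {
    return 8.0 * 1.6 * flops_per_cycle;
}

int main(void) {
    printf("Stock Jaguar (128-bit ADD + MUL): %.1f\n", peak_gflops(8.0));  /* 102.4 */
    printf("256-bit ADD + MUL, no FMA:        %.1f\n", peak_gflops(16.0)); /* 204.8 */
    printf("Dual 256-bit FMA pipes:           %.1f\n", peak_gflops(32.0)); /* 409.6 */
    return 0;
}

In other words, widening to 256-bit alone only gets you the ~205 figure; the 410 number additionally assumes FMA-capable pipes.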
 
I asked this before, but it got lost in the midst of another debate: would the +4 CUs be able to be assigned to different tasks separately, or are they all set to one use? Say a game wanted to dedicate 16 CUs to rendering and 2 CUs to physics processing; would that be possible? I can only see first-party games getting into things like that, but it seems like it would really add a lot of flexibility to the machine.

They aren't in the normal graphics pipeline, and you have to program them directly. But you can use them for what you like. E.g. you could run physics for half the time (or on 2 CUs), and use the rest of the time to do graphics-related stuff like pre-sorting or post-processing. Not directly pushing polys, but impacting graphics anyway.

Not sure if the CUs are directly accessible (potentially something devs might want) or you just give them code and they divvy it up between them. Shouldn't matter too much - developers would arrange their code to take the correct amount of time to accomplish the same result.
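To put rough numbers on the time-slice vs. dedicated-CU options: with 4 compute-reserved CUs at ~102.4 GFLOPs each (the rumored 800 MHz figure, assumed here), the two strategies buy the same theoretical physics throughput. A toy comparison under those assumptions:

Code:
#include <stdio.h>

int main(void) {
    const double cu_gflops = 102.4;  /* assumed: one GCN CU at the rumored 800 MHz */

    /* Strategy A: all 4 compute CUs run physics for half of each frame */
    double time_sliced = 4.0 * cu_gflops * 0.5;
    /* Strategy B: 2 CUs run physics full-time, the other 2 do graphics-side work */
    double partitioned = 2.0 * cu_gflops;

    printf("Time-sliced physics budget: %.1f GFLOPS\n", time_sliced); /* 204.8 */
    printf("Partitioned physics budget: %.1f GFLOPS\n", partitioned); /* 204.8 */
    return 0;
}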
 
360's GPU could do tessellation??? Since when?

Errr... since the console was created. It has a Geometry Tessellation Unit, though it is not as powerful or as flexible as the ones found in AMD's newer GPUs. Several titles actually use it, but not extensively, e.g. Viva Pinata, Alone in the Dark, and others.
 
Please find a relevant benchmark of XDR for performance reference. If you can't, please print out your post and eat it. GDDR5 is a DDR3 derivative. Google GDDR5 vs XDR.
The only benchmark relevant to this discussion would be how XDR performed in the PS3 vs. the DDR of the time. I'm guessing it's practically impossible that they didn't test various memory types on PS3 kits, and whatever advantage XDR gave them, big or small (remember, the GPU needed indirect access to it as well as the CPU), was probably deemed worth the price hike.
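For what it's worth, the published PS3 figures put the two pools in the same ballpark; a quick check of the commonly cited numbers (quoted from memory, so treat them as approximate):

Code:
#include <stdio.h>

/* Bandwidth in GB/s = (bus width in bits / 8) * per-pin data rate in Gbps */
static double bw_gbs(double bus_bits, double gbps_per_pin) {
    return bus_bits / 8.0 * gbps_per_pin;
}

int main(void) {
    printf("PS3 XDR (64-bit @ 3.2 Gbps):    %.1f GB/s\n", bw_gbs(64.0, 3.2));  /* 25.6 */
    printf("PS3 GDDR3 (128-bit @ 1.4 Gbps): %.1f GB/s\n", bw_gbs(128.0, 1.4)); /* 22.4 */
    return 0;
}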
 
I assumed that was a typo. You wouldn't have alluded to an early predecessor of Xenos without knowing it wouldn't have evolved from R300.

Nope, it was an honest mistake. I know it derived from the R400, but for some reason I thought it was the R300 when making that post. These late nights are killing me. =P
 
There really wasn't anything wrong with the RSX. It was based on NVIDIA's mainline architecture at the time, though downgraded.

I just want to comment on this - IMHO, everything that is shitty about the PS3 architecture you can pretty much blame on the RSX.
 
360's GPU could do tessellation??? Since when?

From the very beginning, same with unified shaders. All of AMD's R600 cards are based off the designs that AMD and MS collaborated on for the 360. MemExport is also the foundation for compute shaders. Both tessellation and compute shaders were supposed to be in DX10, but were pushed back to DX11.

IIRC, on paper the RSX had higher theoretical output (fill rates, flops, etc.) and was the beefier hardware, but the Xenos had a better instruction set and feature list.

I just want to comment on this - IMHO, everything that is shitty about the PS3 architecture you can pretty much blame on the RSX.
And the RAM...
 
They aren't in the normal graphics pipeline, and you have to program them directly. But you can use them for what you like. E.g. you could run physics for half the time (or on 2 CUs), and use the rest of the time to do graphics-related stuff like pre-sorting or post-processing. Not directly pushing polys, but impacting graphics anyway.

Not sure if the CUs are directly accessible (potentially something devs might want) or you just give them code and they divvy it up between them. Shouldn't matter too much - developers would arrange their code to take the correct amount of time to accomplish the same result.
Cool. That's what I thought.
 
Well, it wouldn't make sense to have CUs that are reserved for physics doing texturing. They might have texturing hardware, but they won't be doing any texturing, which makes it kind of a number on paper that doesn't mean anything in real life.

Those extra CUs might also be used for rendering, so they need to have the same feature set. Also, mapping vectors and matrices to textures buys you quite a bit of compute performance; I've seen it first-hand years ago with CUDA kernels.

It does make sense if SCE asked for the HW scheduler in those extra CUs, and the branch predictor, amongst other bits, to be optimized more for compute tasks than graphics processing, so as to offer more help to the CPU.
 
Googling, from Softpedia:

Jaguar now comes with most – if not all – of Bobcat's abilities, but it also adds complete architectural computational support for SSE4.1, SSE4.2, AES, CLMUL, MOVBE, AVX, XSAVE, XSAVEOPT, F16C, and BMI instructions.

Since most SIMD instructions are 128-bit wide – if not wider – AMD decided to double the width of its FPU pipelines from 64-bit to 128-bit.

SSE instructions required two passes on Bobcat, but they can now be handled in one single pass, and this is no small feat.

Ironically for AMD, Jaguar comes with support for AVX, and these would be best fit by 256-bit wide FPU pipelines, but this is not the main concern for a mobile-oriented architecture.
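To make the one-pass vs. two-pass point concrete: a single 128-bit SSE instruction like the add below works on four packed floats at once. On Bobcat's 64-bit FPU pipes it was cracked into two halves, while Jaguar's 128-bit pipes retire it in one pass; the width change is microarchitectural, so the code itself is identical on both. A minimal, self-contained example (any x86 compiler with SSE):

Code:
#include <stdio.h>
#include <xmmintrin.h>  /* SSE: 128-bit packed single-precision */

int main(void) {
    __m128 a = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);
    __m128 b = _mm_set_ps(40.0f, 30.0f, 20.0f, 10.0f);
    __m128 c = _mm_add_ps(a, b);  /* one 128-bit instruction: four float adds */

    float out[4];
    _mm_storeu_ps(out, c);
    printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]); /* 11 22 33 44 */
    return 0;
}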
 
IIRC, on paper the RSX had higher theoretical output (fill rates, flops, etc.) and was the beefier hardware, but the Xenos had a better instruction set and feature list.

On paper, I think Xenos actually just edged out RSX in flops. The main advantage was unified shaders, so the developers could decide how to allocate the pixel and vertex loads. PS3 developers were stuck with a predefined arrangement, so Xenos could shift most of its pipes to vertex loads and blow the doors off RSX's max of 8 vertex pipelines. Xbox 360 developers had a scheduler that would dynamically allocate the shader workload because there was no distinction between pixel and vertex units.
 
On paper, I think Xenos actually just edged out RSX in flops. The main advantage was unified shaders, so the developers could decide how to allocate the pixel and vertex loads. PS3 developers were stuck with a predefined arrangement, so on paper, Xenos could shift most of its pipes to vertex loads and blow the doors off RSX's max of 8 vertex pipelines.

Well...IMHO this article hits on the main issues pretty well.

http://www.theinquirer.net/inquirer/news/1007286/ps3-hardware-slow-broken

Of course, workarounds were found, and I think they are well known to just about every PS3 developer worth their salt.
 
Googling, from Softpedia:

Jaguar now comes with most – if not all – of Bobcat's abilities, but it also adds complete architectural computational support for SSE4.1, SSE4.2, AES, CLMUL, MOVBE, AVX, XSAVE, XSAVEOPT, F16C, and BMI instructions.

Since most SIMD instructions are 128-bit wide – if not wider – AMD decided to double the width of its FPU pipelines from 64-bit to 128-bit.

SSE instructions required two passes on Bobcat, but they can now be handled in one single pass, and this is no small feat.

Ironically for AMD, Jaguar comes with support for AVX, and these would be best fit by 256-bit wide FPU pipelines, but this is not the main concern for a mobile-oriented architecture.


This is a big change. Any official confirmation from AMD?
 