VGLeaks rumor: Durango CPU Overview

So you want to play the "official announcement" game. If so, then all this discussion is utterly pointless and a waste of forum space until that rumoured time in April when XB3 is finally unveiled.

Now, going by the rumour, while peak theoretical figures may be meaningless in a vacuum, they are not meaningless when comparing two GPUs that share the same design philosophy to state which one is better.

1.84TF > rumoured 1.23TF. I really do not know how much simpler it can get than that. If you are trying to claim that there would not be any discernible real-world performance difference, then one only needs to look at the real-world performance differences between any two PC graphics cards based on the same architecture to see how untrue that is (7770 vs 7750, 7870 vs 7850, 7970 vs 7950, etc.).

What this rumoured ~600GFlops difference would manifest as in a console environment, where unlike on PC the framerate is generally locked at either 60 or 30fps, remains to be seen.
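For reference, both rumoured TF figures fall straight out of the standard GCN throughput math. A quick sketch (Python), assuming the leaked 18 vs 12 CU counts and an 800MHz clock:

```python
# GCN compute unit = 64 shader lanes, each capable of 1 FMA (2 flops) per clock.
# CU counts and the 800MHz clock are the leaked/rumoured figures, not confirmed specs.
def gcn_gflops(cus, clock_mhz, lanes=64, flops_per_lane=2):
    return cus * lanes * flops_per_lane * clock_mhz / 1000.0

ps4 = gcn_gflops(18, 800)       # ~1843 GFLOPS (the 1.84TF figure)
durango = gcn_gflops(12, 800)   # ~1229 GFLOPS (the 1.23TF figure)
print(ps4, durango, ps4 - durango)  # gap of roughly 614 GFLOPS, i.e. the "~600GFlops difference"
```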

This. Just do the math: they would only be equal in effective flops if the PS4 had an efficiency of 67% and Durango an efficiency of 100%, which would be otherworldly. Those efficiency numbers were maybe true for the 360, but we are 2~3 GPU architectures further on, where 85%+ (if I'm not mistaken) is more the norm. Just to put it in perspective, that gap would be something like 2~3 times the 360's entire performance.
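To make the break-even arithmetic explicit (a sketch using only the rumoured peak numbers):

```python
ps4_peak, durango_peak = 1843.2, 1228.8   # rumoured peak GFLOPS
# Effective throughput = peak * utilisation. The two only match if:
#   ps4_peak * ps4_eff == durango_peak * durango_eff
# Even granting Durango an impossible 100% utilisation, PS4 would need to drop to:
print(durango_peak / ps4_peak)  # ~0.667, i.e. the 67% figure above
```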
 
This. Just do the math: they would only be equal in effective flops if the PS4 had an efficiency of 67% and Durango an efficiency of 100%, which would be otherworldly.

Also brought up on, gasp, B3D: it might cause a heat problem too. The analogy was how Furmark makes GPUs run hot.

But I guess the behavior of GPUs running Furmark does kind of prove to me they're not normally running at full efficiency either.

If somehow the Durango GPU really were fully utilized, it wouldn't be free; heat and power draw would increase accordingly.

I guess the upside would be, less silicon for the same performance.

Except presumably the ESRAM+12 CU's~18 CU's.

So then you're back to the advantage being the ability to run with cheap DDR3. Which is quite a large cost advantage, to be sure.

But then, if you use the eSRAM as a cache rather than a framebuffer, I'd think you'd run into bandwidth problems.

Long story short, I don't know, lol.
 
So you want to play the "official announcement" game. If so, then all this discussion is utterly pointless and a waste of forum space until that rumoured time in April when XB3 is finally unveiled.

No, there's nothing wrong with discussing rumors. But when people who don't have the full picture start using the fuzzy logic that "rumor = fact", then there's something wrong with that thinking.

What I'm seeing here is that some people want to speculate on the rumors while others come into this and other threads constantly to say "...you're wrong, this is a fact", and THAT makes the whole discussion pointless. Do you see my point?
 
No, there's nothing wrong with discussing rumors. But when people who don't have the full picture start using the fuzzy logic that "rumor = fact", then there's something wrong with that thinking.

What I'm seeing here is that some people want to speculate on the rumors while others come into this and other threads constantly to say "...you're wrong, this is a fact", and THAT makes the whole discussion pointless. Do you see my point?
Your statement about FLOPS being a useless metric was still inaccurate, was it not?

If the rumors eventuate, then given the similarities between the systems FLOPS would be a useful metric for comparing the two systems.

I don't see why, however, in a rumor and speculation thread everyone should have to preface their posts with "If the rumors eventuate."
 
No, there's nothing wrong with discussing rumors. But when people who don't have the full picture start using the fuzzy logic that "rumor = fact", then there's something wrong with that thinking.

What I'm seeing here is that some people want to speculate on the rumors while others come into this and other threads constantly to say "...you're wrong, this is a fact", and THAT makes the whole discussion pointless. Do you see my point?

Stuff ripped directly from internal MS documents is fact.

But it's not like whatever we're reading currently is set in stone.
 
Yet again, arguing about which architecture is more powerful by simply relying on a metric like "floating point operations per second" when you REALLY don't know just how "identical" they are? MS hasn't announced anything yet, correct me if I'm wrong?
No they have not. But do you really expect MS not to have a GPU based on the GCN architecture at this point? Honestly, I think it makes most sense to operate on the assumption that both consoles are very similar in terms of GPU architecture.
 
Interesting, so pretty much we don't know. :D I have a feeling, though, that if it made any huge difference we would be seeing eSRAM on desktop cards.

Nvidia uses bits of SRAM in their GPU designs to get lower-latency caches than AMD, which is why their GPUs can more than hold their own against AMD GPUs with much higher flops ratings...

There's an Nvidia presentation about their architecture roadmap where they say that to improve overall compute performance, it's more important to reduce fetch operations, because those take more time than the computations done on the data. Especially on heterogeneous architectures...

That was more about general compute performance than graphics, though, so I dunno whether it will actually bring that much of an advantage in that scenario too.

This is literally wrong. Particularly when comparing two almost identical GPU architectures, GFLOPs tell you a whole lot about their relative capabilities. Not everything of course, but also not "zip", not by a long shot.

Indeed.

On desktop GPUs that's definitely true, but they usually scale everything along with the flops count, so we can't say for sure how much of the performance gain is due to the increased floating-point performance alone.

This. Just do the math: they would only be equal in effective flops if the PS4 had an efficiency of 67% and Durango an efficiency of 100%, which would be otherworldly. Those efficiency numbers were maybe true for the 360, but we are 2~3 GPU architectures further on, where 85%+ (if I'm not mistaken) is more the norm. Just to put it in perspective, that gap would be something like 2~3 times the 360's entire performance.

It seems to me that the efficiency claims compared to the 360 were about GCN's standard improved utilization. Each of the 360's unified shaders could perform an operation on a 4-float vector plus a scalar, but hardly any shader used all of that at once, so you ended up with wasted silicon. GCN reduces the execution units per shader but puts in more of them, so instructions fully utilize each shader's execution units and performance increases... That's a standard GCN feature that both consoles have.

But other than that, there are cases where you lose performance due to stalls, be it a cache miss, or the GPU waiting for the CPU, or anything like that... You could still have a 100% efficient GPU and be held back by something else. I would guess those are the problems MS is trying to solve with eSRAM on Durango...
 
I think what most are saying is "Microchaft, don't give us a second rate machine where it counts."

So lets hope that they know what they're doing.

I'm looking forward to IllumiRoom and a faster Kinect. But let's not skimp on fps, rez, and blockbuster in-game effects, shall we.
 
It seems to me that the efficiency claims compared to the 360 were about GCN's standard improved utilization. Each of the 360's unified shaders could perform an operation on a 4-float vector plus a scalar, but hardly any shader used all of that at once, so you ended up with wasted silicon. GCN reduces the execution units per shader but puts in more of them, so instructions fully utilize each shader's execution units and performance increases... That's a standard GCN feature that both consoles have.

But other than that, there are cases where you lose performance due to stalls, be it a cache miss, or the GPU waiting for the CPU, or anything like that... You could still have a 100% efficient GPU and be held back by something else. I would guess those are the problems MS is trying to solve with eSRAM on Durango...

Effective efficiency is what I meant, hence the bolded part.
Microsoft is solving it with eSRAM, and if I'm not mistaken Sony is solving it by using more ACEs (?) to schedule and switch between more operations to mask memory latency better. Since the PS4 event I haven't been following the latest rumors and info spills regarding the PS4, because the PS4 barely reaches my minimum expectation of next gen. And so far the Durango rumors are below my expectation.
 
Hey guys, don't complain to me about "this being a rumor thread." I wasn't arguing about rumors with kidbeta, I was arguing about something that really was factually wrong with what he claimed he knew. Nobody needs to preface their rumor or speculation with anything, but they shouldn't go around throwing words like "fact" out. We don't have all the facts, not even close.

I'll keep commenting though, just so those crazy rumours don't get too rampant.

If anyone is in here to stifle people talking about rumors, or speculating, it's kidbeta not me.
 
On desktop GPUs that's definitely true, but they usually scale everything along with the flops count, so we can't say for sure how much of the performance gain is due to the increased floating-point performance alone.
Well, going by the rumours, PS4 also has a significantly higher ROP count, more TMUs and more (external) memory bandwidth. So it's not just a FLOPs difference, it's pretty much everything.
 
Effective efficiency is what I meant, hence the bolded part.
Microsoft is solving it with eSRAM, and if I'm not mistaken Sony is solving it by using more ACEs (?) to schedule and switch between more operations to mask memory latency better. Since the PS4 event I haven't been following the latest rumors and info spills regarding the PS4, because the PS4 barely reaches my minimum expectation of next gen. And so far the Durango rumors are below my expectation.

But the industry seems to be winking and implying its super sweet? Who knows.

I think the massive amounts of RAM in these systems will make things quite interesting. PC games are not built to these high benchmarks; they are scaled up from something quite pitiful.

Animation, diversity of environment, scope of environment, and variety in characters should all become somewhat higher grade than what we're used to so far.

This combined with some new techniques for dynamic resolution ensuring quality of frame rate and I doubt most will be disappointed, if not even raise a few eyebrows.
 
Nvidia uses bits of SRAM in their GPU designs to get lower-latency caches than AMD, which is why their GPUs can more than hold their own against AMD GPUs with much higher flops ratings...

There's an Nvidia presentation about their architecture roadmap where they say that to improve overall compute performance, it's more important to reduce fetch operations, because those take more time than the computations done on the data. Especially on heterogeneous architectures...

That was more about general compute performance than graphics, though, so I dunno whether it will actually bring that much of an advantage in that scenario too.



On desktop GPUs that's definitely true, but they usually scale everything along with the flops count, so we can't say for sure how much of the performance gain is due to the increased floating-point performance alone.



It seems to me that the efficiency claims compared to the 360 were about GCN's standard improved utilization. Each of the 360's unified shaders could perform an operation on a 4-float vector plus a scalar, but hardly any shader used all of that at once, so you ended up with wasted silicon. GCN reduces the execution units per shader but puts in more of them, so instructions fully utilize each shader's execution units and performance increases... That's a standard GCN feature that both consoles have.

But other than that, there are cases where you lose performance due to stalls, be it a cache miss, or the GPU waiting for the CPU, or anything like that... You could still have a 100% efficient GPU and be held back by something else. I would guess those are the problems MS is trying to solve with eSRAM on Durango...

It is interesting you should say this, because that is what ERP on B3D has speculated, and some of it was confirmed by people working on Durango. The thread in which he said all this is too large for me to go seeking out all his posts, but they were quite insightful.
 
So you want to play the "official announcement" game. If so, then all this discussion is utterly pointless and a waste of forum space until that rumoured time in April when XB3 is finally unveiled.

Now, going by the rumour, while peak theoretical figures may be meaningless in a vacuum, they are not meaningless when comparing two GPUs that share the same design philosophy to state which one is better.

1.84TF > rumoured 1.23TF. I really do not know how much simpler it can get than that. If you are trying to claim that there would not be any discernible real-world performance difference, then one only needs to look at the real-world performance differences between any two PC graphics cards based on the same architecture to see how untrue that is (7770 vs 7750, 7870 vs 7850, 7970 vs 7950, etc.).

What this rumoured ~600GFlops difference would manifest as in a console environment, where unlike on PC the framerate is generally locked at either 60 or 30fps, remains to be seen.

What I have learned since the specs leaked:

  • Flops don't mean anything

  • ~800 more GFLOPS = massive difference & Wii U will be left behind by the Xbox 3, but ~600 more GFLOPS = just a slight difference & most people will not notice the differences between Xbox 3 & PS4 games.

  • 8GB of 68GB/s RAM + 32MB of 102GB/s eSRAM = the same as 8GB of 176GB/s RAM.

  • GDDR5 has a big latency problem that's going to be an issue.
 
What I have learned since the specs leaked:

  • Flops don't mean anything

  • ~800 more GFLOPS = massive difference & Wii U will be left behind by the Xbox 3, but ~600 more GFLOPS = just a slight difference & most people will not notice the differences between Xbox 3 & PS4 games.

  • 8GB of 68GB/s RAM + 32MB of 102GB/s eSRAM = the same as 8GB of 176GB/s RAM.

  • GDDR5 has a big latency problem that's going to be an issue.

Also that eSRAM can magically make up for a raw power deficiency.

The mind literally boggles sometimes...
 
What I have learned since the specs leaked:

  • Flops don't mean anything

  • ~800 more GFLOPS = massive difference & Wii U will be left behind by the Xbox 3, but ~600 more GFLOPS = just a slight difference & most people will not notice the differences between Xbox 3 & PS4 games.

  • 8GB of 68GB/s RAM + 32MB of 102GB/s eSRAM = the same as 8GB of 176GB/s RAM.

  • GDDR5 has a big latency problem that's going to be an issue.

Lol, I hope people don't actually believe those things.
 
What I have learned since the specs leaked:

  • Flops don't mean anything

  • ~800 more GFLOPS = massive difference & Wii U will be left behind by the Xbox 3, but ~600 more GFLOPS = just a slight difference & most people will not notice the differences between Xbox 3 & PS4 games.

  • 8GB of 68GB/s RAM + 32MB of 102GB/s eSRAM = the same as 8GB of 176GB/s RAM.

  • GDDR5 has a big latency problem that's going to be an issue.

Having a hard time believing anyone said most of the above, especially the comments about RAM.
 
What I have learned since the specs leaked:

  • ~800 more GFLOPS = massive difference & Wii U will be left behind by the Xbox 3, but ~600 more GFLOPS = just a slight difference & most people will not notice the differences between Xbox 3 & PS4 games.

Well, I don't know the numbers, but if the Wii U is 400 GFLOPS, that would mean an increase of 300% to the 720. And the PS4 would be 'only' 50% more. So yeah, that is quite a difference.
 
Also that eSRAM can magically make up for a raw power deficiency.

The mind literally boggles sometimes...

The eSRAM does not make up for a raw power deficiency, but it might enable better utilization of the available resources, which is what we are discussing based on actual implementations and real-world examples, and why the choice of eSRAM instead of eDRAM might make a difference in the usable compute performance of Durango.

If you can't understand this then I can see how it can boggle your mind.
 
Well, going by the rumours, PS4 also has a significantly higher ROP count, more TMUs and more (external) memory bandwidth. So it's not just a FLOPs difference, it's pretty much everything.

That's true, I was just speaking about the more general case...

However, ROPs and bandwidth are points of divergence between the two designs; I mean, even if the ROPs themselves are the same, the memory layout is different enough to make the comparison not as straightforward.

ROPs, for instance: from the leaks, Durango's design uses eSRAM (and probably the DMEs' ability to tile data) to keep the ROP caches fed with portions of the screen so they can achieve peak performance. If they really can operate at peak, that means Durango even at 1080p would have about 40% more fillrate available per pixel (if I calculated it right) than the 360 had at 720p. I dunno how that will work out in real-world performance, but even if the PS4 is better at this, it's possible that Durango is good enough for 1080p, so any advantage would only show at higher resolutions or in 3D games...

The bandwidth comparison may also not be as straightforward as it looks. The PS4 obviously has more available, but Durango seems to have a system in place that allows some operations to use both memories at once; again, it's difficult to compare just by looking at the specs...

TMUs seem to be a clear area of advantage for the PS4...

Assuming they have similar cost, size, and power budgets, and similar release dates, I'm assuming they also have similar performance targets and designed their consoles to hit that performance while staying inside the other constraints, and that we will again have two systems that, while significantly different in design, perform relatively the same overall...

Everything so far points to Durango at best matching the PS4 in the optimal case, while it could trail well behind in non-optimal cases, though. So maybe MS had tighter cost constraints than Sony, and thus had to make a console that gets the most out of fewer parts.
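The fillrate-per-pixel estimate above checks out roughly like this (a back-of-the-envelope sketch; the 360's 8 ROPs at 500MHz are its known specs, while Durango's 16 ROPs at 800MHz are only the rumoured ones):

```python
# Peak pixel fillrate spread over the output resolution, as a crude "fill budget per pixel" figure.
def fill_per_pixel(rops, clock_mhz, width, height):
    peak_fill = rops * clock_mhz * 1e6     # pixels written per second at peak
    return peak_fill / (width * height)    # peak fills available per output pixel per second

x360 = fill_per_pixel(8, 500, 1280, 720)        # Xbox 360 at 720p
durango = fill_per_pixel(16, 800, 1920, 1080)   # rumoured Durango at 1080p
print(durango / x360)  # ~1.42, i.e. roughly 40% more fill budget per pixel
```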
 
The eSRAM does not make up for a raw power deficiency, but it might enable better utilization of the available resources, which is what we are discussing based on actual implementations and real-world examples, and why the choice of eSRAM instead of eDRAM might make a difference in the usable compute performance of Durango.

If you can't understand this then I can see how it can boggle your mind.

Low latency for GPUs isn't a big deal; they have been working fine with high-latency GDDR-based RAM for donkey's years.

I can understand what people are saying, I just don't see how it leads to conclusions like "ESRAM+12 CU's~18 CU's", because that's just bollocks. eSRAM is not going to make that much difference; more likely it will allow for better real-time calculations for Kinect, which is why Durango looks like a latency monster.
 
What I have learned since the specs leaked:

  • Flops don't mean anything

  • ~800 more GFLOPS = massive difference & Wii U will be left behind by the Xbox 3, but ~600 more GFLOPS = just a slight difference & most people will not notice the differences between Xbox 3 & PS4 games.

Flops matter, but flops are only part of the equation; you have to understand that.
 
What I have learned since the specs leaked:

  • Flops don't mean anything

  • ~800 more GFLOPS = massive difference & Wii U will be left behind by the Xbox 3, but ~600 more GFLOPS = just a slight difference & most people will not notice the differences between Xbox 3 & PS4 games.

  • 8GB of 68GB/s RAM + 32MB of 102GB/s eSRAM = the same as 8GB of 176GB/s RAM.

  • GDDR5 has a big latency problem that's going to be an issue.
A lot of people got bent out of shape after the PS4 reveal.
 
Also that eSRAM can magically make up for a raw power deficiency.

The mind literally boggles sometimes...

It's not that the eSRAM can magically make up for a raw power deficit. It's that raw power does not translate directly into performance in all scenarios. If it did, we could just throw our CPUs in the garbage and use GPUs for everything, because they have at least an order of magnitude more floating-point performance than a CPU.

Oversimplifying/exaggerating to make a point:
Let's take two different GPUs, one with many processing units that can process a frame's workload in about 0.1ms. The other has significantly fewer resources and takes 14ms to do the exact same work.

So now, if your first GPU takes 16.6ms just to gather the necessary data to render a frame, it is never, ever hitting 60fps. Meanwhile the second GPU has significantly less compute power but only takes 2ms to fetch the data it needs; its overall frame time comes in under 16.6ms, and despite being significantly less powerful it can run the game at 60fps.

It's all about workload. Do you know for sure where games spend most of their time, to say whether eSRAM can increase performance more than extra processing units would? I sure don't, but as a general rule processing power is usually the cheapest resource you have, and one you usually trade for other things (compression, for instance, can be seen as trading compute resources for more RAM/bandwidth).
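A tiny sketch of that toy example, treating fetch and compute as strictly serial (the same simplification the post makes; all numbers are made up for illustration):

```python
# Frame time = time to gather data + time to compute, using the made-up numbers above.
def fps(fetch_ms, compute_ms):
    return 1000.0 / (fetch_ms + compute_ms)

big_gpu = fps(16.6, 0.1)    # huge compute budget, starved for data -> ~59.9fps, never quite 60
small_gpu = fps(2.0, 14.0)  # far less compute, but fed quickly    -> ~62.5fps
print(big_gpu, small_gpu)
```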
 
What I have learned since the specs leaked:

  • GDDR5 has a big latency problem that's going to be an issue.

I think this point above all else drives me up the wall. I love the way some people constantly try to equalise the two consoles and clutch at that figure. They allegedly know exactly how GDDR5 would behave in a customized APU environment. They also imply that this latency is apparently so big that it almost nullifies the bandwidth advantage. To me it seems like nVidia and AMD are run by idiots who clearly should have stuck with DDR3 for their mid-to-high-end GPUs, given the constant insinuation that bandwidth is useless when bottlenecked by latency.

Flops matter, but flops are only part of the equation; you have to understand that.

And what about the rumoured number of ROPs and TMUs compared to the competition? Those are only another part of the equation, right?
 
It's not that the eSRAM can magically make up for a raw power deficit. It's that raw power does not translate directly into performance in all scenarios. If it did, we could just throw our CPUs in the garbage and use GPUs for everything, because they have at least an order of magnitude more floating-point performance than a CPU.

Oversimplifying/exaggerating to make a point:
Let's take two different GPUs, one with many processing units that can process a frame's workload in about 0.1ms. The other has significantly fewer resources and takes 14ms to do the exact same work.

So now, if your first GPU takes 16.6ms just to gather the necessary data to render a frame, it is never, ever hitting 60fps. Meanwhile the second GPU has significantly less compute power but only takes 2ms to fetch the data it needs; its overall frame time comes in under 16.6ms, and despite being significantly less powerful it can run the game at 60fps.

It's all about workload. Do you know for sure where games spend most of their time, to say whether eSRAM can increase performance more than extra processing units would? I sure don't, but as a general rule processing power is usually the cheapest resource you have, and one you usually trade for other things (compression, for instance, can be seen as trading compute resources for more RAM/bandwidth).

But that's the problem: in your simplified scenario it might make a difference, but in reality we have GDDR5 GPUs in PCs running games at 120fps or above. It doesn't seem like a very big deal, and it feels like certain people are trying to turn it into an issue so Durango doesn't seem so bad. Which is quite sad.
 
Well, I don't know the numbers, but if the Wii U is 400 GFLOPS, that would mean an increase of 300% to the 720. And the PS4 would be 'only' 50% more. So yeah, that is quite a difference.

Well, this is one of those math questions where the answer changes according to which way the problem is tackled.

The apparent correct way of doing it is like so:

1.23 is 67.5% higher than 400
1.84 is 33% higher than 1.23

Doesn't look so bad now, does it? Even on tech sites I have seen them say that 54.4GB/s of bandwidth is 82% higher than 29.8GB/s, which is correct if you add 82% to 29.8. The 'correct' answer is apparently 46%.
 
I think this point above all else drives me up the wall. I love the way some people constantly try to equalise the two consoles and clutch at that figure. They allegedly know exactly how GDDR5 would behave in a customized APU environment. They also imply that this latency is apparently so big that it almost nullifies the bandwidth advantage. To me it seems like nVidia and AMD are run by idiots who clearly should have stuck with DDR3 for their mid-to-high-end GPUs, given the constant insinuation that bandwidth is useless when bottlenecked by latency.

Agreed. It's a difference of nanoseconds that can easily be countered by a good memory controller and data management on the dev end. It's a complete non issue and will have no negative effect on PS4's performance.
 
Low latency for GPUs isn't a big deal; they have been working fine with high-latency GDDR-based RAM for donkey's years.

I can understand what people are saying, I just don't see how it leads to conclusions like "ESRAM+12 CU's~18 CU's", because that's just bollocks. eSRAM is not going to make that much difference; more likely it will allow for better real-time calculations for Kinect, which is why Durango looks like a latency monster.

Something not being a big deal does not mean it's not a deal at all. ERP stated that although modern GPU rendering pipelines resort to large register pools and deep queues and are highly multi-threaded to hide latency, they still stall when fetching from memory; this was backed up by the fact that NV achieves higher utilization/performance than AMD despite having lower FLOP/s, thanks to the relatively larger caches in their design. Besides that, the discussion was more about the effect of latency on compute performance, which, despite how well current GPUs perform, is still highly latency sensitive. So the choice of eSRAM could lead to better utilization. Likewise, the increased number of compute ACEs on the PS4 shows a need to improve efficiency when running/switching to compute shaders, by providing more compute queues and threads to better utilize the resources.

How effective the eSRAM will prove to be, we'll just have to wait and see as developers spend time with the system.
 
The kidbeta posts really weren't rumor; he's clearly a fan of Sony and likes to argue for their perspective. I see a trend of people throwing the word "fact" around a lot when talking about rumors, and that irks me when we don't have any actual facts just yet (on either side).

The oft-iterated rumors may turn out to be true (where there's smoke, there's fire), but I agree. That guy definitely seems to have an agenda, and he invests a LOT of time and energy to move it forward.
 
Well, this is one of those math questions where the answer changes according to which way the problem is tackled.

The apparent correct way of doing it is like so:

1.23 is 67.5% higher than 400
1.84 is 33% higher than 1.23

Doesn't look so bad now, does it? Even on tech sites I have seen them say that 54.4GB/s of bandwidth is 82% higher than 29.8GB/s, which is correct if you add 82% to 29.8. The 'correct' answer is apparently 46%.

Who came up with this idea?

It's 300%, 50% and 80% for all of those examples. The smaller number is used as a base for "faster" and the larger number is used as a base for "smaller". The Durango GPU has 66% of the performance of PS4 GPU, that doesn't mean the PS4 GPU is 33% faster.
 
Agreed. It's a difference of nanoseconds that can easily be countered by a good memory controller and data management on the dev end. It's a complete non issue and will have no negative effect on PS4's performance.

Isn't that something AMD is trying to customize the GPU for, with 8 ACEs (or was it ASE?) and 64 instruction queues (iirc) or something to that effect, to increase GPU efficiency and eliminate any downtime for the CUs, be it for GPGPU or rendering-related tasks?
 
The eSRAM does not make up for a raw power deficiency, but it might enable better utilization of the available resources, which is what we are discussing based on actual implementations and real-world examples, and why the choice of eSRAM instead of eDRAM might make a difference in the usable compute performance of Durango.

If you can't understand this then I can see how it can boggle your mind.

The way I see it, a certain number of CUs requires a minimum amount of bandwidth: a 10CU AMD card uses 72GB/s of GDDR5 and a 16CU card uses 153GB/s of GDDR5 (they also scale the ROPs and texture units with CU count for balance). So Orbis has 18CUs sharing 176GB/s of memory with the CPU, while Durango has 12CUs and an effective bandwidth somewhere between 76GB/s and 102GB/s. It seems to me there is nothing magic going on here; the GCN architecture scales GPU units (CUs, ROPs, etc.) with memory bandwidth. 76GB/s is not enough for an APU with 12CUs, but they wanted 8GB, and the cost-effective way to boost bandwidth is with eDRAM/eSRAM, the same way the 360 and PS2 did it.

Here is a little ratio (#CUs / bandwidth) to illustrate it:

10/72= .139 CU/GB/s (AMD 7770)
8/72 = .111 (AMD 7750)
16/153= .104 (AMD 7850)

18/176= .102 (Orbis)
12/76 = .158 (Durango pure DDR3)
12/102= .118 (Durango pure eSRAM)

The higher the number, the more likely the GPU is bandwidth starved. The Orbis and Durango numbers are actually a little worse than they look, since they are APUs and have to share the bandwidth with the CPU. Of course this is simplified and ignores clockspeeds of the GPU, ROPs, etc., but it illustrates a trend with the GCN hardware.
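The ratio table is trivial to reproduce; the GB/s figures are the ones used above (card specs plus the rumoured console numbers):

```python
# CU count divided by memory bandwidth: higher means more likely to be bandwidth starved.
configs = {
    "AMD 7770":            (10, 72),
    "AMD 7750":            (8, 72),
    "AMD 7850":            (16, 153),
    "Orbis (rumour)":      (18, 176),
    "Durango, DDR3 only":  (12, 76),
    "Durango, eSRAM only": (12, 102),
}
for name, (cus, bw) in configs.items():
    print(f"{name}: {cus / bw:.3f} CU per GB/s")
```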
 
I think this point above all else drives me up the wall. I love the way some people constantly try to equalise the two consoles and clutch at that figure. They allegedly know exactly how GDDR5 would behave in a customized APU environment. They also imply that this latency is apparently so big that it almost nullifies the bandwidth advantage. To me it seems like nVidia and AMD are run by idiots who clearly should have stuck with DDR3 for their mid-to-high-end GPUs, given the constant insinuation that bandwidth is useless when bottlenecked by latency.

And what about the rumoured number of ROPs and TMUs compared to the competition? Those are only another part of the equation, right?

I'm talking about the differences between the Wii U and the next Xbox. He is saying that the difference between the Wii U and Durango is the same as between Durango and the PS4, because of flops.

CPU and memory size are the same between the PS4 and Durango, and the GPU family is the same too.
 
Well, this is one of those math questions where the answer changes according to which way the problem is tackled.

The apparent correct way of doing it is like so:

1.23 is 67.5% higher than 400
1.84 is 33% higher than 1.23

Doesn't look so bad now, does it? Even on tech sites I have seen them say that 54.4GB/s of bandwidth is 82% higher than 29.8GB/s, which is correct if you add 82% to 29.8. The 'correct' answer is apparently 46%.


you are actually talking about declines.

1.23 is 300% of 400
1.84 is 150% of 1.23

1.23 is 200% more than 400
1.84 is 50% more than 1.23

400 is 67% less than 1.23
1.23 is 33% less than 1.84
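Since percentage-change mix-ups keep coming up in this thread, here is the arithmetic spelled out (the flops figures are the rumoured ones):

```python
# Percentage change always depends on which value you treat as the base.
def pct_more(new, base):
    return (new - base) / base * 100    # "X is N% more than Y" uses the smaller value as base

def pct_less(small, big):
    return (big - small) / big * 100    # "X is N% less than Y" uses the larger value as base

print(pct_more(1.84, 1.23))   # ~49.6  -> PS4 has roughly 50% more flops than Durango
print(pct_less(1.23, 1.84))   # ~33.2  -> Durango has roughly 33% fewer flops than PS4
print(pct_more(1230, 400))    # ~207.5 -> rumoured Durango vs a 400 GFLOPS Wii U estimate
```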
 
Something not being a big deal does not mean it's not a deal at all. ERP stated that although modern GPU rendering pipelines resort to large register pools and deep queues and are highly multi-threaded to hide latency, they still stall when fetching from memory; this was backed up by the fact that NV achieves higher utilization/performance than AMD despite having lower FLOP/s, thanks to the relatively larger caches in their design. Besides that, the discussion was more about the effect of latency on compute performance, which, despite how well current GPUs perform, is still highly latency sensitive. So the choice of eSRAM could lead to better utilization. Likewise, the increased number of compute ACEs on the PS4 shows a need to improve efficiency when running/switching to compute shaders, by providing more compute queues and threads to better utilize the resources.

How effective the eSRAM will prove to be, we'll just have to wait and see as developers spend time with the system.

But with high bandwidth and extreme multi-threading (an environment game developers are already used to, btw), the latency issue becomes a non-issue.

Look, it's pretty clear why MS chose to go with eSRAM: they wanted to guarantee a launch with 8GB of system RAM, and at the time the only way to do that was to go with DDR3, which left them with bandwidth problems, so they stuck a bunch of eSRAM onto the die to counter it. Sony, otoh, seemed less bothered by the total amount of RAM and were happy with just 4GB; then at the end of the development period 4Gb GDDR5 chips became available and they were able to double up from 4GB to 8GB without changing the internal design too much (especially given that they look to be using low-power chips).

It's just two different strategies, and Sony got lucky with the timing of the density increase. If the rumours are true and MS is going to reserve ~3GB for system functions, I can see why they absolutely wanted to guarantee 8GB total. Not taking a risk hoping GDDR5 density upgrades would arrive in time, designing for 8GB early, and making up for the bandwidth deficiency with other design features makes a lot of sense from that point of view.

Going for low latency also makes sense if they are sticking Kinect into every box; they would need good real-time calculation of movements etc., and low latency would definitely help with that.
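For what the density point is worth, the arithmetic is simple if you assume a fixed 16-chip GDDR5 layout (the chip count here is an assumption for illustration, not something from the leaks):

```python
chips = 16                     # assumed number of GDDR5 packages on the board
gb_per_2gbit_chip = 2 / 8      # 2 Gbit part = 0.25 GB
gb_per_4gbit_chip = 4 / 8      # 4 Gbit part = 0.5 GB
print(chips * gb_per_2gbit_chip, chips * gb_per_4gbit_chip)  # 4.0 GB vs 8.0 GB with the same chip count
```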
 
The way I see it, a certain number of CUs requires a minimum amount of bandwidth: a 10CU AMD card uses 72GB/s of GDDR5 and a 16CU card uses 153GB/s of GDDR5 (they also scale the ROPs and texture units with CU count for balance). So Orbis has 18CUs sharing 176GB/s of memory with the CPU, while Durango has 12CUs and an effective bandwidth somewhere between 76GB/s and 102GB/s. It seems to me there is nothing magic going on here; the GCN architecture scales GPU units (CUs, ROPs, etc.) with memory bandwidth. 76GB/s is not enough for an APU with 12CUs, but they wanted 8GB, and the cost-effective way to boost bandwidth is with eDRAM/eSRAM, the same way the 360 and PS2 did it.

There certainly is no magic going on, but saying that the choice of eSRAM is simply there to solve the bandwidth issue is wrong, especially since they could have gone for eDRAM at higher capacity/bandwidth, or for the same 32MB at a lower die/BoM cost; moreover, the vgleaks and Durango summit papers specifically mention the lower latency of the memory choice and even give an example of where it can be useful (ROP operations).
 
Low latency for GPUs isn't a big deal; they have been working fine with high-latency GDDR-based RAM for donkey's years.

I can understand what people are saying, I just don't see how it leads to conclusions like "ESRAM+12 CU's~18 CU's", because that's just bollocks. eSRAM is not going to make that much difference; more likely it will allow for better real-time calculations for Kinect, which is why Durango looks like a latency monster.

Honest question: I know GPUs are designed to work with high-latency RAM, but how much of that holds for a highly programmable pipeline rather than a fixed one? I mean, okay, current GPUs are very memory efficient at running simple shaders like vertex transformations, normal maps, etc... But how do they fare when the shaders get more complex, like when you store your scene in a non-regular data structure and use that to render and light the scene? What about more complex screen-space effects like local reflections, or heavier post-processing effects like per-object motion blur?

That's something GPUs have been doing for years now, but are they really efficient on these types of workload? I know they usually have lots of threads in flight, so when one needs to access memory there is other work the GPU can switch to in order to stay busy, but from what I see when gaming, simply overclocking the GPU's memory (but not the core itself) can make a game using some of these effects run faster. That means they are also memory bound, right? And, more importantly, that whatever the GPU does to hide latency can't always hide it, or at least only up to a point.

What I mean is: are GPUs really good at hiding latency for the workloads current and future games will have, or is it an area where a new paradigm could drastically increase performance?
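One hedged way to reason about the "is it memory bound" part of that question is a simple roofline-style estimate: a pass is bandwidth bound whenever its flops-per-byte ratio is low relative to the hardware. This is a generic model, not anything from the leaks; the 1229 GFLOPS / 68 GB/s numbers are just the rumoured Durango figures used as an illustration, and it only covers bandwidth, not latency:

```python
# Roofline estimate: attainable throughput is capped by either peak compute or bandwidth * intensity.
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

print(attainable_gflops(1228.8, 68, 2))   # 136.0  -> bandwidth bound; faster memory helps directly
print(attainable_gflops(1228.8, 68, 20))  # 1228.8 -> compute bound; a memory overclock does little
```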
 
But with high bandwidth and extreme multi-threading (an environment game developers are already used to, btw), the latency issue becomes a non-issue.

Look, it's pretty clear why MS chose to go with eSRAM: they wanted to guarantee a launch with 8GB of system RAM, and at the time the only way to do that was to go with DDR3, which left them with bandwidth problems, so they stuck a bunch of eSRAM onto the die to counter it. Sony, otoh, seemed less bothered by the total amount of RAM and were happy with just 4GB; then at the end of the development period 4Gb GDDR5 chips became available and they were able to double up from 4GB to 8GB without changing the internal design too much (especially given that they look to be using low-power chips).

It's just two different strategies, and Sony got lucky with the timing of the density increase. If the rumours are true and MS is going to reserve ~3GB for system functions, I can see why they absolutely wanted to guarantee 8GB total. Not taking a risk hoping GDDR5 density upgrades would arrive in time, designing for 8GB early, and making up for the bandwidth deficiency with other design features makes a lot of sense from that point of view.

Going for low latency also makes sense if they are sticking Kinect into every box; they would need good real-time calculation of movements etc., and low latency would definitely help with that.

I am not saying that the eSRAM is not there to provide more bandwidth; what I am saying is that that is not the sole reason they are using it, or else they could have gone for eDRAM and had higher bandwidth/capacity, or lower cost at the same capacity. The documents we have seen on Durango's eSRAM specifically note the low latency of the memory choice and even give an example of its benefit.
 
There certainly is no magic going on, but saying that the choice of eSRAM is simply there to solve the bandwidth issue is wrong, especially since they could have gone for eDRAM at higher capacity/bandwidth, or for the same 32MB at a lower die/BoM cost; moreover, the vgleaks and Durango summit papers specifically mention the lower latency of the memory choice and even give an example of where it can be useful (ROP operations).

They could be using eDRAM (1T-SRAM); we don't have much hard data at this point. "Lower latency" doesn't say much either, it is not a number, and any on-die memory has lower latency than DDR3. The details will be interesting; I wonder how much money and risk MS wants to put into making their APU that large, and whether the benefits are tangible.
 
I am not saying that the eSRAM is not there to provide more bandwidth; what I am saying is that that is not the sole reason they are using it, or else they could have gone for eDRAM and had higher bandwidth/capacity, or lower cost at the same capacity. The documents we have seen on Durango's eSRAM specifically note the low latency of the memory choice and even give an example of its benefit.

Yes, going for low latency makes sense for decent real-time calculations for Kinect 2.0. Taking the hit and going with 6T eSRAM rather than eDRAM makes sense to me if they are going to include Kinect 2.0 in every box and have it on all the time. That's what this box seems designed to do: be a low-latency monster with lots of RAM in order to power Kinect 2.0 features.

Honest question: I know GPUs are designed to work with high-latency RAM, but how much of that holds for a highly programmable pipeline rather than a fixed one? I mean, okay, current GPUs are very memory efficient at running simple shaders like vertex transformations, normal maps, etc... But how do they fare when the shaders get more complex, like when you store your scene in a non-regular data structure and use that to render and light the scene? What about more complex screen-space effects like local reflections, or heavier post-processing effects like per-object motion blur?

That's something GPUs have been doing for years now, but are they really efficient on these types of workload? I know they usually have lots of threads in flight, so when one needs to access memory there is other work the GPU can switch to in order to stay busy, but from what I see when gaming, simply overclocking the GPU's memory (but not the core itself) can make a game using some of these effects run faster. That means they are also memory bound, right? And, more importantly, that whatever the GPU does to hide latency can't always hide it, or at least only up to a point.

What I mean is: are GPUs really good at hiding latency for the workloads current and future games will have, or is it an area where a new paradigm could drastically increase performance?

GPUs are very good at hiding latency; they use extreme multi-threading, and that change was made precisely so that the higher bandwidth of GDDR could be used properly. Going back to PS2-style rendering with low system memory and ultra-low latency is a tough nut to crack. It won't lead to any revolution.
 
I'm talking about the differences between the Wii U and the next Xbox. He is saying that the difference between the Wii U and Durango is the same as between Durango and the PS4, because of flops.

CPU and memory size are the same between the PS4 and Durango, and the GPU family is the same too.

True. I do not believe the Wii U's GPU will be able to compete at the same level as Durango's GPU, and that goes beyond just flop count. Its rumoured ROPs and TMUs are at least double those of Latte. Add to that the DX11.x integration and consequent features and it's a generation ahead.

Truthfully, in a closed-box environment where GPUs are customized to eke out performance approaching their theoretical FLOP figure, and games are limited by their resolution and frame rate (in general), these GPUs should be able to punch well above their weight. Plus, GPGPU functionality means we will be moving beyond visuals into more physics-related aspects to aid the sensation of immersion.

I look at GoWA, TLoU, Halo 4, GeoW3, etc. and compare them to what the first few games of the generation offered. Had anyone told me then that visuals like these would be possible on current-gen systems due to continual learning and experimentation by dev teams, I probably would have dismissed it as pure BS. So I have faith in what the devs can do with the PS4 and XB3 in the next 3-4 years.
 
Who came up with this idea?

It's 300%, 50% and 80% for all of those examples. The smaller number is used as a base for "faster" and the larger number is used as a base for "smaller". The Durango GPU has 66% of the performance of PS4 GPU, that doesn't mean the PS4 GPU is 33% faster.

It came from bkilian over at B3D. I admit I may have misunderstood his post:

bkilian said:
Yes, you can, and you must. 50% "less" always refers to the percentage of the larger number. The change being referred to is in relation to the PS4, so the PS4 is the old value, and durango is the new value.
Percent change == (|old - new|)/old * 100 == (32 - 16)/32 * 100 == 50.
So the theoretical Durango has 50% less ROPS. If we were talking about what percent _more_ ROPS the PS4 has, then we are using Durango as old and PS4 as new, so we get |16 - 32|/16 * 100 == 16/16 * 100 == 100% more ROPS in PS4.
Percentage change is one of the most misunderstood parts of general statistics.

itsgreen said:
you are actually talking about declines.

1.23 is 300% of 400
1.84 is 150% of 1.23

1.23 is 200% more than 400
1.84 is 50% more than 1.23

400 is 67% less than 1.23
1.23 is 33% less than 1.84

I bow down to your greater knowledge!
 
They could be using eDRAM (1T-SRAM); we don't have much hard data at this point. "Lower latency" doesn't say much either, it is not a number, and any on-die memory has lower latency than DDR3. The details will be interesting; I wonder how much money and risk MS wants to put into making their APU that large, and whether the benefits are tangible.

Or they might be using 6T SRAM, as alluded to by acert93 on B3D. All the documents thus far have referred to it as eSRAM, even (especially) when comparing it with the eDRAM in the 360.
 
Yes, going for low latency makes sense for decent real-time calculations for Kinect 2.0. Taking the hit and going with 6T eSRAM rather than eDRAM makes sense to me if they are going to include Kinect 2.0 in every box and have it on all the time. That's what this box seems designed to do: be a low-latency monster with lots of RAM in order to power Kinect 2.0 features.



GPUs are very good at hiding latency; they use extreme multi-threading, and that change was made precisely so that the higher bandwidth of GDDR could be used properly. Going back to PS2-style rendering with low system memory and ultra-low latency is a tough nut to crack. It won't lead to any revolution.

Let's assume this is primarily their reason for using SRAM; that does not exclude it from being of benefit in other cases, especially when the one example they gave of its benefit has nothing to do with Kinect 2.0:

Durango has no video memory (VRAM) in the traditional sense, but the GPU does contain 32 MB of fast embedded SRAM (ESRAM). ESRAM on Durango is free from many of the restrictions that affect EDRAM on Xbox 360. Durango supports the following scenarios:

Texturing from ESRAM
Rendering to surfaces in main RAM
Read back from render targets without performing a resolve (in certain cases)


The difference in throughput between ESRAM and main RAM is moderate: 102.4 GB/sec versus 68 GB/sec. The advantages of ESRAM are lower latency and lack of contention from other memory clients—for instance the CPU, I/O, and display output. Low latency is particularly important for sustaining peak performance of the color blocks (CBs) and depth blocks (DBs).
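As a purely illustrative size check on that 32 MB figure (the render-target formats here are assumptions, not anything from the documents):

```python
# Rough render-target sizes at 1080p, to see what fits in 32 MB of ESRAM.
def rt_mb(width, height, bytes_per_pixel, samples=1):
    return width * height * bytes_per_pixel * samples / (1024 * 1024)

colour = rt_mb(1920, 1080, 4)   # RGBA8 colour target, ~7.9 MB
depth = rt_mb(1920, 1080, 4)    # D24S8 depth/stencil, ~7.9 MB
print(colour + depth)                            # ~15.8 MB -> fits comfortably
print(rt_mb(1920, 1080, 4, samples=4) + depth)   # ~39.5 MB with 4x MSAA colour -> would spill to DDR3
```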
 
Even if Durango is weaker on paper, even by an amount that would be noticeable in games, I don't see MS panicking. Firstly, they are probably happy with their offering overall and how it is balanced. Secondly, I expect they think they can steamroller Sony with marketing. I expect unprecedented amounts of spend on this launch; it'll make Kinect look like an ad in your local paper.

Scary thing is it could work well too.
 
Reading this thread is like overhearing people talking nonsense about games in public. You want to jump in and correct them so badly, but you just can't/won't.
You have the documents, but you choose not to talk about them. Instead, you choose to talk about fairy dust and what 'might' happen. Honestly guys, give it a rest and wait for the damn thing to be revealed.
 
Or they might be using 6T SRAM, as alluded to by acert93 on B3D. All the documents thus far have referred to it as eSRAM, even (especially) when comparing it with the eDRAM in the 360.

You mean the guy who has no insider info and is yet another MS cheerleader? You need to be more discerning about who you read and believe. I'm inclined to take the leaked docs at face value and believe it's 6T (otherwise they would be misusing the name eSRAM).
 