WiiU "Latte" GPU Die Photo - GPU Feature Set And Power Analysis

Firstly, it's all unverifiable speculation.

Second, the point you made about Xenos not utilizing all of its shaders is obviously correct, but VLIW5 constantly ignores 1/5 of its shaders, which means that while Xenos might only use 200 shaders from time to time, Wii U is only using 128. And while VLIW5 shaders are obviously more capable, you are suggesting they are basically 100% more capable. Sure, the clock is 10% higher, and efficiency could maybe make up for another 30% of that count, but it's still far short of matching Xenos. Let's not pretend VLIW5 has no latency issues; those 128 shaders don't stay busy all the time, and there is ~100 cycle latency in R700. So yes, if it's 160 ALUs, it is very different from my old HD 4870 with 4/5ths of its shaders removed.

For 160 shaders to match Xenos, Xenos would have to only utilize about 180 of its 240 ALUs for the majority of the frames it draws, and that only allows Wii U to match a crippled Xenos, not exceed it, as we have heard from developers that it does. It's also funny that the Tekken developers have said that Wii U's GPU is 1.5x Xenos. Obviously these numbers don't mean much, but assuming they are directly comparing capabilities, it points to 256 ALUs in Wii U, not 160.
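To put some rough numbers on that argument (these are the thread's assumptions, not confirmed specs: Latte at ~550 MHz with 160 ALUs, Xenos at 500 MHz with 240 ALUs), here's the back-of-envelope version as a quick Python sketch:

[code]
# Back-of-envelope version of the argument above. All figures are this
# thread's assumptions, not confirmed specs: Latte ~550 MHz / 160 ALUs,
# Xenos 500 MHz / 240 ALUs.

LATTE_ALUS, LATTE_CLOCK_MHZ = 160, 550
XENOS_CLOCK_MHZ = 500  # Xenos has 240 ALUs at this clock

def xenos_equivalent_alus(alus, clock_mhz, utilization):
    """Express a part's ALU throughput as 'Xenos ALUs kept busy'."""
    return alus * (clock_mhz / XENOS_CLOCK_MHZ) * utilization

# best case, ~4/5 VLIW slots filled, and a PC-average ~3.4/5 slot fill
for util in (1.00, 0.80, 0.68):
    eq = xenos_equivalent_alus(LATTE_ALUS, LATTE_CLOCK_MHZ, util)
    print(f"Latte at {util:.0%} slot utilization ~= {eq:.0f} Xenos ALUs busy")

# Prints ~176, ~141, ~120 - which is why the claim is that Xenos would have
# to idle down to roughly 180 of its 240 ALUs for a 160-ALU Latte to match it.
[/code]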



Simply because they expect those parts to be binned, even though they are produced in the hundreds of thousands for everything from point-of-sale machines to casino machines. Embedded designs similar to these are used in arcade machines as well. Still, it's taboo to use them in serious discussions because they might seem optimistic.

There isn't enough SRAM for 256 ALUs, unless AMD has customized the registers, which seems to be a serious stretch.
 
Well, the AMD E6460 is a 160 ALU part @ 600 MHz with a TDP of 21 watts.

http://www.amd.com/us/Documents/E6460-MXM-ProductBrief.pdf

Again, that points toward a 160 ALU part, given the power consumption.

Again, do you have a source for any of these numbers, or are you just pulling most of them out of the air?

I will have to look, but I believe I read something from MS saying the X360 only used 50% of the GPU because of its design, and that they were making changes in the 720 to use more of the GPU. Maybe someone knows the doc and can give us some hard numbers.

It also has half a gig of GDDR5 built in.
 
There isn't enough SRAM for 256 ALUs, unless AMD has customized the registers, which seems to be a serious stretch.

Random curveball here, and I don't know what I'm talking about, but could the 1 MB module of SRAM (whose purpose outside Wii mode we know nothing about) be used for registers?
 
Wow, has no one looked at the power consumption numbers of AMD cards at 40 nm?

It's clear as day that 320 is just impossible. Given the math, the card cannot use more than 15 watts at the very most, and even that is high IMO.

HD 5550: 320:16:8 @ 550 MHz, 33 watts, 40 nm
HD 6450: 160:8:4 @ 625 MHz, 13 watts, 40 nm

http://www.techpowerup.com/reviews/HIS/Radeon_HD_5550/27.html
http://www.techpowerup.com/reviews/Sapphire/HD_6450_Passive/25.html

The only thing that points against 160 is the size of the shader blocks on the die itself. My theory is they spread out the shader cores to reduce heat output, which does sound like a really Nintendo thing to do. Please note the rumored part in the Wii U is 160:8:8, so it will draw a little more power than the 6450. It just fits perfectly...

One thing is clear: 320 is just too big to fit at 40 nm and uses way too much power... 320 is out...
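Here's roughly the scaling math behind that claim, using the two cards above as reference points. Treat it as a sketch only: board TDPs include memory and VRM losses, and power does not really scale linearly with ALU count or clock.

[code]
# Crude linear scaling from the two 40 nm reference cards cited above.
# This is a sketch of the argument, not a real power model.

HD6450 = {"alus": 160, "clock": 625, "tdp": 13.0}   # 160:8:4, 40 nm
HD5550 = {"alus": 320, "clock": 550, "tdp": 33.0}   # 320:16:8, 40 nm

def scaled_tdp(ref, alus, clock_mhz):
    """Scale a reference board TDP linearly with ALU count and clock."""
    return ref["tdp"] * (alus / ref["alus"]) * (clock_mhz / ref["clock"])

print(f"160 ALU @ 550 MHz (from HD 6450): {scaled_tdp(HD6450, 160, 550):.1f} W")
print(f"320 ALU @ 550 MHz (from HD 6450): {scaled_tdp(HD6450, 320, 550):.1f} W")
print(f"320 ALU @ 550 MHz (from HD 5550): {scaled_tdp(HD5550, 320, 550):.1f} W")

# Roughly 11 W vs 23-33 W against a GPU budget guessed at ~15 W - that gap is
# the whole argument, though binning and process tweaks can move these numbers.
[/code]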


I'm not sure if that's as clear as you make it out to be. For example, the Mobility Radeon HD 5650, based on the same RV830 chip as the HD 5550 but using all hardware units (400:20:8), has a TDP of 15/19 W at 450/650 MHz respectively. That actually sounds like it's within the limits of the Wii U GPU. Again, that's 400 SPUs, i.e. more than we are seeing as the very best case for Wii U.
 
I will have to look, but I believe I read something from MS saying the X360 only used 50% of the GPU because of its design, and that they were making changes in the 720 to use more of the GPU. Maybe someone knows the doc and can give us some hard numbers.

http://www.anandtech.com/show/4061/amds-radeon-hd-6970-radeon-hd-6950/4

Anandtech said:
We’ve already touched on how in games AMD is seeing an average of 3.4, which is actually pretty good but still is under 80% efficient. Ultimately extracting ILP from a workload is hard, leading to a wide delta between the best and worst case scenarios.

VLIW5 gives only ~70% efficiency on average; in a console setting it would maybe achieve 80-81%.

While your quote is probably ridiculously outdated, 50% of 240 shaders is still 120, something that in the best case would just barely be edged out by Wii U's 128 shaders @ 80% efficiency; they also get a small 10% bump thanks to the higher clock. But that means NFSMWU is pulling out all the stops and reaching its limit in the launch window just to edge out a Xenos running at 50% efficiency. If you could pull a recent quote that points to Xenos only using 120 shaders on average, that would be something at least, but until I see that, I can't accept that Xenos coded to the metal was far inferior to R700. There is just no way Wii U could keep up with a Xenos at a higher efficiency, certainly not reach a higher performance level. If it's VLIW with 160 shaders, it is a very custom design meant to push close to 100% efficiency; of that much I am certain.
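To make those slot-efficiency numbers concrete (the 3.4/5 average is from the Anandtech quote above; the 160-ALU count and the console figures are assumptions being argued in this thread):

[code]
# Translating VLIW5 slot occupancy into "effective ALUs" for an assumed
# 160-ALU Latte. The console numbers are guesses, not measurements.

ASSUMED_ALUS = 160

cases = [("PC average (Anandtech)", 3.4),
         ("console guess",          4.0),
         ("optimistic console",     4.5)]

for label, slots in cases:
    eff = slots / 5.0
    print(f"{label}: {slots}/5 slots = {eff:.0%} "
          f"-> ~{ASSUMED_ALUS * eff:.0f} of {ASSUMED_ALUS} ALUs doing useful work")

# Prints ~109, 128 and 144 - the 128 and ~144 figures being argued over in
# this thread fall straight out of this arithmetic.
[/code]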
 
Wow, has no one looked at the power consumption numbers of AMD cards at 40 nm?

It's clear as day that 320 is just impossible. Given the math, the card cannot use more than 15 watts at the very most, and even that is high IMO.

HD 5550: 320:16:8 @ 550 MHz, 33 watts, 40 nm
HD 6450: 160:8:4 @ 625 MHz, 13 watts, 40 nm

http://www.techpowerup.com/reviews/HIS/Radeon_HD_5550/27.html
http://www.techpowerup.com/reviews/Sapphire/HD_6450_Passive/25.html

The only thing that points against 160 is the size of the shader blocks on the die itself. My theory is they spread out the shader cores to reduce heat output, which does sound like a really Nintendo thing to do. Please note the rumored part in the Wii U is 160:8:8, so it will draw a little more power than the 6450. It just fits perfectly...

One thing is clear: 320 is just too big to fit at 40 nm and uses way too much power... 320 is out...
http://www.techpowerup.com/gpudb/1730/mobility-radeon-hd-560v.html
550 MHz, 15 W, 55 nm

:)
 
I'm not sure if that's as clear as you make it out to be. For example, the Mobility Radeon HD 5650, based on the same RV830 chip as the HD 5550 but using all hardware units (400:20:8), has a TDP of 15/19 W at 450/650 MHz respectively. That actually sounds like it's within the limits of the Wii U GPU. Again, that's 400 SPUs, i.e. more than we are seeing as the very best case for Wii U.

The reason the mobile parts are able to get this low is that the parts are binned. So the top 10% or whatever number of chips can hit this power usage; the other chips are sent to desktop or lower-clocked parts.

This is why mobile chips have such low power usage. To make this work for a mass-market product, you would have to throw away most of the chips you made.

http://www.anandtech.com/show/4061/amds-radeon-hd-6970-radeon-hd-6950/4



VLIW5 gives only ~70% efficiency on average; in a console setting it would maybe achieve 80-81%.

While your quote is probably ridiculously outdated, 50% of 240 shaders is still 120, something that in the best case would just barely be edged out by Wii U's 128 shaders @ 80% efficiency; they also get a small 10% bump thanks to the higher clock. But that means NFSMWU is pulling out all the stops and reaching its limit in the launch window just to edge out a Xenos running at 50% efficiency. If you could pull a recent quote that points to Xenos only using 120 shaders on average, that would be something at least, but until I see that, I can't accept that Xenos coded to the metal was far inferior to R700. There is just no way Wii U could keep up with a Xenos at a higher efficiency, certainly not reach a higher performance level. If it's VLIW with 160 shaders, it is a very custom design meant to push close to 100% efficiency; of that much I am certain.
Not outdated; they were from the Xbox summit last year. The docs leaked online. Maybe it wasn't that. I tried searching but didn't find anything, so it may be wrong.
 
There isn't enough SRAM for 256 ALUs, unless AMD has customized the registers, which seems to be a serious stretch.

http://www.neogaf.com/forum/showpost.php?p=57639606&postcount=4736 Or the professionals in this post know more about this than you or I, and they have come to the conclusion that those registers could be denser than others on the die, which, according to them, often happens with hand layouts.

They might be 6 kB or even 8 kB; the people who actually know what the hell they are talking about have said as much, so this point is completely suspect.
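For context on why register SRAM keeps coming up: a stock R700 SIMD of 16 VLIW5 units (80 ALUs) carries roughly 256 kB of general-purpose registers, so the expected totals scale with ALU count. A rough sketch, assuming that stock ratio holds, which is exactly the assumption the linked post is questioning:

[code]
# Rough register-file budget per ALU count, assuming the stock R700 ratio of
# ~256 kB of general-purpose registers per 16-unit (80-ALU) SIMD. If Latte's
# register banks are denser than assumed from the die shot (6-8 kB per bank,
# as the linked post suggests), the bigger totals fit in less visible area.

GPR_KB_PER_SIMD = 256   # stock R700 assumption
ALUS_PER_SIMD = 80      # 16 VLIW5 units x 5 ALUs

for alus in (160, 240, 256, 320):
    simds = alus / ALUS_PER_SIMD
    print(f"{alus:3d} ALUs -> {simds:.1f} SIMDs -> ~{simds * GPR_KB_PER_SIMD:.0f} kB of GPRs")

# 160 -> ~512 kB, 256 -> ~819 kB, 320 -> ~1024 kB: whether the die has room
# for the larger counts hinges on how dense those register macros really are.
[/code]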
 
http://www.neogaf.com/forum/showpost.php?p=57639606&postcount=4736 Or the professionals in this post know more about this than you or I, and they have come to the conclusion that those registers could be denser than others on the die, which, according to them, often happens with hand layouts.

They might be 6 kB or even 8 kB; the people who actually know what the hell they are talking about have said as much, so this point is completely suspect.

Fair points. I missed that post somehow.
 
I know it has been mentioned before, but with CoD I think they took the same approach as they did with the PS3/360 and didn't even bother to optimize the code for the Wii U. Not sure how to word it, but I think the way we'll see the true power of the Wii U is to just wait for games like X, Metroid Prime... first-party games that take into account the strengths and weaknesses of the console and optimize for them.

On another note, it's kind of silly to reference this, but the 100YOG has been right with some of his Nintendo predictions in the past. Someone asked a question about Retro's project:


Let's wait a couple more weeks.



If I make a bunch of vague predictions about Nintendo, you can bet some of them can be interpreted as right in hindsight.
 
Fair points. I missed that post somehow.

It's OK, this thread is moving fast again and that was posted this morning. It's just an interesting point, and it shows that BG's assessment of the GPU (256 ALUs) is just as grounded as anyone else's. Of course that would mean they aren't VLIW5, but it could be a custom design, or 30 ALUs per SPU; that seems like an odd number, but it would give Wii U 240 ALUs and allow the more efficient design to directly compete with 8-year-old hardware.
 
It's OK, this thread is moving fast again and that was posted this morning. It's just an interesting point, and it shows that BG's assessment of the GPU (256 ALUs) is just as grounded as anyone else's. Of course that would mean they aren't VLIW5, but it could be a custom design, or 30 ALUs per SPU; that seems like an odd number, but it would give Wii U 240 ALUs and allow the more efficient design to directly compete with 8-year-old hardware.
The only thing that really goes against it flat out is: why would they move from VLIW5 yet stay at DX10.1? Makes no sense...
 
Do we know enough about the API to say what features are missing from being DX11 equivalent?

Only USC-fan knows these things. BTW, USC-fan, did you find that efficiency quote? As you've said, maybe you are wrong, but if you are right, it could mean that Wii U is 160 ALUs, even VLIW5, and that would be fine; it would still outperform Xenos handily IF that is the case.
BTW, VLIW5 is also DX11; it's about the shaders used, and we know it has features beyond DX10.1, so who knows if this is even a knock against it. (Hope you didn't mind the jab, USC-fan, I just had to point out that your point is factless.)
 
Firstly, it's all unverifiable speculation.

Second, the point you made about Xenos not utilizing all of its shaders is obviously correct, but VLIW5 constantly ignores 1/5 of its shaders, which means that while Xenos might only use 200 shaders from time to time, Wii U is only using 128. And while VLIW5 shaders are obviously more capable, you are suggesting they are basically 100% more capable. Sure, the clock is 10% higher, and efficiency could maybe make up for another 30% of that count, but it's still far short of matching Xenos. Let's not pretend VLIW5 has no latency issues; those 128 shaders don't stay busy all the time, and there is ~100 cycle latency in R700. So yes, if it's 160 ALUs, it is very different from my old HD 4870 with 4/5ths of its shaders removed.

For 160 shaders to match Xenos, Xenos would have to only utilize about 180 of its 240 ALUs for the majority of the frames it draws, and that only allows Wii U to match a crippled Xenos, not exceed it, as we have heard from developers that it does. It's also funny that the Tekken developers have said that Wii U's GPU is 1.5x Xenos. Obviously these numbers don't mean much, but assuming they are directly comparing capabilities, it points to 256 ALUs in Wii U, not 160.



Simply because they expect those parts to be binned, even though they are produced in the hundreds of thousands for everything from point-of-sale machines to casino machines. Embedded designs similar to these are used in arcade machines as well. Still, it's taboo to use them in serious discussions because they might seem optimistic.

So basically you are saying Xenos' configuration is a better setup than VLIW5, even though VLIW5 came after and is obviously more advanced? What makes you think Xenos has it topped in efficiency when its shader architecture is very similar yet less advanced? You are just confusing people by throwing all these shader numbers and percentages around in your posts. It is clearly apparent that you have a pretty cursory understanding of these things, as you are making broad generalizations and oversimplifying what is an extremely complicated topic in hardware engineering and game engine programming. That's OK, but don't speak as if you are any sort of authority on the matter. I can almost guarantee I've read most of the same articles you have concerning VLIW5/VLIW4/GCN, but I acknowledge the limits of my expertise. You ought to try doing the same.

You also really believe that AMD employed an unheard-of architecture, one that you essentially designed, over a known design that developers are familiar with? The whole thing reeks of grasping at straws, I dare say because we were getting closer to the truth and you didn't like the sound of it. It just couldn't be true, so now we are latching on to anything that is remotely possible and claiming it more likely. The same thing happened when Arkam came around and was subsequently run out of town, yet he was ultimately vindicated.
 
The reason the mobile parts are able to get this low is that the parts are binned. So the top 10% or whatever number of chips can hit this power usage; the other chips are sent to desktop or lower-clocked parts.

This is why mobile chips have such low power usage. To make this work for a mass-market product, you would have to throw away most of the chips you made.

This is correct, folks. And AMD's embedded MCMs are based on the mobile chips. Probably the same chips just hooked up to an MCM rather than a mobile card.

I'm looking for proof in the pudding. If you're going to say, based on these things, that it's 160 ALUs, then those 160 ALUs also have to be able to do what is being accomplished. It's saying if X = 2 and Z = 2, then X has to equal Z.

Building a strawman argument is misrepresenting somebody else's in order to counter it with one of your own. That's exactly what has been going on. I never said that either Latte or Xenos was being maxed out by NFS. Anyone who claims that any game is using this % of the shaders or that % of the shaders is talking out of their ass. It's far more complicated than that, and nobody is qualified to say anything on the matter except the devs. What I did say was that Criterion could probably get better efficiency out of Latte than they did Xenos. This is due to the more modern shaders, higher clock, more RAM, and fast eDRAM.
 
So basically you are saying Xenos' configuration is a better setup than VLIW5, even though VLIW5 came after and is obviously more advanced? What makes you think Xenos has it topped in efficiency when its shader architecture is very similar yet less advanced? You are just confusing people by throwing all these shader numbers and percentages around in your posts. It is clearly apparent that you have a pretty cursory understanding of these things, as you are making broad generalizations and oversimplifying what is an extremely complicated topic in hardware engineering and game engine programming. That's OK, but don't speak as if you are any sort of authority on the matter. I can almost guarantee I've read most of the same articles you have concerning VLIW5/VLIW4/GCN, but I acknowledge the limits of my expertise. You ought to try doing the same.

You also really believe that AMD employed an unheard-of architecture, one that you essentially designed, over a known design that developers are familiar with? The whole thing reeks of grasping at straws, I dare say because we were getting closer to the truth and you didn't like the sound of it. It just couldn't be true, so now we are latching on to anything that is remotely possible and claiming it more likely. The same thing happened when Arkam came around and was subsequently run out of town, yet he was ultimately vindicated.

7 and a half years of experience with Xenos + coding to the metal + the architecture doesn't have THAT limitation.

I don't really care what the reality of the Wii U is; I've tried to grow up from the whole fan-bashing BS that gets passed around on message boards. My reasoning for the Wii U not having 160 shaders is that that many shaders with VLIW5's setup just doesn't have enough efficiency to overcome the 50% more shaders Xenos has, certainly not to the level above Xenos that we often hear about from developers.

What is your point in even attacking me? Debates often turn this way when one side is losing ground and feels the need to slander the other in order to discredit anything that person says.

As to the points I've mentioned: Xenos would have to be down at about 180-ALU efficiency in order for 160 ALUs in the Wii U to make sense. Given 4 or 5 years with the architecture, I'm sure VLIW5 could maybe get to 90% efficiency, meaning developers constantly hit 4.5 slots instead of the 3.4 they did on PC, but that is unlikely to be found out of the box and would still leave Wii U at ~144 ALUs on average. Playing catch-up to Xenos hardly makes sense with what we have seen and know about the poor development tools and unfinished hardware. Nothing I am saying is really that deep or technical, it's all surface explanations more or less, and I'm a security guard for a living, not an engineer, but I didn't just learn about VLIW yesterday; I've been studying it since 2008.

Building a strawman argument is misrepresenting somebody else's in order to counter it with one of your own. That's exactly what has been going on. I never said that either Latte or Xenos was being maxed out by NFS. Anyone who claims that any game is using this % of the shaders or that % of the shaders is talking out of their ass. It's far more complicated than that, and nobody is qualified to say anything on the matter except the devs. What I did say was that Criterion could probably get better efficiency out of Latte than they did Xenos. This is due to the more modern shaders, higher clock, more RAM, and fast eDRAM.

The problem with this is the amount of efficiency they would have to achieve out of the box with no prior experience with the hardware, poor tools, and unfinished SDKs, compared to their 7+ years of experience with Xenos and the obvious limitations of VLIW5. It's not so much that the Xenos architecture is better, it's that it has more to work with. We aren't saying Xenos has 160 ALUs and Wii U has 160 ALUs; we are saying Xenos is a V6 and Wii U is a V4. There are limitations in that alone, not to mention that Xenos is deeply understood.
 
7 and a half years of experience with Xenos + coding to the metal + the architecture doesn't have THAT limitation.

I don't really care what the reality of the Wii U is; I've tried to grow up from the whole fan-bashing BS that gets passed around on message boards. My reasoning for the Wii U not having 160 shaders is that that many shaders with VLIW5's setup just doesn't have enough efficiency to overcome the 50% more shaders Xenos has, certainly not to the level above Xenos that we often hear about from developers.

What is your point in even attacking me? Debates often turn this way when one side is losing ground and feels the need to slander the other in order to discredit anything that person says.

As to the points I've mentioned: Xenos would have to be down at about 180-ALU efficiency in order for 160 ALUs in the Wii U to make sense. Given 4 or 5 years with the architecture, I'm sure VLIW5 could maybe get to 90% efficiency, meaning developers constantly hit 4.5 slots instead of the 3.4 they did on PC, but that is unlikely to be found out of the box and would still leave Wii U at ~144 ALUs on average. Playing catch-up to Xenos hardly makes sense with what we have seen and know about the poor development tools and unfinished hardware. Nothing I am saying is really that deep or technical, it's all surface explanations more or less, and I'm a security guard for a living, not an engineer, but I didn't just learn about VLIW yesterday; I've been studying it since 2008.

Does 1 Xenos shader = 1 Latte shader in terms of performance? I would find that hard to believe. It's not impossible that there are more ALUs than meet the eye, though.

What the previous poster said about density and hand layouts makes sense. We are operating on incomplete information and trying to deduce from there. No doubt we could be making calculation errors that make our other math inaccurate.
 
So basically you are saying Xenos' configuration is a better setup than VLIW5, even though VLIW5 came after and is obviously more advanced? What makes you think Xenos has it topped in efficiency when its shader architecture is very similar yet less advanced? You are just confusing people by throwing all these shader numbers and percentages around in your posts. It is clearly apparent that you have a pretty cursory understanding of these things, as you are making broad generalizations and oversimplifying what is an extremely complicated topic in hardware engineering and game engine programming. That's OK, but don't speak as if you are any sort of authority on the matter. I can almost guarantee I've read most of the same articles you have concerning VLIW5/VLIW4/GCN, but I acknowledge the limits of my expertise. You ought to try doing the same.

You also really believe that AMD employed an unheard-of architecture, one that you essentially designed, over a known design that developers are familiar with? The whole thing reeks of grasping at straws, I dare say because we were getting closer to the truth and you didn't like the sound of it. It just couldn't be true, so now we are latching on to anything that is remotely possible and claiming it more likely. The same thing happened when Arkam came around and was subsequently run out of town, yet he was ultimately vindicated.
I'm sorry to call you on this, but you are equally as adamant as he is. This is a discussion; he's allowed (and so are you) to throw theories around and debate them. Why are you getting so defensive all of a sudden? You haven't really written anything new or debated, just belittled, but you used to contribute loads.
The bolded parts above are really pointless sentences that don't contribute much, or go faaaaaar out of their way to make a point.
Edit: any comments on this, PM me, since I don't want to take up any more space in this thread.
 
7 and a half years of experience with Xenos + coding to the metal + the architecture doesn't have THAT limitation.

I don't really care what the reality of the Wii U is; I've tried to grow up from the whole fan-bashing BS that gets passed around on message boards. My reasoning for the Wii U not having 160 shaders is that that many shaders with VLIW5's setup just doesn't have enough efficiency to overcome the 50% more shaders Xenos has, certainly not to the level above Xenos that we often hear about from developers.

What is your point in even attacking me? Debates often turn this way when one side is losing ground and feels the need to slander the other in order to discredit anything that person says.

As to the points I've mentioned: Xenos would have to be down at about 180-ALU efficiency in order for 160 ALUs in the Wii U to make sense. Given 4 or 5 years with the architecture, I'm sure VLIW5 could maybe get to 90% efficiency, meaning developers constantly hit 4.5 slots instead of the 3.4 they did on PC, but that is unlikely to be found out of the box and would still leave Wii U at ~144 ALUs on average. Playing catch-up to Xenos hardly makes sense with what we have seen and know about the poor development tools and unfinished hardware. Nothing I am saying is really that deep or technical, it's all surface explanations more or less, and I'm a security guard for a living, not an engineer, but I didn't just learn about VLIW yesterday; I've been studying it since 2008.



The problem with this is the amount of efficiency they would have to achieve out of the box with no prior experience with the hardware, poor tools, and unfinished SDKs, compared to their 7+ years of experience with Xenos and the obvious limitations of VLIW5. It's not so much that the Xenos architecture is better, it's that it has more to work with. We aren't saying Xenos has 160 ALUs and Wii U has 160 ALUs; we are saying Xenos is a V6 and Wii U is a V4. There are limitations in that alone, not to mention that Xenos is deeply understood.

First of all, very few games are "coded to the metal" these days. Everything goes through an API, whether on console or PC. PCs may have somewhat more overhead, but that's pretty much the extent of the difference in most cases (devs like Naughty Dog and Factor 5 have gone a bit further). And you're still doing it by saying these things are however much "shader efficient." Why did you not acknowledge what I wrote regarding PC games being programmed to work on a variety of cards from both AMD and Nvidia (with Nvidia being the dominant force in that space, it bears noting)? Of course PC cards are going to end up being less efficient. But you are basically saying developers aren't taking the proposed VLIW5 of Latte into account when programming? "We are only using 128 shaders at a time, but whatever." Similarly, where are you getting the idea that Xenos wouldn't have the same limitation as a VLIW5 part if both were given equal effort from developers?

I've tried to stay away from personal attacks, but you have practically hijacked the thread with a bunch of pseudoscience, and it's a bit aggravating because the level of discussion has taken a nosedive. It's sad that there are probably some people out there who could help clarify things further (Marcan has provided a lot of information that I never would have guessed), but they have either moved on or never had any interest in the first place. What that says about us, well, I won't go there.

I actually have more tidbits that might aid our analysis, but the waters have been so muddied at this point, I wonder if it's even worth it.
 
Does 1 Xenos shader = 1 Latte shader in terms of performance? I would find that hard to believe. It's not impossible that there are more ALUs than meet the eye, though.

What the previous poster said about density and hand layouts makes sense. We are operating on incomplete information and trying to deduce from there. No doubt we could be making calculation errors that make our other math inaccurate.

Which is why I never try to be exact with this; there are too many unknowns to speak in absolutes. Even my ALU counts are not set in stone; in fact I've pointed to 160, 240, 256 and 320 all as possibilities. Some posters here are more sure about it than they should be, and the reality is they are basing their conclusions on as many assumptions as the rest of us. While I am grateful for Fourth Storm's analysis of the die, he isn't the authority on this and has said himself that he doesn't understand VLIW well enough to compare it to Xenos properly. BGassassin disagrees with him, stating the minimum ALU count is 256. My take is that it could be lower but would need to be custom. This isn't some fanboy conclusion; if it were, I would never "allow" 160 ALUs to even be suggested, just as some refuse to allow 320 ALUs to be.

So no, 1 Latte shader should not equal 1 Xenos shader, but 1 Latte shader has to equal 1.5 Xenos shaders just to put it on the same level. Except out of the box, Latte won't be using its 5th shader often; it was purposely set aside in VLIW5 to be used as an FPU, and it's a very wide architecture, especially when Xenos and PCs typically only use 4 or fewer shaders in a given wavefront (thus VLIW4 was created). This means Xenos is better at handling Xenos code than Wii U would be, often limiting Wii U's architecture to only 128 shaders and making the comparison even worse. Of course both architectures have idling shaders, but Xenos being deeply understood and being the lead platform would likely make it as efficient as or better than GPU7.
 
I actually have more tidbits that might aid our analysis, but the waters have been so muddied at this point, I wonder if it's even worth it.

Well, to whatever extent the waters are muddied, new information is always welcome. Your analysis is the best we've seen on this forum, so please don't be discouraged.
 
First of all, very few games are "coded to the metal" these days. Everything goes through an API, whether on console or PC. PCs may have somewhat more overhead, but that's pretty much the extent of the difference in most cases (devs like Naughty Dog and Factor 5 have gone a bit further). And you're still doing it by saying these things are however much "shader efficient." Why did you not acknowledge what I wrote regarding PC games being programmed to work on a variety of cards from both AMD and Nvidia (with Nvidia being the dominant force in that space, it bears noting)? Of course PC cards are going to end up being less efficient. But you are basically saying developers aren't taking the proposed VLIW5 of Latte into account when programming? "We are only using 128 shaders at a time, but whatever." Similarly, where are you getting the idea that Xenos wouldn't have the same limitation as a VLIW5 part if both were given equal effort from developers?

I've tried to stay away from personal attacks, but you have practically hijacked the thread with a bunch of pseudoscience, and it's a bit aggravating because the level of discussion has taken a nosedive. It's sad that there are probably some people out there who could help clarify things further (Marcan has provided a lot of information that I never would have guessed), but they have either moved on or never had any interest in the first place. What that says about us, well, I won't go there.

I actually have more tidbits that might aid our analysis, but the waters have been so muddied at this point, I wonder if it's even worth it.

Well, I'm interested, but that doesn't mean much since I don't really have anything to contribute anyway. I just find it all very interesting. Hopefully there are similar breakdowns for the other two upcoming consoles, but given how much is already out there I doubt it.

And it is unfortunate where this topic has gone, I doubt the level of discussion will ever really pick back up.
 
First of all, very few games are "coded to the metal" these days. Everything goes through an API, whether on console or PC. PCs may have somewhat more overhead, but that's pretty much the extent of the difference in most cases (devs like Naughty Dog and Factor 5 have gone a bit further). And you're still doing it by saying these things are however much "shader efficient." Why did you not acknowledge what I wrote regarding PC games being programmed to work on a variety of cards from both AMD and Nvidia (with Nvidia being the dominant force in that space, it bears noting)? Of course PC cards are going to end up being less efficient. But you are basically saying developers aren't taking the proposed VLIW5 of Latte into account when programming? "We are only using 128 shaders at a time, but whatever." Similarly, where are you getting the idea that Xenos wouldn't have the same limitation as a VLIW5 part if both were given equal effort from developers?

I've tried to stay away from personal attacks, but you have practically hijacked the thread with a bunch of pseudoscience, and it's a bit aggravating because the level of discussion has taken a nosedive. It's sad that there are probably some people out there who could help clarify things further (Marcan has provided a lot of information that I never would have guessed), but they have either moved on or never had any interest in the first place. What that says about us, well, I won't go there.

I actually have more tidbits that might aid our analysis, but the waters have been so muddied at this point, I wonder if it's even worth it.

I think you know what I meant by coding to the metal; it is quite well known that Xenos has far less overhead than PCs, since DX11 has to be compatible with all cards. Your point about developers having put the same effort forward is simply impossible because of unfinished hardware and poor tool sets. As to your point about this thread taking a nosedive, I'd say it's a lot better than a few days ago, when people were talking about the possibility of a 3.24 GHz clock on Espresso via update.

What you know, what you don't know, what you assume: we have a few facts and a ton of assumptions. What makes your analysis of the Wii U die more factual than BG's? What makes your assumption of performance from 160 ALUs more of an expert opinion than BG's 256 ALUs? You point at me, but I'm not the only one who doesn't take your analysis of the Wii U GPU as fact. This thread isn't about one person giving the answers; it is about trying to figure out as a group what Wii U's GPU is.
 
First of all, very few games are "coded to the metal" these days. Everything goes through an API, whether on console or PC. PCs may have somewhat more overhead, but that's pretty much the extent of the difference in most cases (devs like Naughty Dog and Factor 5 have gone a bit further). And you're still doing it by saying these things are however much "shader efficient." Why did you not acknowledge what I wrote regarding PC games being programmed to work on a variety of cards from both AMD and Nvidia (with Nvidia being the dominant force in that space, it bears noting)? Of course PC cards are going to end up being less efficient. But you are basically saying developers aren't taking the proposed VLIW5 of Latte into account when programming? "We are only using 128 shaders at a time, but whatever." Similarly, where are you getting the idea that Xenos wouldn't have the same limitation as a VLIW5 part if both were given equal effort from developers?

I've tried to stay away from personal attacks, but you have practically hijacked the thread with a bunch of pseudoscience, and it's a bit aggravating because the level of discussion has taken a nosedive. It's sad that there are probably some people out there who could help clarify things further (Marcan has provided a lot of information that I never would have guessed), but they have either moved on or never had any interest in the first place. What that says about us, well, I won't go there.

I actually have more tidbits that might aid our analysis, but the waters have been so muddied at this point, I wonder if it's even worth it.

If you have information to share, it's always welcome. Some of the best discussion comes from disagreement, so please don't feel put out just because things aren't harmonious.
 
Not sure if this helps, but looking over Wikipedia on the Radeon families of GPUs, the 4000 series did not support Dolby Digital, only LPCM and DTS; the 5000 family does. Wii U only puts out LPCM, which is what the 4000 family puts out. If this makes any difference or narrows down what we're looking at, great. If this is already a known fact, shun me :)
 
Which is why I never try to be exact with this; there are too many unknowns to speak in absolutes. Even my ALU counts are not set in stone; in fact I've pointed to 160, 240, 256 and 320 all as possibilities. Some posters here are more sure about it than they should be, and the reality is they are basing their conclusions on as many assumptions as the rest of us. While I am grateful for Fourth Storm's analysis of the die, he isn't the authority on this and has said himself that he doesn't understand VLIW well enough to compare it to Xenos properly. BGassassin disagrees with him, stating the minimum ALU count is 256. My take is that it could be lower but would need to be custom. This isn't some fanboy conclusion; if it were, I would never "allow" 160 ALUs to even be suggested, just as some refuse to allow 320 ALUs to be.

So no, 1 Latte shader should not equal 1 Xenos shader, but 1 Latte shader has to equal 1.5 Xenos shaders just to put it on the same level. Except out of the box, Latte won't be using its 5th shader often; it was purposely set aside in VLIW5 to be used as an FPU, and it's a very wide architecture, especially when Xenos and PCs typically only use 4 or fewer shaders in a given wavefront (thus VLIW4 was created). This means Xenos is better at handling Xenos code than Wii U would be, often limiting Wii U's architecture to only 128 shaders and making the comparison even worse. Of course both architectures have idling shaders, but Xenos being deeply understood and being the lead platform would likely make it as efficient as or better than GPU7.

This thread is getting derailed and the old adage about arguing on the internet keeps popping into my mind. Let's just agree to disagree and one day when these things are finally leaked, one of us will buy the other a virtual beer.

Well, to whatever extent the waters are muddied, new information is always welcome. Your analysis is the best we've seen on this forum, so please don't be discouraged.

Thanks, dude. BTW, I just dug up this old quote from CVG. I know, "anonymous devs" and all, but with NDAs in effect, it's really the only way to speak out unless you're like Harada and like to play with fire. :P

"It's not actually a problem getting things up and running because the architecture is pretty conventional, but there are constraints with stuff like physics and AI processing because the hardware isn't quite as capable."

So that's that dude, Marcan, and the leaked features sheet all pointing to a pretty conventional Radeon architecture. There may even be more that I'm forgetting.
 
If you have information to share, it's always welcome. Some of the best discussion comes from disagreement, so please don't feel put out just because things aren't harmonious.

Fourth Storm: Right, it isn't like I don't like your analysis; for a while I even believed it, which is why I figured it had to be custom, considering VLIW5's limitations put it on similar footing with Xenos' limitations.

And it isn't like I haven't brought up any valid points, which is the main reason I am defending my posts: they have merit. I'm not an engineer, but I pointed out that 160 shaders isn't 160 shaders all the time, something I feel is being overlooked, which stops discussion in this thread and leaves us with whatever conclusions the last person who tried has come to.
 
Not sure if this helps, but looking over Wikipedia on the Radeon families of GPUs, the 4000 series did not support Dolby Digital, only LPCM and DTS; the 5000 family does. Wii U only puts out LPCM, which is what the 4000 family puts out. If this makes any difference or narrows down what we're looking at, great. If this is already a known fact, shun me :)

Nintendo uses different audio hardware in there, not AMD's.
 
Nintendo uses different audio hardware in there, not AMD's.

You say that like Nintendo made the chip in their own chip factory. Whatever parts are in it and wherever they are from, AMD and Renesas put them there. So you can't really say it's not AMD's, since they are the ones manufacturing the chip.

Do we know exactly who manufactured the DSP?
 
This thread is getting derailed and the old adage about arguing on the internet keeps popping into my mind. Let's just agree to disagree and one day when these things are finally leaked, one of us will buy the other a virtual beer.

So that's that dude, Marcan, and the leaked features sheet all pointing to a pretty conventional Radeon architecture. There may even be more that I'm forgetting.

It's not off topic; we are discussing Wii U's GPU and the possibility of the architecture and ALU count. I don't mind disagreeing with you; it's not that I think you are outright wrong, just that your conclusion might be.

As for that quote, yeah, what I'm suggesting would still be VLIW and thus conventional. The only real difference is that work wouldn't be left hanging around waiting to be scheduled, and the cycle latency would drop drastically to better match Wii U's setup. R700 has a cycle latency that is ridiculously high; Cayman, which was much improved, is 44 cycles and is still extremely nasty to work with compared to GCN, largely because there aren't enough schedulers close to the hardware.
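As a rough illustration of what those latency figures mean in practice: a SIMD hides ALU latency by keeping several wavefronts resident, so the cycle count mostly translates into how much independent work the scheduler has to find. The 4-cycle issue rate below is an assumption for R700-style SIMDs, not a measured Latte value.

[code]
# Crude latency-hiding estimate. 100 cycles is the R700 figure mentioned
# above, 44 is the Cayman figure, 16 is a hypothetical lower-latency design.
# The 4-cycle-per-wavefront issue rate is an assumption, not a measured value.
import math

def wavefronts_needed(latency_cycles, issue_cycles_per_wavefront=4):
    """Wavefronts a SIMD must keep resident to cover a given latency."""
    return math.ceil(latency_cycles / issue_cycles_per_wavefront)

for latency in (16, 44, 100):
    print(f"{latency:3d}-cycle latency -> ~{wavefronts_needed(latency)} wavefronts in flight")

# Fewer cycles of latency means less independent work has to be resident to
# keep the ALUs fed, which is the practical benefit described above.
[/code]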

You say that like Nintendo made the chip in their own chip factory. Whatever parts are in it and wherever they are from, AMD and Renesas put them there. So you can't really say it's not AMD's, since they are manufacturing and shipping the chip.

Actually, IIRC the manufacturer of the DSP is known; they wrote a document, which is the reason we know it is clocked at 121.5 MHz (again, IIRC).
 
This thread is getting derailed and the old adage about arguing on the internet keeps popping into my mind. Let's just agree to disagree and one day when these things are finally leaked, one of us will buy the other a virtual beer.



Thanks, dude. BTW, I just dug up this old quote from CVG. I know, "anonymous devs" and all, but with NDAs in effect, it's really the only way to speak out unless you're like Harada and like to play with fire. :P



So that's that dude, Marcan, and the leaked features sheet all pointing to a pretty conventional Radeon architecture. There may even be more that I'm forgetting.

I remember that CVG anonymous devs quote. Feel free to correct me, but isn't AI generally handled with general-purpose code, not floating point? In that case Espresso should outperform Xenon at it, which makes that quote a bit fishy.
 
http://www.techpowerup.com/reviews/Sapphire/HD_6450_Passive/9.html

This was brought up in the Beyond3D thread by function. I initially rejected the comparison, but now I believe it to be somewhat valid. The green bar represents a card with a 160:8:4 core layout. It runs at 625 MHz (a bit higher than Wii U), but it is also pushing 44.1 fps in the lowest configuration (1024x768). Mind you, Wii U is running BLOPS II with around that same fps average, but at an 880x720 resolution. Now, of course this is not a rock-solid comparison since it's CoD4, and while you could say that BLOPS II pushes the engine more, the point could also be made that they've refined their engine over time to handle it. In fact, the 360 hits 60 fps for respectable amounts of time in both games. Also keep in mind that the card in question has only a gig of DDR3 at 10.67 GB/s, no eDRAM, and fewer ROPs than proposed for Latte.
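For a rough sense of the resolution side of that comparison (frame rates as quoted; pixel counts only, ignoring everything else that differs between the two games):

[code]
# Rough pixels-per-second comparison behind the HD 6450 / Wii U point above.
# Frame rates are the ones quoted; engine age, scene load and API overhead
# obviously differ between the two games.

def pixel_rate(width, height, fps):
    return width * height * fps

hd6450_cod4 = pixel_rate(1024, 768, 44.1)  # HD 6450 running CoD4
wiiu_blops2 = pixel_rate(880, 720, 44.0)   # Wii U running BLOPS II (approx. average)

print(f"HD 6450 / CoD4    : {hd6450_cod4 / 1e6:.1f} Mpix/s")
print(f"Wii U   / BLOPS II: {wiiu_blops2 / 1e6:.1f} Mpix/s")

# ~34.7 vs ~27.9 Mpix/s - a similar ballpark, which is why the comparison gets
# read as consistent with a 160:8:x part, and also why it is easy to pick apart.
[/code]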
 
I remember that CVG anonymous devs quote. Feel free to correct me, but isn't AI generally handled with general-purpose code, not floating point? In that case Espresso should outperform Xenon at it, which makes that quote a bit fishy.

Probably on early kits too, which were hot garbage. I'm sure it's improved since the quote.
 
http://www.techpowerup.com/reviews/Sapphire/HD_6450_Passive/9.html

This was brought up in the Beyond3D thread by function. I initially rejected the comparison, but now I believe it to be somewhat valid. The green bar represents a card with a 160:8:4 core layout. It runs at 625 MHz (a bit higher than Wii U), but it is also pushing 44.1 fps in the lowest configuration (1024x768). Mind you, Wii U is running BLOPS II with around that same fps average, but at an 880x720 resolution. Now, of course this is not a rock-solid comparison since it's CoD4, and while you could say that BLOPS II pushes the engine more, the point could also be made that they've refined their engine over time to handle it. In fact, the 360 hits 60 fps for respectable amounts of time in both games. Also keep in mind that the card in question has only a gig of DDR3 at 10.67 GB/s, no eDRAM, and fewer ROPs than proposed for Latte.

Plus it has the disadvantages of not being on die with the processor, and going through the presumably less efficient DirectX API.
 
Plus it has the disadvantages of not being on die with the processor, and going through the presumably less efficient DirectX API.

Correct. Now, allow me to draw your attention to the Llano die shot in the OP. One huge reason why I think there are 160 shaders is that in my analysis there are 8 TMUs, and a 320:8:8 configuration would be lopsided and ridiculous.

http://images.anandtech.com/reviews/cpu/amd/llano/review/desktop/49142A_LlanoDie_StraightBlack.jpg

Notice the rectangular blocks in the top middle. Those are the L1 texture caches. Now look at the T blocks and S blocks on Latte. A cache is comprised of large pools of SRAM to store the data and other tiny pools of SRAM to function as tags. The left side of the L1 caches on Llano looks remarkably like what we see in the S blocks of Latte - a pool of 8 kB and one tag to go along with it. The small SRAM blocks on the right side of the Llano caches seem to have been moved into the T blocks (texture units) on Latte. Both have exactly 16. Coincidence or smoking gun? I'm thinking they probably have something to do with data transfer between the TMUs and L1 cache; thus they appear in slightly different locations in the slightly different architectures.
 
Plus it has the disadvantages of not being on die with the processor, and going through the presumably less efficient DirectX API.
It's an apples to oranges comparison. Other games are listed in that review; CoD4 is cherry-picked to show a comparison to Black Ops 2, and it's the most unscientific thing I've seen (I saw it first on Beyond3D). We could compare Crysis 2 or Far Cry 2 from there and find that it doesn't handle those games as well as Xenos. But instead of a more similar comparison, which would still be matched up against a faster CPU and a higher clock speed, we are given CoD4, a game released in 2007, against one released 6 months ago, and we're expected to assume it isn't taxing the system differently. Check benchmarks from CoD4 through Black Ops and you'll find performance does get worse.
 
Hello! I certainly welcome other reasonable takes on the Wii U's innards, even if they come from evil globo-mega-corp employees! So, thanks for running my post by those folks. While I agree that we must be careful when drawing any hard conclusions due to the high amount of variation in chip layouts, I don't agree that it's a useless endeavor. No, we can't say we know everything for sure, but we can make some damn good guesses. For instance, it's pretty much certain that what we've identified as shader blocks are, in fact, just that.
Absolutely, not disputing that. I should have made it clearer that I was not disagreeing with the identification of functional blocks based on comparison with other images. Similarly I'm not in a position to disagree with your reasoning on 160 based on the results we've seen from various games. I am making no attempt to argue one way or another as to how many shaders there are or the relative efficiency of modern GPU architectures. Any knowledge I once had in this domain is long gone and dates from a time when men were men and men wrote their own shaders using nothing but a lump of coal and a stick. Or ARB assembly as it was also known.

I agree that a combination of the parameter ranges (ALU count, power draw, etc.) and real-world results is as good as we can get until someone leaks. Thought I'd try to help narrow the parameter-range side of the equation by diverting the energy of globo-mega-corp employees from world crushing to the great Wii U hardware debate! Unfortunately they weren't much use at the narrowing bit...

On power consumption, this is again influenced by the layout but can vary dramatically within the same node size depending on the fab process. The Wii U also has the advantage of working to a fixed (relatively low) clock. Most fabs have low power and high performance processes for a given lambda. Do we know the fab and core voltage of the GPU?
 
Is BG the guy with no tech knowledge but has info and analysis from his insider friend?

Yeah, that turned out to be so wrong that he quit posting...

To be fair, I was wayyy off too... From Sept 14, 2012:
The best case was 550 or so, but that power consumption is even too high, and that used the latest and greatest from AMD. 350-500 GFLOPS would be the correct range, likely around 400-450. And with GPU design improvements the gap will be even greater with the next-gen systems, seeing as the Wii U uses a 2008 design compared to a 2013 design in the next-gen systems, going by the leaks.
http://67.227.255.239/forum/showpost.php?p=42128123&postcount=2747

From January of this year:

"Wuu would be lucky to hit over 350 GFLOPS. The 40 nm 5550 uses 39 watts and is at 352 GFLOPS."

http://67.227.255.239/forum/showpost.php?p=46777838&postcount=1746

Wasn't too far off. I still think we are looking at a 160 ALU chip or something in between. It's still funny thinking about how hard I got flamed in these threads. Too funny now...
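For anyone following along, the GFLOPS figures in those old posts come straight from the standard formula: 2 FLOPs (one MADD) per ALU per clock. The Latte rows below use the assumed 550 MHz clock and the ALU counts under debate, so treat them as speculation.

[code]
# GFLOPS = ALUs x 2 (one MADD per clock) x clock in GHz. The 352 GFLOPS
# quoted for the HD 5550 falls straight out of this; the Latte rows are
# based on the assumed 550 MHz clock and the disputed ALU counts.

def gflops(alus, clock_mhz):
    return alus * 2 * clock_mhz / 1000.0

print(f"HD 5550 (320 ALU @ 550 MHz): {gflops(320, 550):.0f} GFLOPS")
print(f"Xenos   (240 ALU @ 500 MHz): {gflops(240, 500):.0f} GFLOPS")
print(f"Latte?  (160 ALU @ 550 MHz): {gflops(160, 550):.0f} GFLOPS")
print(f"Latte?  (320 ALU @ 550 MHz): {gflops(320, 550):.0f} GFLOPS")

# 352 / 240 / 176 / 352 - which is where the ~350-500 GFLOPS and 160-ALU
# (~176 GFLOPS) estimates in this thread get their headline numbers.
[/code]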
 
Here's a quote from Marcan's last post:

Marcan said:
Even on the Wii U itself, the gamepad is managed by an independent Broadcom SoC that has its own firmware and communicates with the rest of the system via bog-standard USB and one of the video output heads on the Radeon.

http://fail0verflow.com/blog/2013/espresso.html

Hmm, so Latte has multiple video output heads on it (presumably 2 - one for the TV and one for the Gamepad). Where could they be? How about the Q blocks? Odd location perhaps, but they look remarkably like the two small and identical blocks on the very left of the Llano die, which coincidentally also supports 2 video outs (unlike its successors, which feature true Eyefinity support).
 
Here's a quote from Marcan's last post:



http://fail0verflow.com/blog/2013/espresso.html

Hmm, so Latte has multiple video output heads on it (presumably 2 - one for the TV and one for the Gamepad). Where could they be? How about the Q blocks? Odd location perhaps, but they look remarkably like the two small and identical blocks on the very left of the Llano die, which coincidentally also supports 2 video outs (unlike its successors, which feature true Eyefinity support).

Shouldn't there be 3 though, seeing as 2 gamepads are supported?
 
Yeah, that turned out to be so wrong that he quit posting...

He came back, but admitted he had no engineering or tech background to speak of and that he was just reiterating stuff from his friend, who may or may not have an engineering background themselves.

Which is why I'm confused as to why, after he's been factually wrong so many times, he's still being held in high regard. Is it because he writes huge posts, and too often people just believe whatever little point he bolded (regardless of the unlikelihood of the rest of the material)?

I stopped going to these threads, mostly because I was chased out by witch hunters who refused to be realistic in their expectations. Fourth Storm is pretty realistic, and pretty much started these 'investigations' into the hardware, so it's almost hilarious that some of the rambling users have snapped back and begun to reject him, also.

Shouldn't there be 3 though, seeing as 2 gamepads are supported?

Couldn't they halve the framerate and send alternating frames to each controller?

Is it confirmed that the Wi-Fi unit can connect to multiple devices? I'm sure it can if it's very modern, but I forgot where to find the writeup on it.
 
Here's a quote from Marcan's last post:



http://fail0verflow.com/blog/2013/espresso.html

Hmm, so Latte has multiple video output heads on it (presumably 2 - one for the TV and one for the Gamepad). Where could they be? How about the Q blocks? Odd location perhaps, but they look remarkably like the two small and identical blocks on the very left of the Llano die, which coincidentally also supports 2 video outs (unlike its successors, which feature true Eyefinity support).
Doesn't Wii U support 3? 2 gamepads and all...
 