WiiU "Latte" GPU Die Photo - GPU Feature Set And Power Analysis

I don't see why the ISA should have an impact on that. 99.99% of the time you don't see assembly code anyway. And on all current consoles devs are used to PPC, not x86.

Yet all modern middleware and even publisher specific engines are being written for PC architecture. That's probably where Nintendo fucked up the most. I don't know if they expected developers to jump through hoops to try and port engines to their custom system architecture. I think they did, and recent comments by Iwata (admitting their failure to get 3rd parties on board) seem to reflect that.

Sony and MS seem to have taken a much more humble approach to this, by selecting stuff that was in line with trends in both hardware and software development pipelines.

The PS4 APU seems to be fairly custom, too. Of course not in the same way earlier Sony consoles were, but you can say the same about Wii U.

It's customized to their specific needs, but nothing major was changed about it. From what we know right now, they've been optimized more than anything to work as efficiently as possible in a closed architecture. That alone is a massive advantage over stock chips, while it still should keep costs relatively reasonable.
 
I don't remember complaining about it. All I said is that it makes no sense to switch from PPC to x86 unless you absolutely have to.

No. It made sense for Sony and MS to switch to x86 because PCs run on x86 and development for console games always occurs on PCs. Switching to x86 removed the extra step of developing on PC and then moving your code to a devkit based on another architecture.

It also made sense to switch because after Apple abandoned PPC for x86, IBM started targeting POWER exclusively towards server and HPC applications and consoles can't use CPUs which dissipate 100W+ from the die. So that pretty much ruled out anything based on POWER4 through POWER7. This is why they couldn't use Intel and AMD's desktop CPUs either. The PS4 and Nextbox are using APUs derived from AMD's mobile x86 CPU line, with smaller heat output intended for laptop use.

Sony could have used another Cell but it was such a boondoggle for Sony, Toshiba, and IBM which never got used in anything besides PS3 that it didn't make sense to develop it further. MS was at a dead end with Xenon. Because Cell and Xenon were already multi-core designs, simply stacking more of them together would also run into Amdahl's Law. Neither Sony nor MS wanted to simply duct-tape more Cells and Xenons together, so they had to move on from PPC. Nintendo on the other hand preferred to duct-tape some Broadways together to create Espresso. So far Nintendo's method is not paying off for them. We will see how the PS4 and Nextbox fare when they launch later this year.
 
No. It made sense for Sony and MS to switch to x86 because PCs run on x86 and development for console games always occurs on PCs. Switching to x86 removed the extra step of developing on PC and then moving your code to a devkit based on another architecture.
What you develop on is irrelevant, it's about what you develop for. Though granted, most games and middleware are multiplatform and developed on and for PC. Nice for 3rd parties, 1st parties don't care.

It also made sense to switch because after Apple abandoned PPC for x86, IBM started targeting POWER exclusively towards server and HPC applications and consoles can't use CPUs which dissipate 100W+ from the die. So that pretty much ruled out anything based on POWER4 through POWER7. This is why they couldn't use Intel and AMD's desktop CPUs either. The PS4 and Nextbox are using APUs derived from AMD's mobile x86 CPU line, with smaller heat output intended for laptop use.
Power isn't just POWER. There are also the 4xx and A2 lines and the Freescale e6500 series.
 
Nintendo could have kept just one PPC for BC and used it as an IO processor in native mode if they wanted to switch to x86.

Also, don't forget that Jaguar has a pipeline more than four times as deep as Espresso. Better OoO capabilities and branch prediction are not so much features as they are necessities.

That's true of course. I just wouldn't be too quick to assume Jaguar is less efficient than Espresso just because it's x86.

edit: I just noticed we're in the wrong thread here..
 
No. It made sense for Sony and MS to switch to x86 because PCs run on x86 and development for console games always occurs on PCs. Switching to x86 removed the extra step of developing on PC and then moving your code to a devkit based on another architecture.

It also made sense to switch because after Apple abandoned PPC for x86, IBM started targeting POWER exclusively towards server and HPC applications and consoles can't use CPUs which dissipate 100W+ from the die. So that pretty much ruled out anything based on POWER4 through POWER7. This is why they couldn't use Intel and AMD's desktop CPUs either. The PS4 and Nextbox are using APUs derived from AMD's mobile x86 CPU line, with smaller heat output intended for laptop use.

Sony could have used another Cell but it was such a boondoggle for Sony, Toshiba, and IBM which never got used in anything besides PS3 that it didn't make sense to develop it further. MS was at a dead end with Xenon. Because Cell and Xenon were already multi-core designs, simply stacking more of them together would also run into Amdahl's Law. Neither Sony nor MS wanted to simply duct-tape more Cells and Xenons together, so they had to move on from PPC. Nintendo on the other hand preferred to duct-tape some Broadways together to create Espresso. So far Nintendo's method is not paying off for them. We will see how the PS4 and Nextbox fare when they launch later this year.

In the end which CPU each of them uses is going to mean next to nothing wrt how they fare in the market.

Prices, features and games will be the deciding factors - oh, and marketing.
 
I'm not qualified to explain the intricacies of VLIW5 vs the Vec4+scalar config of Xenos, but I can volunteer a few simple explanations as to how a 160 shader Latte could get the results we see in ports. For one, shaders are very important, but they're not everything that goes into a visual. It's quite reasonable to say that third party cross-platform games are not exploiting all 240 shaders of Xenos to the fullest. Not every game looks like Gears of War. And what would happen if you try to do more than the shaders can handle anyway? Slower framerate - which we've seen in places. However, also keep in mind that Latte is hooked up to some high bandwidth/low latency eDRAM, which is read/write capable, so it's probably saving a bunch of clock cycles just by that alone.

So it could be the GPU hasn't come under direct fire, because it's actually performing above expectations given the numbers on paper. However, we actually have heard some comments about the GPU that don't paint it in a great light. That one Kotaku article likened its performance to DirectX9 (an odd way of putting it, but the point can be extracted that it's pretty much in line with current gen) and also lherre said way back that the GPU lacked horsepower despite being decent in terms of features.

The truth is, we are not going to hear many specific criticisms at all, because devs are under NDAs. There was the "not as many shaders...not as capable" comment by the anonymous dev that was written off as bs, but perhaps prematurely. The Metro devs probably spoke out because they made that choice that they are not interested in Wii U development. They didn't care about burning that bridge.

This is a strawman argument IMHO. There are effects that are directly related to the pure processing power of the shaders, things like reflections, which we can compare in a recent release. The NFS dev said they have the Wii U version reflecting way more of the world than the PS360 version. That's not going to be a boost from more RAM or higher bandwidth; that's going to be really processing heavy. And for a part with 1/3 fewer shaders to do significantly more on a 6-month-old system vs a 7 1/2-year-old system means, to me, those shaders either need to be doing significantly more or there are more than 160 shaders there.

To me, for your 160 shader theory to work, VLIW would have to be doing significantly more per shader than Vec4+scalar, or the shaders aren't VLIW.

The Kotaku article sounded like it came from someone who didn't know what they were talking about, though. Saying it performed like DX9 is commenting on feature set (which we have since learned is above DX9) and not performance.

Again, anonymous sources, and maybe that's why this isn't valid to base an argument on, IMHO. I'll give lherre credit in that the GPU could lack horsepower despite a good feature set, but is that lacking horsepower compared to the PS360 or vs the PS4/720?

I'm not trying to say the thing has 640 shaders or is a beast or anything like that. What I'm trying to say is that you have to go beyond "well it would be able to fit this many ALUs, so that's my theory". We now have to say: OK, if it's 160 ALUs, then they need to be doing X for them to be doing more reflections than the PS360 in NFS. If VLIW's jump over Vec4+scalar isn't equal to X, then we're either looking at more than 160 ALUs or we're looking at something other than VLIW, and trying to count shader cores by comparing to VLIW images is folly.

Some things you can definitely chalk up to faster memory or more memory, but some effects are still processing bound, and reflections are one of those. I'm not going to buy that Criterion got closer to the theoretical limit of a 6-month-old GPU than of a 7 1/2-year-old one they've been working with for a while.

Maybe I'm explaining myself wrong, but I see a glaring flaw in the 160 shader theory when something that is more processor bound than memory bound (i.e. reflections) is performing better on the Wii U. It's not like Criterion sacrificed some other effect to do more reflections; visually the Wii U release is a step up from the PS360 release.
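As a rough frame for the numbers being argued here, below is a back-of-envelope peak-throughput comparison. It assumes the clocks commonly cited in this thread (Latte at ~550 MHz, Xenos at 500 MHz) and counts one MAD (2 FLOPs) per ALU per cycle; real-world utilization on either chip would be lower, so treat it as framing, not proof.

```python
# Back-of-envelope peak shader throughput, assuming 1 MAD (2 FLOPs) per ALU per cycle.
# Clocks and ALU counts are the figures discussed in this thread, not confirmed specs.

def peak_gflops(alus: int, clock_mhz: float, flops_per_alu_per_cycle: int = 2) -> float:
    """Theoretical peak throughput in GFLOPS."""
    return alus * clock_mhz * flops_per_alu_per_cycle / 1000.0

candidates = {
    "Xenos (240 ALUs @ 500 MHz)":        (240, 500),
    "Latte, 160-ALU theory (@ 550 MHz)": (160, 550),
    "Latte, 256-ALU theory (@ 550 MHz)": (256, 550),
    "Latte, 320-ALU theory (@ 550 MHz)": (320, 550),
}

for name, (alus, clock) in candidates.items():
    print(f"{name}: {peak_gflops(alus, clock):.0f} GFLOPS peak")
```

On paper the 160-ALU case sits below Xenos on raw peak before any efficiency argument, which is exactly the tension the posts in this exchange keep circling.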
 
This is a strawman argument IMHO. There are effects that are directly related to the pure processing power of the shaders, things like reflections, which we can compare in a recent release. The NFS dev said they have the Wii U version reflecting way more of the world than the PS360 version. That's not going to be a boost from more RAM or higher bandwidth; that's going to be really processing heavy. And for a part with 1/3 fewer shaders to do significantly more on a 6-month-old system vs a 7 1/2-year-old system means, to me, those shaders either need to be doing significantly more or there are more than 160 shaders there.

To me, for your 160 shader theory to work, VLIW would have to be doing significantly more per shader than Vec4+scalar, or the shaders aren't VLIW.

The Kotaku article sounded like it came from someone who didn't know what they were talking about, though. Saying it performed like DX9 is commenting on feature set (which we have since learned is above DX9) and not performance.

Again, anonymous sources, and maybe that's why this isn't valid to base an argument on, IMHO. I'll give lherre credit in that the GPU could lack horsepower despite a good feature set, but is that lacking horsepower compared to the PS360 or vs the PS4/720?

I'm not trying to say the thing has 640 shaders or is a beast or anything like that. What I'm trying to say is that you have to go beyond "well it would be able to fit this many ALUs, so that's my theory". We now have to say: OK, if it's 160 ALUs, then they need to be doing X for them to be doing more reflections than the PS360 in NFS. If VLIW's jump over Vec4+scalar isn't equal to X, then we're either looking at more than 160 ALUs or we're looking at something other than VLIW, and trying to count shader cores by comparing to VLIW images is folly.

Some things you can definitely chalk up to faster memory or more memory, but some effects are still processing bound, and reflections are one of those. I'm not going to buy that Criterion got closer to the theoretical limit of a 6-month-old GPU than of a 7 1/2-year-old one they've been working with for a while.

Maybe I'm explaining myself wrong, but I see a glaring flaw in the 160 shader theory when something that is more processor bound than memory bound (i.e. reflections) is performing better on the Wii U. It's not like Criterion sacrificed some other effect to do more reflections; visually the Wii U release is a step up from the PS360 release.

This is what I was trying to say: 160 ALUs is simply impossible with VLIW. Most likely Nintendo either changed to VLIW4 and is using 32 ALUs in each SPU, stayed with VLIW5 and is using 40 ALUs in each SPU, or moved on to something closer to GCN with at least 256 ALUs total. I think it is custom, with a design closer to GCN for latency reasons; Cayman has a 44-cycle latency, which seems very high for Wii U's design, and R700 is more than twice that IIRC.

I'm throwing 160 ALUs out completely from my personal theories; it just doesn't have the ability to match up to last gen in the way we are seeing from titles being released now. 256 ALUs, as BG has said, is my new minimum, and it matches up with what we are seeing and have heard about Wii U a whole lot better. 320 ALUs is not out of the question either, especially if it is VLIW5, which is impossible with 256 ALUs since the count has to stay divisible by 5. However, VLIW5 is pretty inefficient, so I still think that is a no-go as well.
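For anyone following the divisible-by-5 point, here is a small sketch enumerating the totals that fall out of the usual AMD building blocks, assuming 16-wide SIMDs with either 5-ALU (VLIW5) or 4-ALU (VLIW4) lanes; the SIMD counts are illustrative, not something read off the die.

```python
# ALU totals produced by common AMD shader-block layouts of this era.
# 16 lanes per SIMD is the usual R700/Evergreen arrangement; the SIMD counts are illustrative.

LANES_PER_SIMD = 16

def total_alus(simds: int, alus_per_lane: int) -> int:
    return simds * LANES_PER_SIMD * alus_per_lane

for arch, alus_per_lane in (("VLIW5", 5), ("VLIW4", 4)):
    for simds in (2, 4, 5, 8):
        print(f"{arch}: {simds} SIMDs x {LANES_PER_SIMD} lanes x {alus_per_lane} ALUs = "
              f"{total_alus(simds, alus_per_lane)} total")
```

Which is why 160 and 320 keep showing up in the VLIW5 scenarios, while 256 falls out naturally from VLIW4-style blocks (or GCN's 64-ALU compute units).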
 
We have no idea what factor led to the better reflection mapping. To say it's directly related only to ALU performance is silly.

The debate has now come to resemble the CPU one: X's clock speed is this, so there is no way it can keep up with Y at a much higher clock speed. Then you look at game Q, and part B of that game was clearly held back by the clock speed of the CPU.
 
Sorry, but ports were FINISHED on incomplete hardware. What we have seen from stuff like X would suggest that it is actually achieving a bit more than Xenos, and that the claimed inefficiency of VLIW5 just isn't there... in fact, if it is VLIW5, at least some of the time only 4 of every 5 shaders could possibly fire, and you'd have scheduling issues that Xenos doesn't actually have, so you'd be putting 240 shaders against only 128 shaders (a majority of the time). It's virtually impossible for it to be 160 and stay R700.

The reality is your assessment hinges on the idea that ports were pushing 125% of what the shaders are capable of, and that is just impossible.

How can you keep a straight face and use that tiny bit of footage we have from X to make a general conclusion about the SPU count of Latte? And while it's true that the hardware was unfinished to a certain extent, devs would have known pretty much what to expect. We have some that came out and even said the architecture (as in the shaders) is pretty standard. Devs definitely got to reap the benefits of the clock bumps over time as well. True, people have come out and said the tools provided by Nintendo weren't great, but that's not a catch-all. Here's a question: If Latte has 320 shaders, why wasn't NSMBU in 1080p? We know the eDRAM should be enough to hold the framebuffer and the amount of effects going on isn't astronomical. With 1080p being the buzzword that it is for them, wouldn't they have wanted to go for it?
I think it is possible Nintendo went with a VLIW4 setup, or they did what I suggested above and went with a custom thread-level-parallel design instead of the instruction-level parallelism we see from VLIW parts.
Your logic is that since a VLIW4 or custom setup would be better than VLIW5, Nintendo must've gone for it. They just never would have settled for vanilla VLIW5! Yet, time and time again we have seen Nintendo blow our collective minds with how low they can go. Nobody, including myself, was predicting a 64-bit bus to main memory, and yet here we are. Plus, AMD themselves saw VLIW4 as a failed experiment and reverted back to VLIW5 cards before finalizing GCN.
In this case I think BG is correct and they went with 32 ALUs or 40 ALUs per SPU, reaching 256 or 320 ALUs. Trinity has 256-ALU parts, and honestly I think those parts could easily be compared to what we see in these games, especially if we clock the CPU down to 1.8GHz-2GHz and limit the system RAM to 2GB DDR3. Of course this isn't a science, but it is a lot better than assuming that launch ports were able to extract 100% of the Wii U's GPU power while developers weren't using all 240 of Xenos' shaders. I mean, that is what you are saying in the above statement, and it still doesn't take into account VLIW5's efficiency problems.

Where did I say that Wii U is attaining 100% efficiency? If you're trying to imply that I don't think there's room for any improvement in Wii U games, you're wrong. What I said was that in most cases, neither Latte nor Xenos are getting the most they can out of the SPUs. Latte is probably more efficient though, thanks to its more modern design and on-chip eDRAM. So those cycles which would have been wasted on Xenos can theoretically be used for more shading on Latte. You are focusing too much on what you've read about VLIW5 in a PC setting. Those games are coded for the least common denominator among many cards from both AMD and NVidia (a point I made before but was conveniently ignored). Plus, Nintendo invested in a very good compiler which should help devs optimize their code even more for ILP.
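Since slot utilization and the compiler's role in extracting ILP keep coming up, here is a toy sketch (emphatically not AMD's actual shader compiler) of how independent operations pack into 5-wide bundles while dependency chains leave slots empty; that packing rate is essentially what the VLIW5 efficiency argument in this thread is about.

```python
# Toy illustration of VLIW slot packing: independent ops can share a bundle (up to 5 slots),
# dependent ops cannot, which is where "only 4 of 5 slots fire" style inefficiency comes from.
# This is a teaching sketch, not how AMD's shader compiler actually works.

from typing import List, Tuple

# Each op is (name, inputs); an op depends on any earlier op whose result it reads.
ops: List[Tuple[str, Tuple[str, ...]]] = [
    ("a", ()), ("b", ()), ("c", ()), ("d", ()),   # 4 independent loads/constants
    ("e", ("a", "b")),                            # depends on a, b
    ("f", ("c", "d")),                            # depends on c, d
    ("g", ("e", "f")),                            # depends on e, f
]

SLOTS = 5
bundles: List[List[str]] = []
done: set = set()
remaining = list(ops)

while remaining:
    bundle, issued = [], set()
    for name, deps in remaining:
        # An op is ready only if all of its inputs were produced in earlier bundles.
        if len(bundle) < SLOTS and all(d in done for d in deps):
            bundle.append(name)
            issued.add(name)
    bundles.append(bundle)
    done |= issued
    remaining = [(n, d) for n, d in remaining if n not in issued]

for i, bundle in enumerate(bundles):
    print(f"bundle {i}: {bundle} ({len(bundle)}/{SLOTS} slots used)")
```

With four independent inputs feeding a reduction, the toy example fills 4/5, then 2/5, then 1/5 of the slots; that kind of pattern is what a good compiler tries to improve and a bad one leaves on the table.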
 
Isn't RSX on par with 7600GT and not 7900GT?
Out of curiosity, is there any benchmarks of cross platform games (Xbox 360 & PC) done with similar GPUs?

Previously in this thread, people were using the PC versions of Wii U games to gauge how the performance compares (with regards to shaders). But do PC-to-console comparisons ever work out consistently? The 360 (and PS3) both have GPUs that are well known in terms of shaders and specs, so theoretically we could get a benchmark of a GPU with similar specs and see how it lines up. If even those don't line up, then it makes everything even less concrete. Frankly, given the different caching and other Nintendo tweaks, I'd come down on the side of fewer shaders, but working more efficiently than the PC counterparts.
 
Wow, has no one looked at the power consumption numbers of AMD cards at 40nm?

It's clear as day that 320 is just impossible. Given the math, the card cannot use more than 15 watts at the most. That's even high IMO.

HD 5550, 320:16:8 shader card @ 550 MHz, is a 33-watt 40nm card
HD 6450, 160:8:4 shader card @ 625 MHz, is a 13-watt 40nm card

http://www.techpowerup.com/reviews/HIS/Radeon_HD_5550/27.html
http://www.techpowerup.com/reviews/Sapphire/HD_6450_Passive/25.html

The only thing that points against 160 is the size of the die itself. My theory is they spread out the shader cores to reduce the heat output. That does sound like a really Nintendo thing to do. Please note the rumored card in the Wii U is 160:8:8, so it will draw a little more power than the 6450. Just fits perfectly....

One thing is clear: 320 is just too big to fit at 40nm and uses way too much power.... 320 is out...
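As a sanity check on this line of reasoning, here is a hedged sketch that rescales those two desktop data points to Latte's reported ~550 MHz clock and subtracts a guessed allowance for PC-only board overhead (GDDR5, PCIe, display circuitry). The overhead figure is an assumption for illustration, not a measurement, and the linear clock scaling is a simplification.

```python
# Rough rescaling of the two desktop data points cited above to a console context.
# Assumes power scales roughly linearly with clock at fixed voltage, and that some share
# of board power (GDDR5, PCIe, VRM/display circuitry) would not exist on an MCM console GPU.
# Both assumptions are debatable; this frames the argument rather than settling it.

LATTE_CLOCK_MHZ = 550        # commonly reported Wii U GPU clock
PC_OVERHEAD_GUESS_W = 5      # assumed PC-only board overhead per card (a guess)

cards = {
    # name: (ALUs, clock_mhz, board_power_w) from the reviews linked above
    "HD 6450": (160, 625, 13),
    "HD 5550": (320, 550, 33),
}

for name, (alus, clock, watts) in cards.items():
    gpu_only = max(watts - PC_OVERHEAD_GUESS_W, 0)
    scaled = gpu_only * LATTE_CLOCK_MHZ / clock
    print(f"{name} ({alus} ALUs): ~{scaled:.1f} W at {LATTE_CLOCK_MHZ} MHz "
          f"(from {watts} W board power, minus {PC_OVERHEAD_GUESS_W} W assumed overhead)")
```

Under those assumptions a 160-ALU part fits easily inside a ~15 W budget, while a 320-ALU part only fits if the overhead, process, or voltage story differs a lot from these retail cards, which is exactly what the replies below argue over.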
 
Out of curiosity, is there any benchmarks of cross platform games (Xbox 360 & PC) done with similar GPUs?

Previously in this thread, people were using the PC versions of Wii U games to gauge how the performance compares (with regards to shaders). But do PC-to-console comparisons ever work out consistently? The 360 (and PS3) both have GPUs that are well known in terms of shaders and specs, so theoretically we could get a benchmark of a GPU with similar specs and see how it lines up. If even those don't line up, then it makes everything even less concrete. Frankly, given the different caching and other Nintendo tweaks, I'd come down on the side of fewer shaders, but working more efficiently than the PC counterparts.

Most PC games that get ported from the PS3/360 recommend an Nvidia 8800 GPU, with a minimum of a 6600GT/7800GT most of the time. You can't draw a conclusion because the PC GPUs aren't utilised fully (consoles can be coded to the metal, etc.)

A closed architecture will always win against a PC with the same specs.

note: most= most that I have seen
 
Wow, has no one looked at the power consumption numbers of AMD cards at 40nm?

It's clear as day that 320 is just impossible. Given the math, the card cannot use more than 15 watts at the most. That's even high IMO.

HD 5550, 320-shader card @ 550 MHz, is a 33-watt 40nm card
HD 6450, 160-shader card @ 625 MHz, is a 13-watt 40nm card

http://www.techpowerup.com/reviews/HIS/Radeon_HD_5550/27.html
http://www.techpowerup.com/reviews/Sapphire/HD_6450_Passive/25.html

The only thing that points against 160 is the size of the die itself. My theory is they spread out the shader cores to reduce the heat output. That does sound like a really Nintendo thing to do....

Plus you have to factor in more power-hungry memory and the uncore PC-related stuff like PCI Express and extra video decoding hardware that the Wii U might not need. Either part could fit if it was customized.
 
Wow, has no one looked at the power consumption numbers of AMD cards at 40nm?

It's clear as day that 320 is just impossible. Given the math, the card cannot use more than 15 watts at the most. That's even high IMO.

HD 5550, 320:16:8 shader card @ 550 MHz, is a 33-watt 40nm card
HD 6450, 160:8:4 shader card @ 625 MHz, is a 13-watt 40nm card

http://www.techpowerup.com/reviews/HIS/Radeon_HD_5550/27.html
http://www.techpowerup.com/reviews/Sapphire/HD_6450_Passive/25.html

The only thing that points against 160 is the size of the die itself. My theory is they spread out the shader cores to reduce the heat output. That does sound like a really Nintendo thing to do. Please note the rumored card in the Wii U is 160:8:8, so it will draw a little more power than the 6450. Just fits perfectly....

One thing is clear: 320 is just too big to fit at 40nm and uses way too much power.... 320 is out...

using desktop cards to prop up your argument now
 
This is a strawman argument IMHO. There are effects that are directly related to the pure processing power of the shaders, things like reflections, which we can compare in a recent release. The NFS dev said they have the Wii U version reflecting way more of the world than the PS360 version. That's not going to be a boost from more RAM or higher bandwidth; that's going to be really processing heavy. And for a part with 1/3 fewer shaders to do significantly more on a 6-month-old system vs a 7 1/2-year-old system means, to me, those shaders either need to be doing significantly more or there are more than 160 shaders there.

To me, for your 160 shader theory to work, VLIW would have to be doing significantly more per shader than Vec4+scalar, or the shaders aren't VLIW.

The Kotaku article sounded like it came from someone who didn't know what they were talking about, though. Saying it performed like DX9 is commenting on feature set (which we have since learned is above DX9) and not performance.

Again, anonymous sources, and maybe that's why this isn't valid to base an argument on, IMHO. I'll give lherre credit in that the GPU could lack horsepower despite a good feature set, but is that lacking horsepower compared to the PS360 or vs the PS4/720?

I'm not trying to say the thing has 640 shaders or is a beast or anything like that. What I'm trying to say is that you have to go beyond "well it would be able to fit this many ALUs, so that's my theory". We now have to say: OK, if it's 160 ALUs, then they need to be doing X for them to be doing more reflections than the PS360 in NFS. If VLIW's jump over Vec4+scalar isn't equal to X, then we're either looking at more than 160 ALUs or we're looking at something other than VLIW, and trying to count shader cores by comparing to VLIW images is folly.

Some things you can definitely chalk up to faster memory or more memory, but some effects are still processing bound, and reflections are one of those. I'm not going to buy that Criterion got closer to the theoretical limit of a 6-month-old GPU than of a 7 1/2-year-old one they've been working with for a while.

Maybe I'm explaining myself wrong, but I see a glaring flaw in the 160 shader theory when something that is more processor bound than memory bound (i.e. reflections) is performing better on the Wii U. It's not like Criterion sacrificed some other effect to do more reflections; visually the Wii U release is a step up from the PS360 release.

What exactly did I say that was a straw man argument? Z0M3Ie arguing against a 160 shader Latte by saying that I claimed it's already maxed out is a strawman argument. What I bolded in your post is a straw man argument, and quite frankly, insulting given the pains I've taken to provide a legit analysis. The improvements in reflections in NFS are so slight that you have to squint to see them. And even so, Criterion didn't talk up the increased strength of the GPU. You know Nintendo lets devs talk about the good technical aspects of Wii U, because we are always hearing about it being "more modern" and having "more memory." Criterion talked about refinements they made in their own lighting engine, however. Sheer programming skill shouldn't be brushed aside in analyzing these matters, and if what we saw is the best they could squeeze out of a 320 shader part, I'd be very surprised.
 
How can you keep a straight face and use that tiny bit of footage we have from X to make a general conclusion about the SPU count of Latte? And while it's true that the hardware was unfinished to a certain extent, devs would have known pretty much what to expect. We have some that came out and even said the architecture (as in the shaders) is pretty standard. Devs definitely got to reap the benefits of the clock bumps over time as well. True, people have come out and said the tools provided by Nintendo weren't great, but that's not a catch-all. Here's a question: If Latte has 320 shaders, why wasn't NSMBU in 1080p? We know the eDRAM should be enough to hold the framebuffer and the amount of effects going on isn't astronomical. With 1080p being the buzzword that it is for them, wouldn't they have wanted to go for it?
X, I assume, is keeping the highest level of effects we saw in the trailer and bringing the low-end stuff up to that same level. Considering all that is going on, and just how big the world is, with effects going on at a distance while still clearly showing off transparencies and DOF as well as reflections on that scale, it shows, in my opinion, more shader effects than we have seen from any other game in the current generation.

NSMBU could likely still run at 1080p even with 160 ALUs; pointing to that as the reason it is not a better part is silly, since Nintendo clearly designed the console for 720p. And while 1080p might be a buzzword here, it doesn't seem to matter much to Nintendo, so they would rather get a completely solid 720p with vsync and AA while pushing 480p to the GamePad at the same time.

Your logic is that since a VLIW4 or custom setup would be better than VLIW5, Nintendo must've gone for it. They just never would have settled for vanilla VLIW5! Yet, time and time again we have seen Nintendo blow our collective minds with how low they can go. Nobody, including myself, was predicting a 64-bit bus to main memory, and yet here we are. Plus, AMD themselves saw VLIW4 as a failed experiment and reverted back to VLIW5 cards before finalizing GCN.


Where did I say that Wii U is attaining 100% efficiency? If you're trying to imply that I don't think there's room for any improvement in Wii U games, you're wrong. What I said was that in most cases, neither Latte nor Xenos are getting the most they can out of the SPUs. Latte is probably more efficient though, thanks to its more modern design and on-chip eDRAM. So those cycles which would have been wasted on Xenos can theoretically be used for more shading on Latte. You are focusing too much on what you've read about VLIW5 in a PC setting. Those games are coded for the least common denominator among many cards from both AMD and NVidia (a point I made before but was conveniently ignored). Plus, Nintendo invested in a very good compiler which should help devs optimize their code even more for ILP.

I've explained why VLIW5 just can't reach the efficiency you are saying it can. Literally, VLIW5 would only give GPU7 128 ALUs to use the majority of the time, and it can't be designed around, because the 5th shader isn't like the rest: it is used for heavier tasks but can't simply do what the others can. It is a 4+1 design. Also, those 4 can't do what the 5th is capable of doing, which is why they made VLIW4. You might call it a failed experiment, but the 1536 ALUs in the HD 6970 beat out the 1600 ALUs in the HD 5870 even at the same clocks, thanks to all shaders being able to run the same code; other than the rare occasion that the 5th shader was used, the HD 5870 had only 1440 ALUs to work with, which is why HD 6950s with that 1440 shader count handled code so well. I'm not trying to compare consoles to PC, though; Xenos is coded closer to the metal than any PC GPU, which is why VLIW5 can't be what's in Wii U at only 160 ALUs. It has problems baked right into its design that ignore 1/5th of the shaders at least some of the time. Frame rate drops in launch ports are likely down to CPU and memory utilization, not a low count of ALUs, which would realistically not allow COD to maintain a fairly solid frame rate while keeping up perfectly in resolution and effects with what Xenos handles in multiplayer.

You are right though, I am using my knowledge of VLIW5 to come to these conclusions; however, it seems you are stuck on retro titles and launch software to come to yours. Do you really believe 128 ALUs can drive what we see in COD on Wii U? I think it is more likely pushing 256 ALUs and just not reaching its full potential at launch. Since it is up against 7-year-old hardware that has constantly been seeing growth since launch, it is just impossible IMO that Wii U could keep up with Xenos' 240 shaders with only around half that number for the majority of the frames drawn.

ps I'm on my phone so this post might be hard to read.
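To make the 128-of-160 claim concrete, here is a small sketch of the effective-ALU arithmetic under the assumption that the fifth (transcendental) slot of a VLIW5 lane mostly goes unused, with Xenos' own utilization treated as a free parameter; the percentages are assumptions for illustration, not measurements from either chip.

```python
# Effective ALU throughput under assumed utilization factors.
# 0.8 for VLIW5 models "4 of 5 slots fire"; the Xenos figures are illustrative guesses,
# since nobody in this thread knows the real utilization on either chip.

def effective_gops(alus: int, clock_mhz: float, utilization: float) -> float:
    """Billions of ALU operations per second actually issued, under an assumed utilization."""
    return alus * clock_mhz * utilization / 1000.0

latte_160_vliw5 = effective_gops(160, 550, 0.8)   # ~128 usable ALUs
xenos_ideal     = effective_gops(240, 500, 1.0)
xenos_derated   = effective_gops(240, 500, 0.75)  # i.e. only ~180 of 240 ALUs busy

print(f"Latte, 160 ALUs, VLIW5 @ 80% slot use: {latte_160_vliw5:.0f} G ops/s")
print(f"Xenos, 240 ALUs, ideal:                {xenos_ideal:.0f} G ops/s")
print(f"Xenos, 240 ALUs, ~75% busy:            {xenos_derated:.0f} G ops/s")
```

With those assumed numbers a 160-ALU VLIW5 Latte trails even a fairly heavily derated Xenos, which is the gap this post is pointing at; change the utilization assumptions and the gap moves, which is why the ALU count alone doesn't settle anything.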
 
Plus you have to factor in more power-hungry memory and the uncore PC-related stuff like PCI Express and extra video decoding hardware that the Wii U might not need. Either part could fit if it was customized.

and the Wii U has 1.5GB more of RAM to power.

Plus it has BC on the chip that may have to be powered at all times. Maybe someone could confirm or deny that.

To say you can drop the power consumption by more than half is just silly...
 
and the Wii U has 1.5GB more of RAM to power.

Plus it has BC on the chip that may have to be powered at all times. Maybe someone could confirm or deny that.

To say you can drop the power consumption by more than half is just silly...

PCI express is actually pretty power hungry. The main RAM in the Wii U is also lower power than what you would find on PC graphics cards. Putting it on an MCM with the processor will also shave a few watts, as well as whatever silicon they could cut out that they don't need.

There is a lot we don't know. I think comparing directly to PC parts is probably unwise, as we know it's highly custom and doesn't likely match up to them in terms of performance per watt.
 
PCI express is actually pretty power hungry. The main RAM in the Wii U is also lower power than what you would find on PC graphics cards. Putting it on an MCM with the processor will also shave a few watts, as well as whatever silicon they could cut out that they don't need.

There is a lot we don't know. I think comparing directly to PC parts is probably unwise, as we know it's highly custom and doesn't likely match up to them in terms of performance per watt.

So let's compare it to consoles then. What is Xenos' power consumption, for instance?
 
Wow has no one look at the power consumption number of AMD cards at 40nm?

Its clear as day that 320 is just impossible. Given the math the card cannot use more than at the most 15 watts. That even high IMO.

HD 5550 320:16:8 shader card @ 550 MHz is 33 Watts 40 nm card
HD 6450 160:8:4 Shader card @ 625 Mhz is 13 watts 40 nm card

http://www.techpowerup.com/reviews/HIS/Radeon_HD_5550/27.html
http://www.techpowerup.com/reviews/Sapphire/HD_6450_Passive/25.html

The only thing that point against 160 is the size of the on the card itself. My theory is they spread out the shader core to reduce the heat output. That does sound like a really nintendo thing to do. Please note the rumored card in the wiiu is 160:8:8 so will draw little more power than the 6450. Just fit perfect....

One thing is clear, 320 is just too big to fit at 40nm and uses way too much power.... 320 is out...

If we ignore the fact that you are using full-scale retail cards that are about the size of the Wii U itself and have tons of extra components meant to do PC-related things to compare for power, and the fact that the 160-ALU card you listed has a higher clock speed than Latte, plus Shader Model 5.0, DX11 support, OpenGL 4.1 support, accelerated multithreading, and GDDR5 memory... it's made with technology that is two generations ahead of the RV7XX in capability/efficiency.

To even begin to factor that, we would have to discount the possibility that the tech in the Wii U is RV7XX, in which case I would immediately lean towards z0m3le's GCN analysis as being more likely.
 
Out of curiosity, is there any benchmarks of cross platform games (Xbox 360 & PC) done with similar GPUs?

You mean benchmarks from GPUs that are roughly about what we expect to be in the PS3/360 to what we roughly expect to be in the Wii U?

No matter what, even if we found GPUs with 160SPs in a mobile configuration, it'd absolutely smash the 7600GT we roughly equate the PS3 GPU to. I don't know why we are ignoring how fast tech has moved post 2006. A 160SP machine would run circles around the PS3.
 
If we ignore the fact that you are using full-scale retail cards that are about the size of the Wii U itself and have tons of extra components meant to do PC-related things to compare for power, and the fact that the 160-ALU card you listed has a higher clock speed than Latte, plus Shader Model 5.0, DX11 support, OpenGL 4.1 support... it's made with technology that is two generations ahead of the RV7XX in capability/efficiency.

So if it's two generations ahead of the Wii U GPU and more efficient, that would mean the card in the Wii U uses more power? Yeahhh, now you are getting it.

Now if you have some, or really any, hard numbers for what would be removed from a "full scale retail" card to get its power consumption cut by more than half, I'd love to hear them. I doubt it... unless you cut half the logic out... hmm, 320 to 160. lol

To even begin to factor that, we would have to discount the possibility that the tech in the Wii U is RV7XX, in which case I would immediately lean towards z0m3le's GCN analysis as being more likely.
Sure, throw out everything we have known about the GPU... sure lol

But sadly that's par for the course in the Wii U threads. That's how you get 600+ GFLOPS being a "fact or worst case" in these threads.
 
You mean benchmarks from GPUs that are roughly about what we expect to be in the PS3/360 to what we roughly expect to be in the Wii U?

No matter what, even if we found GPUs with 160SPs in a mobile configuration, it'd absolutely smash the 7600GT we roughly equate the PS3 GPU to. I don't know why we are ignoring how fast tech has moved post 2006. A 160SP machine would run circles around the PS3.

Maybe, but we are comparing it to 360's GPU primarily aren't we?
 
and the Wii U has 1.5GB more of RAM to power.

Plus it has BC on the chip that may have to be powered at all times. Maybe someone could confirm or deny that.

To say you can drop the power consumption by more than half is just silly...

Well, seeing as it now looks like BC is controlled by a tiny 8-bit processor, I doubt any power use from that is more than negligible.
 
Maybe, but we are comparing it to 360's GPU primarily aren't we?

Doesn't change much. The 360 runs a more advanced feature set than the PS3's GPU. It would still be far outclassed.

Which is why I say, if it was a large factor more powerful as some people here think it is, there would be some code that, once ported over, would show a conspicuous improvement in performance, "bad port" or lack of documentation or "pre-25% overclock" or what have you. The end result would show a significant improvement.
 
Wow, has no one looked at the power consumption numbers of AMD cards at 40nm?

It's clear as day that 320 is just impossible. Given the math, the card cannot use more than 15 watts at the most. That's even high IMO.

HD 5550, 320:16:8 shader card @ 550 MHz, is a 33-watt 40nm card
HD 6450, 160:8:4 shader card @ 625 MHz, is a 13-watt 40nm card

http://www.techpowerup.com/reviews/HIS/Radeon_HD_5550/27.html
http://www.techpowerup.com/reviews/Sapphire/HD_6450_Passive/25.html

The only thing that points against 160 is the size of the die itself. My theory is they spread out the shader cores to reduce the heat output. That does sound like a really Nintendo thing to do. Please note the rumored card in the Wii U is 160:8:8, so it will draw a little more power than the 6450. Just fits perfectly....

One thing is clear: 320 is just too big to fit at 40nm and uses way too much power.... 320 is out...

Weren't you emphatically pounding the table for the case that it's actually 55nm? If that was someone else, I apologize.
 
Well, seeing as it now looks like BC is controlled by a tiny 8-bit processor, I doubt any power use from that is more than negligible.

He knows this; the entire point of his argument rests on ignoring key parts so that he can make his theory fit. For instance, he didn't take into account that the 40nm process used today is not the same as the one those parts were designed on in 2009. He also fails to mention that the MCM should cut down the wattage, as should cutting out some components that are useless in a console, the higher power draw of PCIe, and memory that is more costly from a wattage perspective. On top of that, he also neglects to mention that the card is clocked much higher than GPU7.

If all those things were taken into account, Wii U's GPU would end up at 9 to 10 watts, when it could pull nearly 20 watts by itself (not exactly 20, but close, like 17 to 20; there are variables that have been counted wrong about these parts, which is why it is impossible to pin down an exact figure). However, seeing as 9 to 10 watts is possible for a 160 ALU part, 320 ALUs is also possible in a 15-17 watt part, considering a couple of those watts would be going to other things like the eDRAM.

I do think BGassassin's estimation that the GPU is at least 256 ALUs makes a lot of sense.

And do we know that GPU7 has 4 SIMDs? It is something BG said in his phone-in, IIRC. That would mean each SIMD accounts for only 40 ALUs, which is quite low. 64 ALUs is what AMD calls a CU in GCN, and VLIW4 would also give every 2 blocks 64 ALUs, which could then be considered SIMDs.

PS my "GCN" theory is that Nintendo is NOT using GCN but modified VLIW to work closer to GCN to bring latency down, mostly by adding schedulers to more efficiency choose different wave fronts to use, this again is not OoOe it is just allowing the ALUs to deal with code that fits its current pipeline better when possible. it should also nearly half VLIW4's cycle latency which would put it in line with the rest of Wii U's design. It's not really that custom and is perfectly standard in most other ways.
 
So if it's two generations ahead of the Wii U GPU and more efficient, that would mean the card in the Wii U uses more power? Yeahhh, now you are getting it.

Now if you have some, or really any, hard numbers for what would be removed from a "full scale retail" card to get its power consumption cut by more than half, I'd love to hear them. I doubt it... unless you cut half the logic out... hmm, 320 to 160. lol

No, I'm not getting it. What you are saying makes no sense. Being two generations ahead would likely lead to higher cost, and that is just the problem at the door. There are still all of the problems I mentioned before that, plus the fact that we know the GPU in the Wii U is not a stock GPU; it's a custom design. A customized HD 5550 seems more likely than an RV7XX with HD 6000-series parts shoehorned on at a lower clock speed somehow achieving shading enhancements over Xenos.
 
Well, seeing as it now looks like BC is controlled by a tiny 8-bit processor, I doubt any power use from that is more than negligible.

Not true. I have said time and time again that there is additional logic to run the compatibility layer, which translates TEV instructions into shader code. The tiny CPU is only for the video signal conversion. USC-fan is not the one ignoring key facts here.
 
Not true. I have said time and time again that there is additional logic to run the compatibility layer, which translates TEV instructions into shader code. The tiny CPU is only for the video signal conversion. USC-fan is not the one ignoring key facts here.

Let's not get carried away. This isn't mutually exclusive.
 
Not true. I have said time and time again that there is additional logic to run the compatibility layer, which translates TEV instructions into shader code. The tiny CPU is only for the video signal conversion. USC-fan is not the one ignoring key facts here.

This only runs in Wii mode, right? Or do we not know that?
 
Do you really believe 128 ALUs can drive what we see in COD on Wii U? I think it is more likely pushing 256 ALUs and just not reaching its full potential at launch. Since it is up against 7-year-old hardware that has constantly been seeing growth since launch, it is just impossible IMO that Wii U could keep up with Xenos' 240 shaders with only around half that number for the majority of the frames drawn.

ps I'm on my phone so this post might be hard to read.

I know it has been mentioned before, but with COD I think they were taking the same approach as they did with the PS3/360 and didn't even bother to optimize the code for the Wii U. Not sure how to word it, but I think the way we would see the true power of the Wii U is to just wait for games like X, Metroid Prime... first party games that take into account the strengths and weaknesses of the console and optimize for them.

On another note, it's kind of silly to reference this, but the 100YOG has been right with some of his Nintendo predictions in the past. Someone asked a question about Retro's project:
Graphically, if we had seen Retro's game today without the platforms or company names being disclosed in the trailer, what system(s) would most assume it's running on?
100 Year Old Gamer said:
Let's wait a couple more weeks.
 
What exactly did I say that was a straw man argument?

That any improvements aren't coming from the hardware but from Criterion maxing out hardware that's only 6 months old more fully than they're maxing out hardware that's 7 1/2 years old.

Z0M3Ie arguing against a 160 shader Latte by saying that I claimed it's already maxed out is a strawman argument.

That's what I say as well.

What I bolded in your post is a straw man argument, and quite frankly, insulting given the pains I've taken to provide a legit analysis.


No, it's not a strawman argument. It's looking for the proof in the pudding. If you're going to say, based on these things, that it's 160 ALUs, then those 160 ALUs also have to be able to do what is being accomplished. It's saying if X = 2 and Z = 2, then X has to equal Z.

The improvements in reflections to NFS are so slight, that you have to squint your eyes to see them. And even so, Criterion didn't talk up the increased strength of the GPU.

I had no problem seeing the difference. Criterion even talked about how reflecting more geometry changed the overall level of ambient lighting.

You know Nintendo lets devs talk about the good technical aspects of Wii U, because we are always hearing about it being "more modern" and having "more memory." Criterion talk about refinements they made in their own lighting engine, however. Sheer programming skill shouldn't be brushed aside in analyzing these matters, and if what we saw is the best they could squeeze out of a 320 shader part, I'd be very surprised.

I'm not trying to downplay programming, but you're literally saying the programmers were able to squeeze more out of a 160 shader part than a 240 shader part, and all I'm saying is, if that's the case, then it's not 160 VLIW shaders. They also did it in 6 months, versus however many years they have had on the 360.


Do you get what I'm saying: for it to be 160 VLIW shaders, Criterion had to max out the Wii U in 6 months with unfinished tools, while still not having maxed out the 240 Vec4+Scalar shaders in the 360 in 7 1/2 years.

I mean, you're basically saying that the console has been maxed out within its launch window. When has that ever happened?
 
On another note, it's kind of silly to reference this but the 100YOG has been right with some of his Nintendo predictions in the past

What has he been right about? Not counting the claim that he predicted LTTP2; that one is very tenuous.

This is about the third or fourth time I've asked this question of people saying "he's been right in the past", between here and GameFAQs, and I have yet to get a reply of any kind other than the aforementioned LTTP2 bull.
 
Let me clarify, I'm not attacking Fourth Storm or any one else.

I'm just saying, based on what we've seen, if it is 160 ALUs then they cannot be VLIW and need to be something else.
 
That any improvements aren't coming from the hardware but from Criterion maxing out hardware that's only 6 months old more fully than they're maxing out hardware that's 7 1/2 years old.

That's what I say as well.

No, it's not a strawman argument. It's looking for the proof in the pudding. If you're going to say, based on these things, that it's 160 ALUs, then those 160 ALUs also have to be able to do what is being accomplished. It's saying if X = 2 and Z = 2, then X has to equal Z.

I had no problem seeing the difference. Criterion even talked about how reflecting more geometry changed the overall level of ambient lighting.

I'm not trying to downplay programming, but you're literally saying the programmers were able to squeeze more out of a 160 shader part than a 240 shader part, and all I'm saying is, if that's the case, then it's not 160 VLIW shaders. They also did it in 6 months, versus however many years they have had on the 360.

Do you get what I'm saying: for it to be 160 VLIW shaders, Criterion had to max out the Wii U in 6 months with unfinished tools, while still not having maxed out the 240 Vec4+Scalar shaders in the 360 in 7 1/2 years.

I mean, you're basically saying that the console has been maxed out within its launch window. When has that ever happened?

That's the way it came off to me as well. Like he was trying to say the Wii U hardware wasn't better; it was just that they were hardly utilizing the 360.

That seemed kind of ridiculous to me. It wasn't something I expected from Fourth Storm.
 
Let me clarify, I'm not attacking Fourth Storm or any one else.

I'm just saying, based on what we've seen, if it is 160 ALUs then they cannot be VLIW and need to be something else.

Based on what? What you think performance should be for a 160 ALU part?


I have already shown benchmarks of a 160 ALU part vs a 7900GT, which is more powerful than the PS3 GPU. It outperforms it by 20% even after you subtract the clock rate differences.... Then add the fact that the Wii U GPU has more eDRAM and more bandwidth.

A lot of fuzzy math in this thread lately.
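For what "subtract clock rate differences" means in practice, here is a hedged sketch of clock-normalizing a benchmark score; the scores and the 625 MHz clock are placeholder values for illustration, not the actual benchmark the poster is referring to.

```python
# Clock-normalizing a benchmark score to compare parts running at different clocks.
# The scores and clocks below are placeholders for illustration, not real benchmark data.

def normalize_to_clock(score: float, actual_clock_mhz: float, target_clock_mhz: float) -> float:
    """Rescale a score to a target clock, assuming performance scales linearly with clock."""
    return score * target_clock_mhz / actual_clock_mhz

score_160alu_part = 120.0   # hypothetical score for a 160-ALU card at 625 MHz
score_comparison  = 100.0   # hypothetical score for the comparison card at its stock clock

# Re-rate the 160-ALU part as if it ran at Latte's reported ~550 MHz.
rescaled = normalize_to_clock(score_160alu_part, 625, 550)
print(f"160-ALU part rescaled to 550 MHz: {rescaled:.1f}")
print(f"Relative to the comparison card:  {rescaled / score_comparison:.2f}x")
```

Linear scaling with clock is itself an approximation (memory bandwidth does not move with core clock), so even a clock-normalized desktop benchmark is only a loose proxy for a closed console design.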
 
Not all shaders are equal. The 160 number in the Wii U would still trounce Xenos and RSX, especially with the architectural advantages. The RSX never lived up to Nvidias numbers, and developers were never able to get the most out of Xenos either due to the architecture. Lots of reads and writes out to main memory meant a lot of idle shaders on Xenos. That, and the shaders in Wii U are far more capable than what was available for Xenos.

If Developers use the memory architecture in the Wii U well, it should be able to outperform both the PS3 and 360 in the GPU department. It doesn't need to be more than 160 to get the performance we saw in Need For Speed, it was just closer to full utilization than we saw from launch titles.

Could it be more if it is some custom configuration? Sure. But we're grasping at straws to come up with reasons why, and Occam's Razor applies. I think Fourth Storm's analysis is the best one we have until we get more information. I'm not 100% convinced all of our assumptions are right, but that doesn't mean we can fill in the gaps with unverifiable speculation.
 
Based on what? What you think performance should be for a 160 ALU part?

I have already shown benchmarks of a 160 ALU part vs a 7900GT, which is more powerful than the PS3 GPU. It outperforms it by 20% even after you subtract the clock rate differences.... Then add the fact that the Wii U GPU has more eDRAM and more bandwidth.

A lot of fuzzy math in this thread lately.

Based on games like AC3 and BO2 running essentially on-par at launch with unfinished tools and likely suboptimal optimization.

Something just doesn't seem to add up. PC benchmarks can't be used as any kind of smoking gun. At the same time it doesn't sound like a 160 shader part is definitively incapable of what we're seeing, but it would be nice to get more clarity, which may not come.


Also, I interpreted Fourth Storm's post(s) the same way (even if that's not how it was intended): the implication seemed to be that Criterion was squeezing blood from a stone with NFSMWU while the 360 is underutilized in any case other than Gears of War (he specifically called that game out).
 
I see lots of comparisons to mobile or desktop HD4XXX and HD5XXX chips. Maybe I missed a reply somewhere but wouldn't Nintendo choose to modify an Embedded design given the low power requirement?

Why didn't we start looking at E4690 Discrete GPU (55nm), or E6460/E6760 GPUs (40nm) for comparisons? The wattage is variable on the E4690 (8W-25W) while the power consumption on the E6460 is roughly 20W. I'm guessing it's some hybrid between the two embedded generations.

http://www.amd.com/us/products/embedded/graphics-processors/Pages/ati-radeon-e4690.aspx

http://www.amd.com/us/Documents/E6460GPU_Product_Brief.pdf
 
Not all shaders are equal. The 160 number in the Wii U would still trounce Xenos and RSX, especially with the architectural advantages. The RSX never lived up to Nvidias numbers, and developers were never able to get the most out of Xenos either due to the architecture. Lots of reads and writes out to main memory meant a lot of idle shaders on Xenos. That, and the shaders in Wii U are far more capable than what was available for Xenos.

If Developers use the memory architecture in the Wii U well, it should be able to outperform both the PS3 and 360 in the GPU department. It doesn't need to be more than 160 to get the performance we saw in Need For Speed, it was just closer to full utilization than we saw from launch titles.

Could it be more if it is some custom configuration? Sure. But we're grasping at straws to come up with reasons why, and Occam's Razor applies. I think Fourth Storm's analysis is the best one we have until we get more information. I'm not 100% convinced all of our assumptions are right, but that doesn't mean we can fill in the gaps with unverifiable speculation.

Firstly, it's all unverifiable speculation.

Second, the point you made about Xenos not utilizing all shaders is obviously correct, but VLIW5 constantly ignores 1/5 of its shaders, which means that while Xenos might only use 200 shaders from time to time, Wii U would only be using 128. And while VLIW5's shaders are obviously more capable, you are suggesting they are basically 100% more capable. Sure, the clock is 10% higher, and efficiency could maybe make up for 30% of that count, but it's still far short of matching Xenos. Let's not pretend that VLIW5 has no latency issues causing those 128 shaders to never go idle; there is ~100 cycle latency in R700. So yes, if it's 160 ALUs, it is very different from my old HD 4870 with 4/5ths of its shaders removed.

For 160 shaders to match Xenos, Xenos would have to be utilizing only about 180 of its 240 ALUs for the majority of the frames it draws, and that only allows Wii U to match a crippled Xenos, not exceed it as we have heard from developers that it does. It's also funny that the Tekken developers have said that Wii U's GPU is 1.5x Xenos. Obviously these numbers don't mean much, but assuming they are directly comparing capabilities, it points to 256 ALUs in Wii U, not 160.
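A quick worked check of those two figures, assuming one operation per ALU per cycle and the clocks discussed in this thread (Latte ~550 MHz, Xenos 500 MHz); the Tekken "1.5x" remark is treated naively as a peak-throughput ratio, which may well not be what the developers meant.

```python
# Worked check of the "~180 Xenos ALUs" equivalence and the Tekken "1.5x Xenos" remark.
# Assumes 1 op per ALU per cycle and the clocks discussed in this thread; the 1.5x figure
# is treated naively as a peak-throughput ratio, which may not be what was meant.

LATTE_CLK_MHZ, XENOS_CLK_MHZ = 550, 500

# How many Xenos ALUs at 500 MHz match 160 Latte ALUs at 550 MHz on raw issue rate?
xenos_equivalent = 160 * LATTE_CLK_MHZ / XENOS_CLK_MHZ
print(f"160 Latte ALUs ~= {xenos_equivalent:.0f} Xenos ALUs")     # ~176, i.e. the ~180 claim

# How many Latte ALUs would it take to reach 1.5x Xenos' full 240-ALU peak?
latte_for_1_5x = 1.5 * 240 * XENOS_CLK_MHZ / LATTE_CLK_MHZ
print(f"1.5x Xenos peak needs ~{latte_for_1_5x:.0f} Latte ALUs")  # ~327, between 256 and 320
```

On raw clock-for-clock math the 1.5x remark lands between the 256 and 320 figures; whether per-ALU efficiency gains can close that gap toward 256 is the open question in this exchange.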

I see lots of comparisons to mobile or desktop HD4XXX and HD5XXX chips. Maybe I missed a reply somewhere but wouldn't Nintendo choose to modify an Embedded design given the low power requirement?

Why didn't we start looking at E4690 Discrete GPU (55nm), or E6460/E6760 GPUs (40nm) for comparisons? The wattage is variable on the E4690 (8W-25W) while the power consumption on the E6460 is roughly 20W. I'm guessing it's some hybrid between the two embedded generations.

http://www.amd.com/us/products/embedded/graphics-processors/Pages/ati-radeon-e4690.aspx

http://www.amd.com/us/Documents/E6460GPU_Product_Brief.pdf

Simply because they expect those parts to be binned, even though they are created in the hundreds of thousands for everything from PoS machines to casino machines. Embedded designs similar to these are used in arcade machines as well. Still, it's taboo to use these in serious discussions because they might seem optimistic.
 
I see lots of comparisons to mobile or desktop HD4XXX and HD5XXX chips. Maybe I missed a reply somewhere but wouldn't Nintendo choose to modify an Embedded design given the low power requirement?

Why didn't we start looking at E4690 Discrete GPU (55nm), or E6460/E6760 GPUs (40nm) for comparisons? The wattage is variable on the E4690 (8W-25W) while the power consumption on the E6460 is roughly 20W. I'm guessing it's some hybrid between the two embedded generations.

http://www.amd.com/us/products/embedded/graphics-processors/Pages/ati-radeon-e4690.aspx

http://www.amd.com/us/Documents/E6460GPU_Product_Brief.pdf

Well, the AMD E6460 is a 160 ALU part @ 600 MHz and its TDP is 21 watts.

http://www.amd.com/us/Documents/E6460-MXM-ProductBrief.pdf

Again, this points toward a 160 ALU part, given the power consumption.

Firstly, it's all unverifiable speculation.

Second, the point you made about Xenos not utilizing all shaders is obviously correct, but VLIW5 constantly ignores 1/5 of its shaders, which means that while Xenos might only use 200 shaders from time to time, Wii U would only be using 128. And while VLIW5's shaders are obviously more capable, you are suggesting they are basically 100% more capable. Sure, the clock is 10% higher, and efficiency could maybe make up for 30% of that count, but it's still far short of matching Xenos. Let's not pretend that VLIW5 has no latency issues causing those 128 shaders to never go idle; there is ~100 cycle latency in R700. So yes, if it's 160 ALUs, it is very different from my old HD 4870 with 4/5ths of its shaders removed.

For 160 shaders to match Xenos, Xenos would have to be utilizing only about 180 of its 240 ALUs for the majority of the frames it draws, and that only allows Wii U to match a crippled Xenos, not exceed it as we have heard from developers that it does. It's also funny that the Tekken developers have said that Wii U's GPU is 1.5x Xenos. Obviously these numbers don't mean much, but assuming they are directly comparing capabilities, it points to 256 ALUs in Wii U, not 160.
Again, do you have a source for any of these numbers, or are you just pulling most of them out of thin air?

I will have to look, but I believe I read from MS about the 360 that it only used 50% of the GPU because of its design. They were making changes in the 720 to use more of the GPU. Maybe someone knows about the doc and can give us some hard numbers.
 
I don't understand comparing this chip to retail products when someone in the business of giving you die shots to examine said it's custom and should not be compared to other die shots directly, although there are some similarities.
 