WiiU "Latte" GPU Die Photo - GPU Feature Set And Power Analysis

Wait, so does that mean the DS, Vita, 3DS are tablets? That would mean that Nintendo started the tablet boom. Who would have thought.

No, of course not. But it's hard to believe you can't figure out why someone who looks at a piece of electronics that's 70% covered by a touchscreen would be tempted to call it a tablet. It looks like one.
 

Doesn't look like any tablet I've ever seen. It looks like a big-ass controller with a touch screen on it, which is what it is.

Honestly, it looks more like those touch-screen credit card machines you find in supermarkets than any kind of consumer electronics device.
 
Why has this EA tweet thing spread to the technical discussion threads? He said nothing useful to technical discussion whatsoever.

Because Untalkative-bunny decided to make it the topic of conversation and a lot of others jumped on it to encourage it. The bolded was the same thing I said.

Based on Mickey Mouse and Mario/Sonic. Also, this could be the reason why they can't just dump all the Wii Virtual Console games to the Wii U at once. They might need optimization. I experienced some terrible framerate drops in that Kirby game, for example. How the hell does this happen? It's an NES game.

Edit: Also, wasn't there a trailer analysis for Pikmin 3, and it turned out it was locked at 30 fps?

A trailer analysis? I don't know. If it was, then it was probably just that: the trailer's analysis. Maybe it was a YouTube trailer. All YouTube videos are scaled down to 30 FPS.
http://nintychronicle.wordpress.com/2013/04/20/60-fps-pikmin-3-trailer/
http://nintendoeverything.com/90761...20p-at-native-resolution-60-fps-for-pikmin-3/


This video covers my main point nicely though. A single game or even a handful not looking/running well doesn't mean the hardware is weak.
http://www.youtube.com/watch?v=bajJJ-vkBZY
 
Given what we know about the Wii U, is it safe to assume it could run a game like Earth Defense Force or Ninety-Nine Nights smoothly (locked 30fps, or without much stutter or lag) on two different screens simultaneously? By two different I mean each player is in a different part of the stage or map, dealing with their own demons or bugs.
 
He did comment on the power of the system. Given the discussion at hand with 160 ALUs...

Only problem is that the 160 shaders have little to do with power. That has to do with shading capability, which is usually the only thing that is really improved in Wii U ports besides texture detail.

http://www.nintengen.com/2013/02/need-for-speed-most-wanted-wii-u-vs.htm


160 shaders does not produce these types of results.
 
There are plenty of examples

There was also an update much later that further improved the shading in Trine 2, I believe. So that would indicate that the Wii U did get some kind of boost after launch, but also that it was already better before.

What I really want to know is: why do assumptions that the Wii U is weaker always take precedence over the possibility that the game wasn't properly ported when comparing it to the other launch ports?
 
BG with all these processes and numbers being thrown around, what about the 160 vs 320 shaders debate? Which one is most likely in your opinion?

The Wii U, in my point of view, can't have so many "secret" functions that Nintendo is not sharing with developers at this point. Nintendo should be showing Third Parties exactly how to get the most out of the GPU. Simply showing how great their First Party software looks is not enough.

It's been pointed out that I gave my thoughts. I'll just add I think there is merit to both sides of the debate.

Doesn't matter what I think the blocks are or not. It's all silly guesswork. It only leads to baseless speculation. After all these months, many people a lot smarter than me have looked at this thing, and we are no closer than what Fourth Storm found and posted many weeks ago.

It would be one thing if anyone here had any idea what most of these could even be, or could be proven correct. The big-picture things are the easiest to work with [ALUs, TMUs, or ROPs], the things you don't seem to want to debate.

Then don't you think it's unfair to knock down other people's views when you don't think it's worth the effort to give your own view? You mention Fourth, but even he doesn't claim his view as fact. Just for clarification, are you saying all of this is silly guesswork, or all of it except for Fourth's? And you may not realize it, but to say it's silly guesswork and baseless speculation means Fourth Storm wasted his time to even get the die shot in the first place. Why do we need to see the die shot and have this thread if it's only silly guesswork and baseless speculation? Even the whole back and forth about shader count is pointless since no one can say with 100% certainty what they are.

Speaking of which, unless Fourth has changed his view or I'm not remembering correctly, he was talking about 160 larger ALUs. In other words, he's also proposing a different (or modified) architecture. It's similar to one of the ideas I had of them shifting to thread-level parallelism, which is what Nvidia uses, and their GPUs have larger ALUs. But I digress.

How can I seem to not want to debate about them when I've talked about them? Also I don't think you seem to realize that the other blocks are very important. In fact let's look at what you are considering as the main parts of the GPU. Since you seem to be relying on Fourth, Fourth's analysis says this about the ROPs and L2 cache.

L2 caches: U blocks. 32(?) kB each. Seem to resemble the L2 on RV770 to an extent. I'm only counting the long SRAM banks as actual storage space. There should be a fair amount of other logic/SRAM for cache functionality, so it's hard to say how much is for actual texture data.
ROPs: W blocks: Seem to look like the ROPs on RV770 and their positioning next to L2 and DDR3 I/O makes sense.

His reasoning is that Block W resembles the ROPs in RV770 and is near the DDR3 I/O. They also very closely resemble blocks in Llano and Brazos that are nowhere close to the memory I/O. If these really are ROPs, then why do two out of three of these GPUs not have that block in similar proximity to their mem I/O? In Llano it's on the complete opposite side. And we know it's important that the ROPs have a position like that. It suggests that Latte's W blocks are there by coincidence and are not ROPs. And from there, if W is not ROPs, then U is also not L2 cache, because in Llano the same block is also on the opposite side of the mem I/O. So it can be said we now have two sets of duplicates that are free to be considered as something else in the pipeline.

Now let's look at Fourth's view on the TMUs.

TMUs: T blocks. Again, I've already explained why I think this, but it also makes sense for them to be close to the DDR3 I/O.
L1 caches: S blocks. 8 kB each.

Going back to Llano again, there is also a block like T with similar SRAM running horizontally instead of vertically as in Latte. This block is also on the opposite side of the mem I/O. This would also suggest the S blocks are not L1 caches. That means Latte more than likely does not have 8 TMUs. And to consider T as the TMUs would mean that this is the first GPU I've seen where the TMUs are not with the SIMDs.

In turn this would mean four of the duplicates he labeled are up for debate as to what they are, which then brings up the question of where the ROPs and TMUs are located.

So I hope you see that even with the main components of the GPU, the other blocks are important.
 
I think 160 ALUs depends on the efficiency Wii U achieves and just how bad Xenos' efficiency is with its 240 ALUs. Otherwise the debate has come down to "Wii U is pushing near the best it can do simply because Xenos is and they are very comparable." That would be perfectly natural to assume if both boxes had been on the market for 4 or 5 years, but we know for a fact that Wii U has had development issues on all the titles released, thanks to bad dev tools and final hardware not being given out until these launch titles had already gone gold.

The argument has come down to whether Wii U can achieve better performance with those negatives and 160 ALUs, or if it needs 256+ ALUs to achieve these things. Whether those are realistic is in the eye of the beholder, because no one here is actually knowledgeable enough to really tell us what can be achieved on 160 ALUs. We have some bad benchmarks that are apples-to-oranges comparisons, but we lack real comparisons because games, resolution, AA and in-game settings differ across them.

As I've been saying, 160 ALUs is probably possible, but it requires some customization to achieve those results, at least with the development environment that Wii U developers had to deal with, at least up until GDC this year (when the SDK was updated).
 
Only problem is that the 160 shaders have little to do with power. That has to do with shading capability, which is usually the only thing that is really improved in Wii U ports besides texture detail.

160 shaders does not produce these types of results.

You what? Processor count in a GPU is a massive determinant of power. What do you think Crossfire/SLI is? It's also a little strange how you believe you can determine from a screenshot the number of shader processors required to render it. The lighting changes in the NFS screens aren't especially drastic and look to be variable tweaks more than anything else. If the 360's GPU can do it, a 160-ALU Latte could too. The update to Trine was a fix for a gamma issue I believe, not a real increase in rendering fidelity.
 

I think the more obvious comparison is 360 games that have been ported to Wii U with identical graphics, while splitting resources to the gamepad as well as dealing with the Wii U development environment I mentioned in my last post. Something like AC3, for instance, being on Wii U does prove IMO that Wii U is more capable than the 360. Power isn't the discussion we are having; it is whether or not 160 ALUs is enough to achieve that with unfinished hardware and bad development tools and APIs. It does have to overcome quite a bit of a disadvantage. Some say that SM4 with VLIW5 is enough to achieve that, based on higher-clocked SM5 GPUs with VLIW5 running similar but different software without AA and with different settings, but I am personally hesitant to use that example as what Wii U could do with 160 ALUs unless it is fairly different from VLIW5 as it stands in R700.
 

Krizzx said ALU count has to do with shading capability but little to do with power, which is meaningless nonsense. A debate on whether 160 ALUs is enough is a debate about power: if you don't believe devs with bad tools could produce AC3 on a 160 shader Latte you believe the chip must be more powerful than that.

I'm not coming down on one side or the other with regards to the ALU count because I don't know. Personally, I believe the results we've seen so far imply the GPU is a bit beefier than Xenos/RSX, but that could be wrong and it could just be the memory architecture allowing whatever small improvements have been made.
 

I did? I thought my post was solely about shading capabilities and this constant insistence by some posters on proving the claim made by the person identified as "an anonymous dev" who said the Wii U had fewer shaders and was less capable.



I really like that last comment from Criterion on their NFS:MW interview though.
http://www.eurogamer.net/articles/digitalfoundry-need-for-speed-most-wanted-wii-u-behind-the-scenes

"The Wii U has had a bit of a bad rap - people have said it's not as powerful as 360, this, that and the other. That, by and large, has been based on apples to oranges comparisons that don't really hold water. Hopefully we'll go some way to proving that wrong," he says.

Also
A selection of in-game screenshots taken from the Wii U version of Most Wanted. High-res PC textures are the headline addition, but Criterion has also improved night-time lighting after employing new staff who previously worked in the motion picture business. In contrast to the other versions, the game begins at night perhaps to highlight the difference.
Verified improved shading, whether you want to acknowledge it or not.

"The difference with Wii U was that when we first started out, getting the graphics and GPU to run at an acceptable frame-rate was a real struggle. The hardware was always there, it was always capable. Nintendo gave us a lot of support - support which helps people who are doing cross-platform development actually get the GPU running to the kind of rate we've got it at now.
Does this not mirror the issues brought up with Mario and Sonic at the Olympics and Epic Mickey 2?

I base my assessments on the facts. I will take a verified, experienced dev's professional statements over some unverified anonymous person who calls himself a dev and talks down about the hardware in a fanboyish way that shows no technical understanding and is completely lacking in specific details, any day of the week and any time of year.

Should I bring up Deus Ex Director's Cut as well? Though I will quote the dev as making the statement "We could have easily ported the game, washed our hands and that's it." I have no doubt in my mind that this was the case with the majority of the Wii U ports.
http://www.denofgeek.com/games/deus...an-revolution-directors-cut-behind-the-scenes



Now, as for the 160 shaders. I do not see it. I am with BG on the 252 minimum. I don't believe in making assessments off of pure ethereal data that I've conjured up on my own without some way of verifying.

If 160 shaders can outperform 240, how? All I ask is detail. Even a simple comparative explanation with other parts would suffice. I've seen none of this. All I see is "because it's more modern". That doesn't mean anything. What aspect of it being more modern makes it achieve these results? Is it using a more modern form of magical electricity that causes silicon to achieve near 2X performance?
 
Krizzx said ALU count has to do with shading capability but little to do with power, which is meaningless nonsense. A debate on whether 160 ALUs is enough is a debate about power: if you don't believe devs with bad tools could produce AC3 on a 160 shader Latte you believe the chip must be more powerful than that.

I'm not coming down on one side or the other with regards to the ALU count because I don't know. Personally, I believe the results we've seen so far imply the GPU is a bit beefier than Xenos/RSX, but that could be wrong and it could just be the memory architecture allowing whatever small improvements have been made.

Just pointing out, I didn't mention Krizzx and wasn't talking to his points, it was to show that Wii U is likely more powerful than Xenos, the current debate in this thread is whether or not 160 ALUs is enough to achieve that, and my statement is that it depends on how efficient Xenos' 240 ALUs are. The clocks don't speak to much of a difference either way, so you have to put forward the architectures to see if R700 can achieve superior performance to Xenos with only 2/3rd of the ALU count.
 
I did? I thought my post was solely about shading capabilities and this constant insistence by some posters on proving the claim made by the person identified as "an anonymous dev" who said the Wii U had fewer shaders and was less capable.
You did. Perhaps you worded the sentence poorly.
Only problem is that the 160 shaders has little to do with power. That has to do with shading capability.
Shading capability is power. If the anonymous dev meant that it has 160 ALUs to Xenos' 240, then that's a comment on the chip's power. Similarly, if they had said that it had more shaders (320), that would imply that Latte has more processing power (352 GFLOPS). However, I would take all such comments with a grain of salt, and even if the ALU count is 160 (a figure which is looking more and more concrete) I don't think it alone tells the full story.
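For reference, here is the back-of-envelope arithmetic behind figures like the 352 GFLOPS above, written out as a tiny script. It assumes the commonly reported clocks (550 MHz for Latte, 500 MHz for Xenos) and the usual convention of counting two FLOPs per ALU per cycle for a multiply-add; none of this is confirmed spec, just the standard way these peak numbers get derived.

```python
# Peak shader throughput = ALUs x FLOPs per ALU per cycle x clock.
# Clocks are the commonly reported values, not official figures.
def peak_gflops(alus, clock_mhz, flops_per_cycle=2):  # 2 = one multiply-add
    return alus * flops_per_cycle * clock_mhz / 1000.0

print(peak_gflops(160, 550))  # 176.0 -> a 160-ALU Latte
print(peak_gflops(320, 550))  # 352.0 -> the 320-ALU scenario mentioned above
print(peak_gflops(240, 500))  # 240.0 -> Xenos, for comparison
```

These are theoretical peaks only; as discussed further down the thread, real-world utilization is what the 160-vs-320 debate actually hinges on.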

Verified improved shading, whether you want to acknowledge it or not.
Once again, such improvements are entirely possible by changing existing lighting variables without overhauling the shading system. Not saying they haven't, but if the improvements were more dependent on tech than they were art I imagine they would have trumpeted the fact more. I'd also imagine that the differences between the 360/Wii-U screens would be a bit more impressive.

Do I need to bring up Deus Ex Director's Cut as well? Though I will quote the dev as making the statement "We could have easily ported the game, washed our hands and that's it." I have no doubt in my mind that this was the case with the majority of the Wii U ports.
http://www.denofgeek.com/games/deus-...ind-the-scenes

This has been mentioned before. The shading system was improved for the DLC and the team backported these improvements to the original game for the Wii-U version. It's a bit like how, for a time, Half-Life 2 had HDR on PS3 and 360 whilst the PC version retained its original lighting, despite being more than capable of it.

Just pointing out, I didn't mention Krizzx and wasn't talking to his points, it was to show that Wii U is likely more powerful than Xenos, the current debate in this thread is whether or not 160 ALUs is enough to achieve that, and my statement is that it depends on how efficient Xenos' 240 ALUs are. The clocks don't speak to much of a difference either way, so you have to put forward the architectures to see if R700 can achieve superior performance to Xenos with only 2/3rd of the ALU count.

My post was a direct response to Krizzx though and you quoted it, so the context got a bit convoluted. But yeah, I understand/agree. There's also the question of how close the processors are to those found in a vanilla R700. We may never fully know. Nintendo has had the "don't release specs" policy for a while now (you could say it started with the 'real world' figures given for the GCN) but the Wii-U is the first system where it's really played out. The specs released for the Cube laid the groundwork for the Wii (and even Espresso) and the DS/3DS had specs and chip names given out, but now the public is stuck with leaks and die shots.
 
Just pointing out, I didn't mention Krizzx and wasn't talking to his points, it was to show that Wii U is likely more powerful than Xenos, the current debate in this thread is whether or not 160 ALUs is enough to achieve that, and my statement is that it depends on how efficient Xenos' 240 ALUs are. The clocks don't speak to much of a difference either way, so you have to put forward the architectures to see if R700 can achieve superior performance to Xenos with only 2/3rd of the ALU count.

A 66% gain from Xenos to the 720 GPU.

http://www.neogaf.com/forum/showthread.php?p=46707523#post46707523
 
From this thread: http://www.neogaf.com/forum/showthread.php?t=559196

Silicon studio comments on developing with the Wii U. Not sure if you guys read it yet. They said, "...Wii U has very specific characteristics. Some game designers will like it. Some others will have a hard time to port their game. There are pros and cons. We are very close to Nintendo, so we were working on Wii U for a long time. We almost got the maximum performance with the hardware. Since we are working closely with the Nintendo support team they gave us a lot of useful information."
 
He thought they were talking about the PS4 compared to the X720. That's why he said the same AMD base.

Oh ok. Aegies' clarification and Durante's response were on the next page.

"WTF, this changes everything.

I'd hope it's way more efficient than Xenos! But so should every GPU AMD puts out in 2013. And put out in 2012.
"

I'm now curious what his opinion is on this subject.
 
From this thread: http://www.neogaf.com/forum/showthread.php?t=559196

Silicon studio comments on developing with the Wii U. Not sure if you guys read it yet. They said, "...Wii U has very specific characteristics. Some game designers will like it. Some others will have a hard time to port their game. There are pros and cons. We are very close to Nintendo, so we were working on Wii U for a long time. We almost got the maximum performance with the hardware...".

I consider this part interesting as well.

Ian Graham, Principal Engineer: I think it was a bit of having a headstart and there was a lot of continuity from the Wii in terms of architecture. They added a significant amount of horsepower, but there was no revolution needed at the engine level to take advantage of it.

As expected, those with experience working on the Wii seem to have a major advantage over others who did not. At a design level, Wii U is really like a "GameCube 2".
 

Thanks, that gives us some solid numbers to go by. If Wii U were using the Durango GPU/architecture (it isn't, obviously), which is based on GCN (HD 7000 series), 160 ALUs @ 550MHz would produce very similar performance to 292 ALUs from Xenos.

Obviously the problem here is that Wii U is likely not using GCN, but it is possible for 160 ALUs to outperform Xenos, thanks to Xenos actually being fairly inefficient by today's standards, even after 8 years of digging deep into the architecture.

Wii U needs at least a 40% efficiency jump from Xenos to perform at the same level (that includes the 10% higher clock). Of course, Wii U probably needs a bit more than that considering it does lose resources to the gamepad. Figuring for that, it would be easy enough to guess that it needs to actually have a 50% efficiency improvement over Xenos's ALUs, which we should assume is probably the peak VLIW could offer (otherwise AMD would not have bothered with GCN). So with these new numbers:

Minimum Wii U has 160 ALUs or more that perform over 40% better than Xenos ALUs. Assuming they have 50% increase over Xenos, we are looking at the equivalent of 264 Xenos ALUs. That should more or less be the bare minimum of what Wii U can have IMO considering exact ports pushing extra shader resources to the gamepad.
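Just to make the arithmetic above explicit, here is the same equivalence written as a quick sketch. The efficiency multipliers are the assumptions from this post (and the 66% Durango figure quoted earlier), not measured values, and the 550/500 MHz clocks are the commonly reported ones.

```python
# Rough Xenos-equivalence model used in the post above: effective ALU count =
# physical ALUs x (1 + per-ALU efficiency gain) x clock ratio vs Xenos.
def xenos_equivalent(alus, efficiency_gain, clock_mhz=550, xenos_clock_mhz=500):
    return alus * (1 + efficiency_gain) * (clock_mhz / xenos_clock_mhz)

print(round(xenos_equivalent(160, 0.40)))        # 246 -> barely above Xenos' 240
print(round(xenos_equivalent(160, 0.50)))        # 264 -> the figure in the post
print(round(xenos_equivalent(160, 0.50)) - 240)  # 24 Xenos-equivalent ALUs left for the gamepad
print(round(xenos_equivalent(160, 0.66)))        # 292 -> the Durango-style efficiency case
```

Whether VLIW5 (customized or not) can actually hit a 40 to 50% per-ALU gain over Xenos is exactly what's being argued, so treat the numbers as a framing device rather than evidence either way.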
 
Precisely. I have dropped the logic that "X way of doing things has been shown to be demonstrably better than Y, thus Nintendo must have gone with X in Latte!" This applies to things like hardware interpolation and the VLIW5 architecture in general. It might not make sense to us as technology enthusiasts, but it's not unfathomable that it made sense to Nintendo at some point in time, for whatever Nintendo reasons they have (price, simplicity, "good enough" mentality).

This is an interesting observation, and hints at why tech speculation regarding WiiU has been so wrong again and again. The main problem is the huge time gap between the R700 base and today. It's 2013 and we're applying years of experience and second-guessing Nintendo's engineering decisions. AMD's gone from VLIW5 to VLIW4 to GCN and is already laying out GCN 2.0.

We have the benefit from both hindsight (lessons learned about the efficiency of VLIW5 versus VLIW4) and foresight (looking ahead to PS4/XB720 and the needs of future games). Using that information, it's hard to imagine coming up with what Nintendo did. But they didn't know then what we know now!

Everything is a tradeoff, and engineering constraints certainly change with the times. In 2008, VLIW5 was pretty good. To compete with PS3/X360, it was a sensible decision. VLIW5-style vertex shaders are ideally suited to DirectX9-era games. For DX10+, not so much. And for GPGPU, it's very troublesome compared to SIMD GCN.

Timing matters. Gamecube was an efficient and powerful design due in no small part to having the 180nm process node ready. PS2 had to launch with 250nm design rules and the huge eDRAM left little room on the GS for advanced effects. GC was eDRAM done right. NEC could fit twice the RAM at 180nm in the same space as Sony could at 250nm. WiiU could have been much more if designed for a 32nm or 28nm node.

It seems to me like something went wrong with Nintendo's project management. WiiU's hardware hints at a console that should have launched earlier than holiday 2012. IMO, it was feature-locked much earlier than past console designs. Maybe Nintendo had trouble coming up with a gameplay hook and went through many ideas before committing to the gamepad. Or they expected another Wii phenomenon and wanted to ensure enough units could be produced.
 

Yeah, this is actually known from the Iwata Asks: they had trouble with the wireless streaming. It was likely supposed to be a 2011 console, and it wouldn't have surprised me if it was delayed to 2012 because of software and technical problems with the streaming that weren't ironed out until 2012, iirc. The drought from that would be pretty intense; however, third parties might have come on board earlier, and Nintendo might have delayed some Wii software to put it on Wii U, like they had done with Twilight Princess.
 
I apologize if I have been a bit touchy lately in this thread. GPU speculation is nothing more than a fun hobby, after all. You could even call it a somewhat bizarre (nerdy) offshoot of gaming in general. Admittedly, it is slightly frustrating that the possibility of a 160 shader Latte is dismissed so hastily (without proposing some type of massive architectural overhaul). I, personally, would love for there to be more going on there, but the deeper I have dug, the less likely it seems.
I just proposed an architectural change that would work very well with the 160 SU theory: Thread interleaving. Running 320 or 640 concurrent threads on 160 shader units. See this presentation, starting at page 31: http://s08.idav.ucdavis.edu/fatahalian-gpu-architecture.pdf

If that's even necessary with all the embedded memory available. Stalls typically occur during VRAM reads after all, and with ultra low latency local storage, there shouldn't be all that many stalls in the first place compared to traditional GPUs.
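For anyone unfamiliar with the thread-interleaving idea mentioned above, here is a deliberately simplified toy model of why it helps: if enough thread groups are resident on a SIMD, a group stalled on a memory read can be swapped out for one that is ready, keeping the ALUs busy. The cycle counts are made up purely for illustration and are not tied to any real Latte figure.

```python
# Toy steady-state model of latency hiding via thread interleaving.
# Each resident group alternates between `compute_cycles` of ALU work and a
# `mem_latency` stall; a round-robin scheduler issues from whichever group is ready.
def alu_utilization(resident_groups, compute_cycles=8, mem_latency=100):
    period = compute_cycles + mem_latency          # one group's work/stall cycle
    busy = min(resident_groups * compute_cycles, period)
    return busy / period                           # fraction of cycles the ALUs do work

for groups in (1, 2, 8, 16):
    print(groups, round(alu_utilization(groups), 2))
# 1 -> 0.07, 2 -> 0.15, 8 -> 0.59, 16 -> 1.0 (latency fully hidden)
```

It also illustrates the eDRAM point: shrink `mem_latency` and far fewer resident groups are needed to stay near full utilization, which is presumably why the embedded memory matters to this argument.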
 
This is an interesting observation, and hints to why tech speculation regarding WiiU has been so wrong again and again. The main problem is the huge time gap between the R700 base and today. It's 2013 and we're applying years of experience and second guessing Nintendo's engineering decisions. AMD's gone from VLIW5 to VLIW4 to GCN and already laying out GCN 2.0.

We have the benefit from both hindsight (lessons learned about the efficiency of VLIW5 versus VLIW4) and foresight (looking ahead to PS4/XB720 and the needs of future games). Using that information, it's hard to imagine coming up with what Nintendo did. But they didn't know then what we know now!

Everything is a tradeoff, and engineering constraints certainly change with the times. in 2008, VLIW5 was pretty good. To compete with PS3/X360, it was a sensible decision. VLIW5-style vertex shaders are ideally suited to DirectX9 era games. For DX10+, not so much. And for GPGPU, it's very troublesome compared to SIMD GCN.

Timing matters. Gamecube was an efficient and powerful design due in no small part to having the 180nm process node ready. PS2 had to launch with 250nm design rules and the huge eDRAM left little room on the GS for advanced effects. GC was eDRAM done right. NEC could fit twice the RAM at 180nm in the same space as Sony could at 250nm. WiiU could have been much more if designed for a 32nm or 28nm node.

It seems to me like something went wrong with Nintendo's project management. WiiU's hardware hints at a console that should have launch earlier than holiday 2012. IMO, it was feature-locked much earlier than past console designs. Maybe Nintendo had trouble coming up with a gameplay hook and went through many ideas before committing to the gamepad. Or they expected another Wii phenomenon and wanted to ensure enough units could be produced.

Had Nintendo been managing their projects correctly, the follow-up to the Wii would have been released only 4 years after the Wii. Yes, at the height of its popularity. Instead, Nintendo waited until the Wii's corpse was already cold and the PS4/Nextbox were just on the horizon. No lessons were learned from the Dreamcast there; you do NOT launch a console one year out of sync with the cycle! Especially one which only apparently brings you to parity with what exists in the current cycle, just as the next cycle is about to bring a significant hardware upgrade! Sega got killed by being out of sync with the Dreamcast, and MS nearly got killed too by the RRoD when they rushed the 360 out a year before the hardware was truly ready. It cost MS $1 billion to recover from RRoD, and they might still finish out the generation in last place worldwide because of PS3's miraculous turnaround.
 
I half agree, but releasing early did nothing but good for the 360. You're right about everything else though and it should have been released earlier (2010 maybe).
 
Even if the GPU was ready to launch in 2011, what good would it do when they couldn't even get the software ready for late 2012?

Hell, we're approaching mid-2013 and they're not even there yet.


As TKM said, something went very, very wrong. There's signs of genius all over the place, but they couldn't bring it all together into an effective product.
 
Honestly? Because you're talking about no single part of the console being at parity with the 360. Every part to a degree coming up lacking in comparison. The thing would struggle to run a 360 port at all. Let alone outpace it.

What's wrong with this:

Minimum Wii U has 160 ALUs or more that perform over 40% better than Xenos ALUs. Assuming they have 50% increase over Xenos, we are looking at the equivalent of 264 Xenos ALUs. That should more or less be the bare minimum of what Wii U can have IMO considering exact ports pushing extra shader resources to the gamepad.

Is that not possible? I'm just trying to understand why 160 is impossible.
 

That is still getting very close to XB3's efficiency with GCN in a console, versus Wii U's efficiency with VLIW5... Honestly, it's impossible for vanilla VLIW5 to reach an efficiency similar to GCN in a console, and as someone else pointed out, VLIW5 is fine for DX9-type effects, but when things became more complex, VLIW5 was inefficient at those tasks; they were better handled by VLIW4, and much better once AMD moved to thread-level parallelism, which is what I think they would have to do with customization to hit that sort of efficiency.

The fact is, though, that that is what is needed with 160 shaders. It's possible, but it's unlikely.
 
The line of thinking seems to be that, if the Wii U version of a GPU heavy game outperforms the Xbox360 version, it needs at least as many GFLOPS. And to get there, one would need a certain amount of shader units. Makes sense, right?

Except it doesn't, because traditional GPUs are apparently quite inefficient. Reportedly mostly as a result from branching issues, which can reduce the overall real world performance of a GPU by as much as ~85%, and stalls during texture reads, which can take hundreds or thousands of cycles. That's the problem with GFLOPS figures - they're highly theoretical and nowhere near the actual performance you'll get under real workloads. So essentially, if Nintendo managed to eliminate just one of those bottlenecks, the GFLOPS comparison becomes pretty much meaningless.
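To make the divergence point concrete, here is a small illustrative model of how branching eats into those theoretical GFLOPS on a SIMD/VLIW-style GPU: every branch path taken by at least one lane in a wavefront costs the whole wavefront its cycles, with the non-participating lanes masked off. The 64-wide wavefront and the cycle counts are just example values; the ~85% figure above is the poster's claim, not something this sketch proves.

```python
# Illustrative SIMD branch-divergence model: useful work is only the unmasked
# lanes on each path, but every taken path costs its full cycle count.
def simd_efficiency(path_cycles, lanes_per_path, wavefront_width=64):
    total_cycles = sum(path_cycles)
    useful = sum(cycles * lanes / wavefront_width
                 for cycles, lanes in zip(path_cycles, lanes_per_path))
    return useful / total_cycles

print(simd_efficiency([10, 10], [32, 32]))  # 0.5  -> two evenly split branches
print(simd_efficiency([10, 90], [60, 4]))   # 0.15 -> a rare but expensive branch path
```

Texture-read stalls compound this further unless they can be hidden, which is where the interleaving and low-latency eDRAM arguments from earlier come back in.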
 
The line of thinking seems to be that, if the Wii U version of a GPU heavy game outperforms the Xbox360 version, it needs at least as many GFLOPS. And to get there, one would need a certain amount of shader units. Makes sense, right?

Except it doesn't, because traditional GPUs are apparently quite inefficient. Reportedly mostly as a result from branching issues, which can reduce the overall real world performance of a GPU by as much as ~85%, and stalls during texture reads, which can take hundreds or thousands of cycles. That's the problem with GFLOPS figures - they're highly theoretical and nowhere near the actual performance you'll get under real workloads. So essentially, if Nintendo managed to eliminate just one of those bottlenecks, the GFLOPS comparison becomes pretty much meaningless.

This, in a general sense, is how I understand modern CPUs to be evolving. Efficiency is a huge focus for a company like Intel, and they get a hell of a lot more out of the same number of transistors now than they did in 2005. So I'm just guessing that it is possible the same can be said for AMD's GPUs. I'm not saying that is the case, but 4thstorm's analyses maybe shouldn't be discounted as impossible so quickly.
 
The line of thinking seems to be that, if the Wii U version of a GPU heavy game outperforms the Xbox360 version, it needs at least as many GFLOPS. And to get there, one would need a certain amount of shader units. Makes sense, right?

Except it doesn't, because traditional GPUs are apparently quite inefficient. Reportedly mostly as a result from branching issues, which can reduce the overall real world performance of a GPU by as much as ~85%, and stalls during texture reads, which can take hundreds or thousands of cycles. That's the problem with GFLOPS figures - they're highly theoretical and nowhere near the actual performance you'll get under real workloads. So essentially, if Nintendo managed to eliminate just one of those bottlenecks, the GFLOPS comparison becomes pretty much meaningless.

Right, which is why we are trying to look at efficiency. USC-fan just posted on the last page that Durango ALUs will achieve 66% better efficiency over Xenos, which gives us something to actually go on. It points to Wii U needing at least 40% better efficiency to match Xenos, and realistically it needs something like 50% better efficiency to achieve identical performance on the TV while leaving some shader resources for the gamepad. At 50% better efficiency, that leaves the equivalent of 24 Xenos shaders for the gamepad. Reasonable as that may be, it is getting very close to Durango's efficiency as stated by Microsoft, and that is using thread-level parallelism and the GCN architecture to achieve it. I'm not so sure VLIW5 can achieve those numbers.
 
Honestly? Because you're talking about no single part of the console being at parity with the 360. Every part to a degree coming up lacking in comparison. The thing would struggle to run a 360 port at all. Let alone outpace it.
Depends how far you think 2x as much RAM, 3x as much eDRAM and a more modern featureset takes you.
 
To a place with better texturing, potentially better transparency effects, and better lighting solutions, while coming up short in triangles and shadow resolution (while still being higher precision). All around similar performance in a purely visual sense, but potentially rendering to a second set of 307,000 pixels in need of shading, visual effects, yada. That could potentially kneecap the console.
 
Depends how far you think 2x as much RAM, 3x as much eDRAM and a more modern featureset takes you.

Before we had Durango's efficiency figure, that was a guessing game. However, Durango also has 3x as much eDRAM, an even more modern featureset and a more efficient architecture, and only achieves 66% better efficiency from its ALUs, making this much less of a guessing game. We have to assume Nintendo basically overcame all of those negatives and achieved ~50% better efficiency on VLIW5 in order to just edge out Xenos with 160 ALUs. I'd assume that is possible, but you are still dealing with bad dev tools and unfinished hardware when it comes to launch titles like AC3 and CoD... It does seem a bit of a stretch, so I don't think a more custom design is out of the question. A developer has already said as much.
 
I've been thinking about the RAM bandwidth (and I dare say this thought's already been brought up before, but I'm bored). The 360's GDDR3 is a lot faster than the U's DDR3, but the 360 is split 50/50 between read and write whereas the Wii U is completely flexible. In real-world situations wouldn't there be far more reads from it than writes (though the 360's limited eDRAM would necessitate extra writes the Wii U doesn't need), thus negating most (if not all) of this bandwidth issue?
 
We've never seen confirmation that Latte is in fact VLIW5, as far as I know. It could be an early GCN-like design for all we know, just like Xenos was some weird in-between thing.
 
I've been thinking about the RAM bandwidth (and I dare say this thought's already been brought up before, but I'm bored). The 360's GDDR3 is a lot faster than the U's DDR3, but the 360 is split 50/50 between read and write whereas the Wii U is completely flexible. In real-world situations wouldn't there be far more reads from it than writes (though the 360's limited eDRAM would necessitate extra writes the Wii U doesn't need), thus negating most (if not all) of this bandwidth issue?
Probably why it never really comes up as an issue.
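Taking the post's 50/50-split premise at face value, here is a quick sketch of why a flexible bus can close part of the raw-bandwidth gap on a read-heavy workload. The 22.4 GB/s (360 GDDR3) and 12.8 GB/s (Wii U DDR3) figures are the commonly cited peaks, and the 80% read mix is just an example workload, not a measurement.

```python
# Usable bandwidth under a given read/write mix. With a fixed split, whichever
# side saturates first caps total traffic; a flexible bus just uses its full peak.
def usable_bandwidth(total_gb_s, read_fraction, fixed_read_share=None):
    if fixed_read_share is None:            # fully flexible read/write allocation
        return total_gb_s
    read_cap = total_gb_s * fixed_read_share
    write_cap = total_gb_s * (1 - fixed_read_share)
    return min(read_cap / read_fraction, write_cap / (1 - read_fraction))

reads = 0.8  # assume a read-heavy frame: 80% reads, 20% writes
print(usable_bandwidth(22.4, reads, fixed_read_share=0.5))  # 14.0 -> 360 under the post's premise
print(usable_bandwidth(12.8, reads))                        # 12.8 -> Wii U, fully flexible
```

Under these made-up assumptions the effective gap shrinks from 22.4-vs-12.8 to 14-vs-12.8, which is roughly the intuition behind "it never really comes up as an issue", though none of the premises here are confirmed.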
 