AMD Polaris architecture to succeed Graphics Core Next

I'm going to be upset if SLI isn't utilized more in VR


I'm always going to SLI the cut down Titan regardless of VR, so I hope to get more use from it
 
I'm going to be upset if SLI isn't utilized more in VR


I'm always going to SLI the cut down Titan regardless of VR, so I hope to get more use from it

Standard SLI itself is entirely unsuitable for VR. You're going to see a wider variety of multi-GPU mechanisms from now on (if multi-GPU is supported at all).
 
This is definitely disappointing. 4096 is Fiji's number, and this basically means that P10 will likely be a Hawaii "port". Well, I'll still hope for faster h/w for now since nothing is officially confirmed yet.

The only way it wouldn't be disappointing is if this 4096 sp Vega GPU turns out to be the smaller Vega 11, rather than Vega 10.
 
The only way it wouldn't be disappointing is if this 4096 sp Vega GPU turns out to be the smaller Vega 11, rather than Vega 10.
The numbers are totally worthless.
Vega 11 could be smaller or it could be bigger; according to Koduri the numbers have nothing to do with the chips' potential size.

Right now I think Vega 10 = Greenland = small Vega (2 HBM stacks, 2048-bit interface)
Later on Vega 11 = big Vega (4 HBM stacks, 4096-bit interface)
 
ATW SLI is probably the most obvious use of SLI.

ATW? Asynchronous Time Warp?

That is somewhat independent of VR SLI; you need to do that regardless. VR SLI makes SFR the basis upon which you get scaling across GPUs.

The key difference here is that it requires developers to support the API, unlike standard SLI, which is a "pure" driver hack.
 
I don't know why you keep saying it's disappointing simply based on the number of shaders. Maxwell GPUs had fewer CUDA cores than their Kepler equivalents but in practice were much faster, due to higher clocks and shader efficiency gains. It's not out of the question that the same will happen with Polaris.

Not really true. It kind of seemed that way with how Maxwell was rolled out to the market, but in practice the complexity of Maxwell chips actually increased per SP compared to Kepler:

GK104 - 1536 SPs - 3540M transistors | GM204 - 2048 SPs - 5200M transistors
1.33x the SPs for 1.47x the transistors.

GK110 - 2880 SPs - 7080M transistors | GM200 - 3072 SPs - 8000M transistors
1.07x the SPs for 1.13x the transistors.

You are correct in saying that Maxwell SPs, while essentially the same in number, have provided considerably more performance - as can be seen from how GK110 compares to GM200. This is a possible scenario, but the thing is I don't really trust AMD to be able to increase GCN performance that much through mere optimizations. +10-20%, sure, but that is exactly what I'd call a disappointing result for a card which will be launched nearly two years after Fiji.
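Just to sanity-check those ratios, here's a throwaway Python sketch using the figures quoted above (nothing official, just the arithmetic spelled out):

```python
# Rough check of SP-count vs. transistor-count growth, Kepler -> Maxwell.
# Figures are the ones quoted above (SPs, millions of transistors).
chips = {
    "GK104 -> GM204": ((1536, 3540), (2048, 5200)),
    "GK110 -> GM200": ((2880, 7080), (3072, 8000)),
}

for label, ((sp_old, tr_old), (sp_new, tr_new)) in chips.items():
    sp_ratio = sp_new / sp_old
    tr_ratio = tr_new / tr_old
    per_sp = (tr_new / sp_new) / (tr_old / sp_old)
    print(f"{label}: {sp_ratio:.2f}x SPs, {tr_ratio:.2f}x transistors, "
          f"{per_sp:.2f}x transistors per SP")

# GK104 -> GM204: 1.33x SPs, 1.47x transistors, 1.10x transistors per SP
# GK110 -> GM200: 1.07x SPs, 1.13x transistors, 1.06x transistors per SP
```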


You're correct in that we don't know yields, but we can infer a lot from what we do know, which is the timescale of confirmed product launches on 14nm/16nm, and likely pricing information. The point of my post is that all the publicly confirmed information we have points to a very slow yield growth and a process which will still be expensive even for relatively small dies for the rest of the year.
The problem of newer processes is not yields per se but the cost of wafers in general. So even if you have 100% working chips, the cost of each one will still be very high. That's the reason both IHVs are waiting even though both processes have been used for production of smaller chips since last year - the cost of wafers goes down with time, making the production of bigger chips economically sound. This isn't about yields as much as it is about wafer costs.

The first is Nvidia's G92, which was first produced on a 65nm process, and later shrunk to a 55nm process. The die was 324mm² on 65nm and 260mm² on 55nm, so that's a 19.8% reduction in size over a single node jump (28nm to 14nm is of course a two node jump). The second is AMD's RV790, which was a 282mm² die on 55nm. It was replaced by Juniper, which wasn't a direct die shrink, but shared the same architecture (Terascale 1) and ALU configuration (although it did have a narrower 128-bit GDDR5 interface, which would have saved some die space). Juniper was 170mm² on a 40nm process, so that was a 39.7% shrink (probably closer to 35% accounting for the smaller memory interface).
Both G92 and RV790 (and GT200B and some other GPUs) were pure "die shrinks" though, as the 55nm process was just that relative to 65nm - you could take the existing 65nm design and produce it on 55nm, enjoying the size and power benefits straight away. This isn't how it's going to be with 28->14/16; they'll have to redesign the chips to produce them on these new lines.

Based on the evidence available to us, a 44CU Polaris die would likely exceed 200mm², perhaps by a large margin. It would certainly outperform Hawaii, although to what extent is impossible to predict.
The issue here is that I don't see much benefit for them in actually keeping the same unit numbers for P10 and V10, as the chips will have to be heavily redesigned anyway for both the new production process and the updated GCN4 architecture. Why would they leave the number of CUs/SPs the same while doing all this? It just seems like a missed opportunity, as even a modest increase in the number of SIMDs would grant them performance wins across the board, in addition to whatever the optimizations will bring (which may well be load-specific and not show up in all games on the market).

What AMD usually start on a new process with is irrelevant, as this isn't 28nm or 40nm or any older process. If it was, we'd already have a full range of Polaris/Vega and Pascal GPUs on store shelves.
I think it's quite relevant, as it shows the sweet spot for the last several generations. Unless 14nm is going to be completely different from the several previous generations for some reason, I don't see why it should result in different chip sizes at first. So expecting ~300mm^2 is actually based on how it went historically, while expecting ~200mm^2 is something rather new which hasn't really happened often before (RV670 and G92 are the only two dies on new process lines which were smaller than 200mm^2, I believe).

That "marketing slide" is pretty much the only piece of confirmed information we have (i.e. not rumour) on the timescale for the rollout of AMD's new GPUs. And that slide appears to show Polaris releasing in late 2016 and Vega in early/mid 2017.

(Regarding the perf/W, I would imagine the higher figure for Vega is largely due to its use of HBM2 compared to more power-hungry GDDR5(X) for Polaris).
That's the issue right there. We know that there will be two Polaris GPUs and two Vega GPUs. If both Vega GPUs are using HBM2 then it would mean that both of them will be faster than both Polaris GPUs - otherwise it makes no sense. So either both Vegas will be above P10 - which is possible if P10 is just on Hawaii's level of performance - or one Vega won't actually use HBM2 and won't really provide any perf/watt increase compared to Polaris.

The second option seems more likely to me and that's why I'm saying that we shouldn't read too much into this slide as it is pure marketing.

The only way it wouldn't be disappointing is if this 4096 sp Vega GPU turns out to be the smaller Vega 11, rather than Vega 10.

On the contrary that would be good as that would mean that Vega 10 will be a bigger chip with more than 4096 SPs.
Edit: misread your post - yeah, you're right, that's also a possibility.
 
The problem of newer processes is not yields per se but the cost of wafers in general. So even if you have 100% working chips, the cost of each one will still be very high. That's the reason both IHVs are waiting even though both processes have been used for production of smaller chips since last year - the cost of wafers goes down with time, making the production of bigger chips economically sound. This isn't about yields as much as it is about wafer costs.

Wafer costs are certainly an issue, but yields are most definitely the major sticking point for sub 20nm FinFET processes. If it was just wafer costs rather than yields, then we wouldn't be seeing mobile SoCs for some time, as they're a highly competitive, low-margin industry. They're the first out of the gate on Samsung's 14nm and TSMC's 16nm, though, because they're small dies, and on a low-yield process that means they're a lot more viable than larger dies, even if they don't command anywhere near the price per mm². If yields weren't an issue on 14nm, then big server CPUs like IBM's POWER9 would be the very first chips off the production lines, as they effectively command the highest revenue per wafer of anything you'll come across once yields come out of the equation. With yields in the equation, though, they're not due for another year, because with poor yields dies that large just aren't an option, regardless of how much money you can charge for them.

Intel is worth looking at as well. They're working up against the same physical limits as Samsung, Global Foundries and TSMC, they're using largely the same 193nm lithography methods and they seem to be having exactly the same yield issues. Intel has no problem selling small (sub 100mm², although I haven't found a confirmed figure) 14nm FinFET dies in the form of Core i3 processors for a $117 tray price, but they're unable to put out 8 core Xeons which may be three to four times the size but sell for ten times the price. This wouldn't be the case if wafer costs were their primary issue over yields.

Even on the A9X, looking at the benchmarks of the new 10" iPad Pro today, the GPU performance has dropped 35% compared to the 13" model, which is a lot more than would be expected even with a reduced clock speed to accommodate the smaller battery. It seems possible (although it's very difficult to confirm) that Apple are actually binning their A9X dies by disabling GPU cores to increase yields. This would be pretty unusual for a sub 150mm² die, but points to particularly low yields on TSMC's 16nm process.

Both G92 and RV790 (and GT200B and some other GPUs) were pure "die shrinks" though, as the 55nm process was just that relative to 65nm - you could take the existing 65nm design and produce it on 55nm, enjoying the size and power benefits straight away. This isn't how it's going to be with 28->14/16; they'll have to redesign the chips to produce them on these new lines.

Well that's not really true. Even on a single node jump with the same fab, it's not quite trivial even to do a straight die shrink.

Of course this is rather off the point anyway, as Polaris, Vega and Pascal are updated architectures designed for FinFET nodes in the first place. My point was rather that if we're trying to estimate the die size of a chip on 14nm there are two factors:

(a) The shrink down to 14nm from 28nm, which will obviously decrease the size

and

(b) Architectural changes, which will typically increase the size of the chip

For (b) we have pretty much zero evidence, but for (a) we do have evidence in the form of similar ICs (i.e. other GPUs) which are direct (or near-direct) die shrinks, as this eliminates the architectural variable. Obviously there's a wide margin of error to be applied from one die shrink to the next, but it's the only hard evidence we have of what kind of scaling we might expect from 14nm.
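To put rough numbers on (a), here's a quick sketch using the G92 and RV790/Juniper figures mentioned earlier in the thread. Applying the per-node factor twice for a two-node jump is my own naive assumption, so treat the output as a ballpark at best:

```python
# Crude area-scaling extrapolation from historical single-node shrinks.
# G92: 324 mm^2 (65nm) -> 260 mm^2 (55nm); RV790 -> Juniper: 282 -> ~170 mm^2.
shrinks = {"G92 65->55nm": 260 / 324, "RV790 -> Juniper 55->40nm": 170 / 282}

for label, factor in shrinks.items():
    print(f"{label}: {1 - factor:.1%} area reduction per node")
    # Naive assumption: a two-node jump (28nm -> 14/16nm) applies the factor twice.
    print(f"  two-node estimate: {1 - factor**2:.1%} total reduction")

# G92 gives ~20% per node -> ~36% over two nodes;
# Juniper gives ~40% per node -> ~64% over two nodes.
```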

The issue here is that I don't see much benefit for them in actually keeping the same unit numbers for P10 and V10, as the chips will have to be heavily redesigned anyway for both the new production process and the updated GCN4 architecture. Why would they leave the number of CUs/SPs the same while doing all this? It just seems like a missed opportunity, as even a modest increase in the number of SIMDs would grant them performance wins across the board, in addition to whatever the optimizations will bring (which may well be load-specific and not show up in all games on the market).

I should have been more clear on that, I didn't mean they would use another 44CU part, I was just using it as an example. Judging by the videocardz.com report it seems a 40CU part would be the most likely for Polaris 10, but obviously it could be anywhere around that. (AMD have thus far always used multiples of 4 CUs on all but their entry-level GCN dies, though, so I'd tend to assume they'll do the same again. This is due to the way CUs share cache, although that could in theory change with Polaris/Vega).

I think it's quite relevant, as it shows the sweet spot for the last several generations. Unless 14nm is going to be completely different from the several previous generations for some reason, I don't see why it should result in different chip sizes at first. So expecting ~300mm^2 is actually based on how it went historically, while expecting ~200mm^2 is something rather new which hasn't really happened often before (RV670 and G92 are the only two dies on new process lines which were smaller than 200mm^2, I believe).

The evidence suggests that 14nm/16nm are different, though. For Intel, it's obviously a slow-maturing node, and is their first node to be stretched over 3 years rather than 2. For Samsung/GF/TSMC we're seeing small-die mobile SoCs long before CPUs, GPUs or server chips, which we've never seen on any previous node.

That's the issue right there. We know that there will be two Polaris GPUs and two Vega GPUs. If both Vega GPUs are using HBM2 then it would mean that both of them will be faster than both Polaris GPUs - otherwise it makes no sense. So either both Vegas will be above P10 - which is possible if P10 is just on Hawaii's level of performance - or one Vega won't actually use HBM2 and won't really provide any perf/watt increase compared to Polaris.

The second option seems more likely to me and that's why I'm saying that we shouldn't read too much into this slide as it is pure marketing.

Why leave one of the Vega GPUs until 2017 if it's less powerful than Polaris 10? And, for that matter, why would they call two of them Polaris and two of them Vega if the two Polaris aren't related and the two Vega aren't related? Logically there's something the two Polaris chips have in common and something the two Vega chips have in common. The most obvious would be use of GDDR5(X) on Polaris and HBM2 on Vega, but it could be that Polaris are being made on GF's 14nm process and Vega are being made on TSMC's 16nm process.

I'd imagine the most likely scenario is:

Polaris 11: Desktop 470/470X, mid-range and ultra-thin laptop, GDDR5, somewhere between Tonga and Hawaii in performance

Polaris 10: Desktop 480/480X, high-end laptop, GDDR5(X), somewhere between Hawaii and Fiji in performance

Vega 11: Desktop 490/490X, HBM2, somewhere above Fiji in performance

Vega 10: Desktop Fury, HBM2, somewhere above Vega 11 in performance

This obviously assumes that Vega 10 doesn't arrive until quite a bit after Vega 11, as a counter to GP200 based cards.

Of course, even if we know the number of CUs that still leaves quite a bit of variance in potential performance. Despite the increase in raw computational power, 980Ti and Fury X are only about 20% faster than 390X at 1440p. A 40CU part could match them if AMD manages to squeeze an extra 30% performance out of Polaris over GCN1.2 from architectural improvements and clock increases, which isn't completely impossible, although I'd keep my expectations in check.
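For anyone who wants the back-of-the-envelope version of that last claim, here it is as a couple of lines of Python. The assumption that performance scales linearly with CU count times per-CU throughput is mine and is obviously a simplification:

```python
# Rough check: can a 40-CU part match a ~20% lead over the 44-CU 390X?
cus_390x, cus_polaris = 44, 40
target_over_390x = 1.20   # 980 Ti / Fury X lead over 390X at 1440p, as quoted above

# Per-CU gain needed (from IPC + clocks) if perf ~ CU count * per-CU throughput:
needed = target_over_390x / (cus_polaris / cus_390x)
print(f"Needed per-CU improvement: {needed - 1:.0%}")   # ~32%
```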
 
Wafer costs are certainly an issue, but yields are most definitely the major sticking point for sub 20nm FinFET processes. If it was just wafer costs rather than yields, then we wouldn't be seeing mobile SoCs for some time, as they're a highly competitive, low-margin industry. They're the first out of the gate on Samsung's 14nm and TSMC's 16nm, though, because they're small dies, and on a low-yield process that means they're a lot more viable than larger dies, even if they don't command anywhere near the price per mm². If yields weren't an issue on 14nm, then big server CPUs like IBM's POWER9 would be the very first chips off the production lines, as they effectively command the highest revenue per wafer of anything you'll come across once yields come out of the equation. With yields in the equation, though, they're not due for another year, because with poor yields dies that large just aren't an option, regardless of how much money you can charge for them.
Quite the contrary: the fact that what we're getting on these new processes is mobile SoCs is itself an indication of high wafer costs. They are small but are sold in products with healthy margins, so it's economically feasible to produce them even at high wafer costs, since each wafer yields many SoCs that can each be sold in a $500 smartphone. Yields are an unknown factor which we don't really know enough about to include here. And I don't remember hearing about any yield issues on these newer processes.

Well that's not really true. Even on a single node jump with the same fab, it's not quite trivial even to do a straight die shrink.
55 was a pure optical shrink of 65 afaik so it was trivial.

Of course this is rather off the point anyway, as Polaris, Vega and Pascal are updated architectures designed for FinFET nodes in the first place. My point was rather that if we're trying to estimate the die size of a chip on 14nm there are two factors:

(a) The shrink down to 14nm from 28nm, which will obviously decrease the size

and

(b) Architectural changes, which will typically increase the size of the chip

For (b) we have pretty much zero evidence, but for (a) we do have evidence in the form of similar ICs (i.e. other GPUs) which are direct (or near-direct) die shrinks, as this eliminates the architectural variable. Obviously there's a wide margin of error to be applied from one die shrink to the next, but it's the only hard evidence we have of what kind of scaling we might expect from 14nm.
Well, actually, whatever evidence we have on Polaris and Pascal points to (b) and not (a). This is also logical: making a FinFET GPU is something completely new, so they'd have to rebuild the chips anyway - why not use the opportunity to tweak and expand the architecture? So linear scaling is obviously out of the question, but I'd still wager that whatever guesses we've made so far will be rather close to the resulting figures - a 5B GPU on 14nm should not be bigger than 200mm^2 in any case, and it's quite possible that a 6B GPU will not be bigger than 250mm^2. These are both rather conservative sizes for GPUs.
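For what it's worth, the napkin math behind figures like those could look something like this. The ~2x density gain for 14/16nm FinFET over 28nm is purely my assumption, and GM204 (~5.2B transistors in ~398mm^2) is just a convenient 28nm reference point:

```python
# Napkin estimate of 14nm die sizes from transistor counts.
# Reference: GM204 on 28nm, ~5.2B transistors in ~398 mm^2.
density_28nm = 5200 / 398            # ~13 Mtransistors per mm^2
finfet_density_gain = 2.0            # assumed ~2x density for 14/16nm FinFET
density_14nm = density_28nm * finfet_density_gain

for transistors_m in (5000, 6000):   # hypothetical 5B and 6B chips
    print(f"{transistors_m / 1000:.0f}B transistors -> "
          f"~{transistors_m / density_14nm:.0f} mm^2 on 14nm")

# 5B -> ~191 mm^2, 6B -> ~230 mm^2 under these assumptions
```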

The evidence suggests that 14nm/16nm are different, though. For Intel, it's obviously a slow-maturing node, and is their first node to be stretched over 3 years rather than 2. For Samsung/GF/TSMC we're seeing small-die mobile SoCs long before CPUs, GPUs or server chips, which we've never seen on any previous node.
How are they different, though? The costs are higher, sure, but hence the wait, as both 14 and 16 were ready for production a year ago. We'll be getting GPUs off them more than a year later precisely because by then the costs will have come down enough to allow the production of _bigger_ chips than mobile SoCs. A9X is 150mm^2 and has been produced on 16nm since last summer. I'd expect a GPU coming off the same process a year later to be able to hit a bigger die size than just 200mm^2, as a 50mm^2 increase per year would mean waiting until 2024 for a 16nm chip the size of GM200. So yeah, I think you're being overly conservative in your die size approximations, and I think the usual ~300±50mm^2 launch size will again be the rule on both 14nm and 16nm. But this takes us back to the proposed P10 specs, which seem kinda low for such a die size.


Why leave one of the Vega GPUs until 2017 if it's less powerful than Polaris 10? And, for that matter, why would they call two of them Polaris and two of them Vega if the two Polaris aren't related and the two Vega aren't related? Logically there's something the two Polaris chips have in common and something the two Vega chips have in common. The most obvious would be use of GDDR5(X) on Polaris and HBM2 on Vega, but it could be that Polaris are being made on GF's 14nm process and Vega are being made on TSMC's 16nm process.
Who knows? This is AMD we're talking about. Look at their current mix of GCN versions in the 300 series - less than a year has passed since they finally retired the GCN1 Tahiti GPUs from the market. I don't expect them to fare any differently with Polaris, Vega and Navi, really. Chances are that by the end of 2019 we'll have all three of these GPU generations mixed up on the market simultaneously, with some of them slower but newer and some older but faster. I'm not really seeing how else they would be able to fill all market segments with just two GPU launches in each of 2016 and 2017.
 
VideoCardz posted that Hardware Battle is reporting AMD/RTG will launch the Radeon R9 490 and 490X in late June, with a paper launch a month earlier at Computex in late May. Not a surprise to anyone but the rumor states that this will be based on Polaris 10.

RedGamingTech caught the demo of Polaris 10 at GDC and took pictures of the ports and chassis it was used in. They report it's a small form factor card a la the R9 Nano and it ran Hitman at DX12 max settings 1440p 60fps. It had five ports: 3x DisplayPort 1.3, 1 HDMI 2.0 and 1 DVI Dual-Link.
 
Hopefully, its really competitive with the Nvidia cards and we get some consumer friendly price competition.
 
VideoCardz posted that Hardware Battle is reporting AMD/RTG will launch the Radeon R9 490 and 490X in late June, with a paper launch a month earlier at Computex in late May. Not a surprise to anyone but the rumor states that this will be based on Polaris 10.

RedGamingTech caught the demo of Polaris 10 at GDC and took pictures of the ports and chassis it was used in. They report it's a small form factor card a la the R9 Nano and it ran Hitman at DX12 max settings 1440p 60fps. It had five ports: 3x DisplayPort 1.3, 1 HDMI 2.0 and 1 DVI Dual-Link.

Small form factor plus pretty high-end 490X naming with GDDR5(X) sounds kinda wrong though. How sure are they that it was P10 and not P11 they saw at GDC?
 
I mean, if it was Polaris 11 running Hitman at 1440p 60fps max that would be kinda insane. That's an 85W card last we heard.

The small form factor might mean it's not using GDDR5/X.
 
For Polaris 10 to be branded as a future R9 490 and 490X, it must be a pretty potent chip... I can't imagine it being behind Fury X in terms of performance. On top of IPC improvements per SP there has to be a sizable clock frequency improvement (say, 1GHz -> 1.3GHz or even 1.4GHz).
 
85 Watts? Seriously? That's insane.

I think that number was for the whole system. Polaris 11 is designed for laptops, so the designs probably vary between 15-60W, with the high end for desktop models. But this chip isn't what was running Hitman at 1440p.

We've seen smaller form factor GDDR5 cards like the 970 mITX models, so even if Polaris 10 is supposedly a small card, it doesn't mean it won't use GDDR5. Of course, fulfilling the bandwidth requirements of a 6-7B transistor chip running at (presumably) much higher clock speeds than AMD's current offerings is a tall order for GDDR5. AMD's designs have had a lot more bandwidth than Nvidia's equivalents, so one would assume they also need a lot of it. It's going to be very interesting to see how all this shakes out.
 
The Hitman demo was with Polaris 10.
1440p/60 can't be reached with Polaris 11, since the latter is aimed more at the laptop segment.
 
Making small cards is easier with GDDR5X, since the chips are higher density. They start at 8 Gb (1 GB) and go all the way to 16, meaning smaller PCBs due to fewer chips.
 
Making small cards is easier with GDDR5X, since the chips are higher density. They start at 8 Gb (1 GB) and go all the way to 16, meaning smaller PCBs due to fewer chips.

Hardly an issue, as you still need at least four chips for a 256-bit bus, and with lower-density chips eight were simply installed on both sides of the PCB. I.e. it won't help with size. What can help is severely simplified power circuitry, but that would mean that at least this P10 board wasn't targeted at high performance / OC / enthusiasts.

Eh, well, "small form factor" can mean lots of things anyway. "Smaller than R390X" isn't really small for example.
 
Here's what WCCFTech had to say about the size:

The Polaris 10 GPU was featured in a tiny Cooler Master Elite case that comes in ITX form factor. That case has limited airflow, and it's quite impressive that the Polaris 10 GPU maintained more than 60 FPS in the small demo. We know from the Hitman benchmarks published last week that only top-end cards such as the Radeon R9 Fury X are able to maintain more than 60 FPS at 1440p resolution.

One of the interesting details that also got posted was that the sample board was comparable to the Radeon R9 Nano in terms of size, so it could be that AMD is again going for compact boards. Could this be the Nano 2 in the making? It seems so, and it also gives a hint at the memory type being used. Sure, the card could possibly use GDDR5 memory, but knowing AMD, they might have a Polaris 10 interposer with 1st-gen HBM memory on board. That seems the more obvious choice if this card is going to be a Fiji replacement. The other die, Polaris 11, uses GDDR5 and will be featured in a range of notebooks and low-power PCs.
 
Well, we know now that Polaris 10 doesn't have any interposer. It never made sense for that particular segment.

They have to be using HBM for something. They mentioned to PCPer that they were maximizing their investment in HBM1 for the time being.

So unless they're using HBM1 in Vega, it stands to reason that they're using it in one of the Polaris chips.
 
We won't see an interposer + HBM2 on the AMD side until Vega 10 around Q1 2017.

That's the monster formerly known as Greenland and would be going up against GP100.
 
They have to be using HBM for something. They mentioned to PCPer that they were maximizing their investment in HBM1 for the time being.

So unless they're using HBM1 in Vega, it stands to reason that they're using it in one of the Polaris chips.

Are you sure it wasn't PCPer's own speculation?
 
They have to be using HBM for something. They mentioned to PCPer that they were maximizing their investment in HBM1 for the time being.

So unless they're using HBM1 in Vega, it stands to reason that they're using it in one of the Polaris chips.
They were not (9:50 in the video):
http://www.pcper.com/news/Graphics-Cards/AMDs-Raja-Koduri-talks-moving-past-CrossFire-smaller-GPU-dies-HBM2-and-more

I'm quite sure that AMD will not use HBM1 anymore.

That's the monster formerly known as Greenland and would be going up against GP100.
The "little monster".
 
They have to be using HBM for something. They mentioned to PCPer that they were maximizing their investment in HBM1 for the time being.

So unless they're using HBM1 in Vega, it stands to reason that they're using it in one of the Polaris chips.

They'll be using HBM1 for the dual-Fiji card for most of this year.
 
I'm quite sure that AMD will not use HBM1 anymore.
Unless they can get HBM2 at near-HBM1 prices, they'll probably keep using it in their mid-tier cards.

They probably won't be using GDDR5 anymore, though. The price difference between it and GDDR5X is nowhere near double, whereas the performance (and density) is.
 
It's not looking like it if the spec leaks are real. Even the 4GB card seems to be on GDDR5.

http://videocardz.com/58639/amd-polaris-10-gpu-specifications-leaked
Well, it's not certain at this point whether either manufacturer's designs are locked in yet. We might be looking at test mules.

As commenters on AnandTech have noted, there's little reason for GDDR5 to stick around at this point. X offers fewer chips (lower power requirements), fewer traces (again, lower power) and higher bandwidth. Straight-up replacing the 4x 8Gbit chips in a (new-)low-end 4GB GPU (nearly) doubles bandwidth with no increase in consumption.
 
Well, it's not certain at this point whether either manufacturer's designs are locked in yet. We might be looking at test mules.

As commenters on AnandTech have noted, there's little reason for GDDR5 to stick around at this point. X offers fewer chips (lower power requirements), fewer traces (again, lower power) and higher bandwidth. Straight-up replacing the 4x 8Gbit chips in a (new-)low-end 4GB GPU (nearly) doubles bandwidth with no increase in consumption.

Yes in theory everyone should be using GQDR (GDDR5X) instead of plain GDDR5, but when AMD first demoed their chip GQDR wasn't even sampling and when AMD launches Polaris 11 and 10 GQDR might not even have started mass production.
 
Yes in theory everyone should be using GQDR (GDDR5X) instead of plain GDDR5, but when AMD first demoed their chip GQDR wasn't even sampling and when AMD launches Polaris 11 and 10 GQDR might not even have started mass production.
We know X is available right now (albeit in limited engineering quantities); Micron have said as much. We also know the first X chips will be mass-produced in 8 Gbit size, which happens to mesh with AMD's transition away from multiple-of-3 bus widths (384-bit) - which would require specialty 6 and 12 Gbit chips - to power-of-two ones (128, 256, 512). AMD might've simply pulled a Sony and bet on having the chips ready at the exact right time.
 
Unless they can get HBM2 at near-HBM1 prices, they'll probably keep using it in their mid-tier cards.
I see no reason to do that.
For mid-tier cards you can get away with cheaper GDDR5(X), and I don't believe we're at the performance and power-consumption point where the higher cost of HBM would really be worth it, especially HBM1 with its current limitations.
 
I see no reason to do that.
For mid-tier cards you can get away with cheaper GDDR5(X), and I don't believe we're at the performance and power-consumption point where the higher cost of HBM would really be worth it, especially HBM1 with its current limitations.
If that were true, depending on cost of HBM1 vs 2 interposer and chips, there's a likelihood we'll end up with the discrete graphics market sharply divided between 'peon-level' GPUs running 5X and ultra-high-end running HBM2. That's... worrying. I don't like it when options are limited between 'peasant' and 'swagadelic' tiers, with nothing in between. An 8 GB HBM2 card would absolutely obliterate its 8 GB 5X counterpart in every single benchmark.
 
We know X is available right now (albeit in limited engineering quantities); Micron have said as much. We also know the first X chips will be mass-produced in 8 Gbit size, which happens to mesh with AMD's transition away from multiple-of-3 bus widths (384-bit) - which would require specialty 6 and 12 Gbit chips - to power-of-two ones (128, 256, 512). AMD might've simply pulled a Sony and bet on having the chips ready at the exact right time.

Yes, GQDR is available NOW in sampling quantities, but it wasn't available in December or earlier when AMD was testing their chips. (EDIT: JEDEC published the GDDR5X standard January 21st.) And you can't just plop GQDR memory into a chip that wasn't designed for it. AMD didn't hedge a bet on the unknown availability of GQDR, and instead it seems they are going with GDDR5 on two of the first chips. Why would a non-power-of-two-wide memory controller require 6 or 12 Gbit chips? Did the 7970 require them? Does the 980 Ti?

P.S. Multiplying a power of two by 1.5 doesn't give you a power of two either.
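To spell out the chip-count arithmetic (GDDR5/5X chips have 32-bit interfaces; the 8 GB GDDR5X config at the end is a made-up example of mine):

```python
# Chip count = bus width / 32 (one 32-bit chip per 32 bits of bus),
# capacity = chip count * per-chip density. No odd densities needed for 384-bit.
def config(bus_bits, chip_gbit):
    chips = bus_bits // 32
    return chips, chips * chip_gbit / 8  # capacity in GB

for bus, density, card in ((384, 2, "7970 (3 GB)"),
                           (384, 4, "980 Ti (6 GB)"),
                           (256, 8, "hypothetical 8 GB GDDR5X card")):
    chips, gb = config(bus, density)
    print(f"{card}: {bus}-bit = {chips} chips x {density} Gbit = {gb:.0f} GB")
```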


If that were true, depending on cost of HBM1 vs 2 interposer and chips, there's a likelihood we'll end up with the discrete graphics market sharply divided between 'peon-level' GPUs running 5X and ultra-high-end running HBM2. That's... worrying. I don't like it when options are limited between 'peasant' and 'swagadelic' tiers, with nothing in between. An 8 GB HBM2 card would absolutely obliterate its 8 GB 5X counterpart in every single benchmark.

Maybe GQDR will be cheaper than HBM2 and therefore the cards would be cheaper. And presumably AMD or Nvidia would use GQDR with chips that don't require as much bandwidth (they have less computational power), and HBM2 with cards that do require more bandwidth. Less computational power = smaller chip = cheaper card. Say, a GTX 1080 and GTX 1070 with GQDR for $500 and $350, and a 1080 Ti with HBM2 for $650. And there have always been different tiers of GPUs at different performance and price points.
 
Why would a non-power-of-two-wide memory controller require 6 or 12 Gbit chips? Did the 7970 require them? Does the 980 Ti?

P.S. Multiplying a power of two by 1.5 doesn't give you a power of two either.
No, they use a multiple-of-3 number of power-of-two-sized chips. I was led to believe there are some efficiency gains in using multiple-of-3-sized chips with a multiple-of-3 bus compared to the current layout.

Maybe GQDR will be cheaper than HBM2 and therefore the cards would be cheaper.
Well it will obviously be cheaper, considerably so (given that Micron are confident of 5X being taken up almost immediately), but that wasn't my point.

For a long time, card performance differed in small increments going up the price tiers. With HBM2, there's going to be quite a price gap between the top-most 5X card and the next HBM2 one. Admittedly, Nvidia have performed a bit of social conditioning on their audience to expect this with their current generation (the 970 and 980 are rather far apart, performance-wise), but it still feels unpleasant to have a sharp divider between price tiers.
 
If that were true, depending on cost of HBM1 vs 2 interposer and chips, there's a likelihood we'll end up with the discrete graphics market sharply divided between 'peon-level' GPUs running 5X and ultra-high-end running HBM2. That's... worrying. I don't like it when options are limited between 'peasant' and 'swagadelic' tiers, with nothing in between. An 8 GB HBM2 card would absolutely obliterate its 8 GB 5X counterpart in every single benchmark.
It wouldn't.
For my taste there is too much HBM hype.

GP100 is using a 4096-bit HBM interface at 1.4 Gbps = 720 GB/s (16-32 GB) for 3840 ALUs.
Greenland is (was) using 2048-bit at 2 Gbps = 512 GB/s (8-16 GB) for 4096 ALUs.

Mid-tier is going below this. With GDDR5X (12 Gbps) you can get 384 GB/s with only 256-bit.
Take 10 Gbps and you still get 320 GB/s.

Even on GPUs you are never 1:1 bandwidth limited.
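Those figures all come from the same trivial formula, bus width times per-pin data rate; a quick sketch that reproduces them (the Greenland numbers are of course still rumour):

```python
# Peak memory bandwidth = bus width (bits) * data rate (Gbps per pin) / 8.
def bandwidth_gbs(bus_bits, gbps):
    return bus_bits * gbps / 8

configs = [("GP100 HBM2", 4096, 1.4),
           ("Greenland HBM2 (rumoured)", 2048, 2.0),
           ("GDDR5X 256-bit @ 12 Gbps", 256, 12),
           ("GDDR5X 256-bit @ 10 Gbps", 256, 10)]

for name, bus, rate in configs:
    print(f"{name}: {bandwidth_gbs(bus, rate):.0f} GB/s")
# -> 717 (the ~720 above), 512, 384, 320 GB/s
```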
 
AMD Polaris 11 SKU spotted, has 16 Compute Units
Still looks too low for me.

If that were true, depending on cost of HBM1 vs 2 interposer and chips, there's a likelihood we'll end up with the discrete graphics market sharply divided between 'peon-level' GPUs running 5X and ultra-high-end running HBM2. That's... worrying. I don't like it when options are limited between 'peasant' and 'swagadelic' tiers, with nothing in between. An 8 GB HBM2 card would absolutely obliterate its 8 GB 5X counterpart in every single benchmark.

A 14 Gbps GDDR5X setup on a 512-bit bus would be able to achieve 896 GB/s of bandwidth.
Top-specced HBM2 on a 4096-bit bus (which seems to be the typical config for HBM, as both Fiji and GP100 use it) hits about 1000 GB/s.
The difference isn't as big as you make it sound, really. Give or take several MC wrinkles here and there, some savings with GDDR5X, some issues with HBM2, and you'll end up with pretty much the same bandwidth.
 
It has been speculated, based on the device ID and the device ID numbering of already-released cards, that that is the most cut-down Polaris 11.
Amazing how a GPU that was mid-high-end 4 years ago is now bottom-end laptop tier. 1024:64:32 is the same config the 7850 used.
 
It has been speculated, based on the device ID and the device ID numbering of already-released cards, that that is the most cut-down Polaris 11.

Possible. Another possibility would be that the detection is off because of an increased number of SPs in a CU in Polaris.
 