Fable beta would definitely be interesting. A shame it is locked behind NDA atm.
BTW, earlier you replied with a .gif to my mention of Crysis 2 and tessellation, but failed to actually respond. The tessellation myth in Crysis 2 was disproved years ago, just so you know, by CryEngine devs themselves among many others. A shame that silly wccftech report taking pictures and videos in debug mode came out... they just did not understand how CryEngine works...
Though to be honest, if anyone is trying to offload a 980 Ti on the cheap (Europe), tell me your price.
I want to see more Beyond3D benchmarks; those guys are really trying to test both architectures and get some meaningful results out of it.
I don't think so. Fable Legends uses UE4 which seemingly isn't doing anything interesting in this area atm.
It is only interesting as a test of a different dx12 workload.
Compute shaders, yes. You have to look at them talking about it in the context of DX11 hardware. The code dealing with it could still be structured for serial execution.
Compute shaders ≠ Async Compute/Shaders
Compute shaders have been used for a while now. And as the blog post says, they are aiming for DX11-level hardware.
I know the water under the level bit was debunked, but how about the concrete barriers that had 76778767777343973 polygons to render a flat surface?
Exactly. Water under the level has always been part of CryEngine since day one (AFAIK it can now be disabled in the latest version). All the other crap (concrete slabs, pavements, brick walls...) was intentional, to crush performance on Radeons (as seen in Crysis 3, where everything is toned down to acceptable levels with still good quality).
I was under the impression the game is DX12 only currently? And wouldn't AMD cards be better at a default compute (non-async)/serial one anyway?
No it wasn't. Water under the level never rendered in view. That is not how occlusion culling works in the engine. And you can go into the game yourself with some 5870 or whatever and test how little tessellation costs. It is seriously cheap.
Do you have an AMD GPU now? You can totally try it out with a CVAR to see how little it affects overall framerate.
That's closer to my understanding of how this stuff works. Basically, you've got turnstile-style access to a fixed pool of resources: the various math units on the GPU, each with its own specialty. So think of it like loading a roller coaster. Every cycle, the system hangs the next rendering job on the GPU, occupying some or all of those specialized units. These jobs are the people who paid for VIP passes. Then the system looks at the math units that haven't been assigned jobs, compares that to the 64 jobs waiting at their respective turnstiles, all managed by eight line attendants, and lets in whatever punters best fill the remaining seats before dispatching the train.

Except it's also completely misleading. All execution on modern vec1 SIMDs is done in a serial fashion, so there is no "8 roads with 8 lanes for trucks which can be used to move freely"; there are 8 roads with 8 lanes which are waiting to be picked from for the execution pipeline. The more you have, the higher execution efficiency you may achieve.
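If it helps, here's a toy C++ sketch of that roller-coaster picture. This is my own illustration, not how any real GPU front-end works; the 64 slots, the 40-slot "VIP" graphics job, and the job costs are made-up numbers, with only the 8-queues-of-8 layout taken from the posts above:

```cpp
#include <array>
#include <cstdio>
#include <deque>

int main() {
    const int kSlots = 64;                  // execution "seats" per cycle
    std::array<std::deque<int>, 8> queues;  // 8 compute queues, job cost in slots
    for (auto& q : queues)
        for (int i = 0; i < 8; ++i) q.push_back(4);  // 8 small jobs pending per queue

    for (int cycle = 0; cycle < 4; ++cycle) {
        int free_slots = kSlots - 40;  // the graphics "VIP" job claims 40 seats first
        int dispatched = 0;
        bool admitted = true;
        while (admitted) {             // round-robin the turnstiles until nothing fits
            admitted = false;
            for (auto& q : queues) {
                if (!q.empty() && q.front() <= free_slots) {
                    free_slots -= q.front();
                    q.pop_front();
                    ++dispatched;
                    admitted = true;
                }
            }
        }
        std::printf("cycle %d: %d compute jobs filled the %d idle slots\n",
                    cycle, dispatched, kSlots - 40);
    }
}
```

The point of the toy: the graphics job never gives up its seats, the compute jobs only ever occupy slots that would otherwise idle, and more pending queues means better odds of filling every seat.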
Is this a trick question? Because adding async to the mix "just" increases your peak utilization. It will have empty spaces in its rendering pipeline that need filling, just like any other GPU, if that's what you're asking.

Another question is: does Maxwell even need async compute to keep its utilization in DX12 at peak?
How likely is it that Pascal will have Asynchronous Compute?
In general, and aside from moving from 28nm to 16nm FF+, having HBM2 and double rate FP16, shouldn't Pascal have more architectural changes over Maxwell than Maxwell had over Kepler?
Actually, Unreal Engine 4 supports async compute on XB1, and I think it was Lionhead who implemented it in the engine. There was also a gameplay video where they talked about it; I believe it was in the context of the PC version.
Yup, here you go.
https://docs.unrealengine.com/lates...ing/ShaderDevelopment/AsyncCompute/index.html
Thanks. However, they only mention Xbox One, and the documentation talks about it being disabled for PC in dx11.1 (possibly meaning that the documentation hasn't been updated).
Yeah, it says it's not supported in d3d11.1, because it literally isn't part of d3d 11.
Why would it be available on Bone but not PS4?

Of course. There is no indication that it will be enabled for PC on dx12. The fact that this documentation is tagged for UE 4.9, which was just released, probably means it isn't available for PC. That is, if we're going by documentation, which we all know isn't always the most reliable thing.
Wait, so Maxwell is fully DX12 compliant but does not have async compute like AMD cards have? Does this mean that PS4 is almost DX13 levels then due to having this feature as well as hUMA and a supercharged PC architecture which DX12 does not have? If so I can easily see PS4 competing with the next gen Xbox which will assumedly be based on DX13 further delaying the need for Sony to launch a successor. Woah. If this is true I can easily see PS4 lasting a full ten years. Highly interesting development, I can't wait to see what Naughty Dog and co do with this new found power.
Should be 100% chance. It is part of DX12 after all.
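For what it's worth, "part of DX12" at the API level just means you can create a dedicated compute queue next to the graphics queue. A minimal sketch using the actual D3D12 calls (the function and parameter names here are mine; whether the two queues actually overlap on the hardware is entirely up to the GPU and driver):

```cpp
#include <d3d12.h>

// Create a graphics (DIRECT) queue and a separate COMPUTE queue. Every DX12
// driver must accept this; concurrent execution is not guaranteed, though.
HRESULT CreateQueues(ID3D12Device* device,
                     ID3D12CommandQueue** gfxQueue,
                     ID3D12CommandQueue** computeQueue) {
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;   // graphics + compute + copy
    HRESULT hr = device->CreateCommandQueue(&desc, IID_PPV_ARGS(gfxQueue));
    if (FAILED(hr)) return hr;

    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // compute + copy only
    return device->CreateCommandQueue(&desc, IID_PPV_ARGS(computeQueue));
}
```

So any DX12 driver has to accept the submission; the open question in this thread is what the hardware does with it afterwards.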
There's a chance of seeing history repeat itself, Maxwell getting Kepler-ed in favor of Pascal. Should be fun.
You'd expect that if they've added the feature set for X1, they'd implement it on PC though. Wasn't that the point of Windows 10: code once and deploy to many devices?
Also, guys, correct me if I'm wrong, but aren't those homebrew tests proving that Maxwell does have async compute, just limited to 31 queues? The test goes by submitting 128 threads. GCN takes them all with the same latency, but Maxwell can only take 31 (it has 31+1 queues) before processing the next load.
The only thing it shows is that it has lower latency when processing few queues, and latency equivalent to AMD's beyond that.
They do less but more quickly. GCN does more, but slower.
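For anyone curious, the measurement loop in those homebrew tests boils down to something like this. This is a sketch of my own, not the actual benchmark code; it assumes the device, fence, Win32 event, and the pre-recorded single-dispatch command lists were all created elsewhere:

```cpp
#include <d3d12.h>
#include <windows.h>
#include <chrono>
#include <vector>

// Submit the first n single-dispatch command lists to the compute queue and
// time how long the GPU takes to drain them. Plotting this time against n is
// what the thread's charts are based on: per the posts above, latency steps
// up in batches on Maxwell while staying flat on GCN.
double TimeBatch(ID3D12CommandQueue* computeQueue, ID3D12Fence* fence,
                 UINT64& fenceValue, HANDLE event,
                 std::vector<ID3D12CommandList*>& lists, size_t n) {
    auto t0 = std::chrono::high_resolution_clock::now();
    computeQueue->ExecuteCommandLists(static_cast<UINT>(n), lists.data());
    computeQueue->Signal(fence, ++fenceValue);       // mark the end of the batch
    fence->SetEventOnCompletion(fenceValue, event);  // block until the GPU is done
    WaitForSingleObject(event, INFINITE);
    auto t1 = std::chrono::high_resolution_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```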
I have doubts. Wouldn't this have a significant impact on the running temperature of the chips?
To me it looks like the tests aren't showing that at all. Nvidia cards are finishing the graphics + compute test in the same time as the graphics test and compute test combined, while AMD cards are finishing the graphics + compute test in the same time as the compute test alone.
That would seem to confirm the idea NV are simply alternating between the two job types rather than blending them.
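To put made-up numbers on it: if the graphics test alone takes 10 ms and the compute test alone takes 5 ms, a card that alternates between the two finishes the combined run in roughly 10 + 5 = 15 ms, while a card that genuinely overlaps them finishes in roughly max(10, 5) = 10 ms. That sum-versus-max signature is exactly what people are reading out of those charts.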
Async compute would have to give AMD a 30-40% advantage in most DX12 games for me to care...
A typical 980 Ti OC is around 20% faster than an OC'd Fury at 1440p, so it may only really put them on par for me. I still play a lot of DX11 games, so performance in those is still important. I'll trade 10% less performance in DX12, if that's what it ends up being, for that.
The performance of a video game is defined by more than just frames per second or frametimes. Asynchronous compute allows for higher throughput at lower latencies which easily makes it one of the most important features for VR gaming. Remember the beginning of this gen when Mark Cerny explained again and again the importance of async compute for the future of video games? That was before Morpheus was announced. Two years later it all makes sense.
Glad I didn't fork out a ludicrous £550 for a 980 Ti that will struggle with demanding DX12 games in the next couple of years. Because if I spent that kind of money, I'd expect not to have to upgrade for 2+ years. But then again, my buying habits are different from most who like to upgrade yearly.
What I've done instead is fork out a large sum for an EVGA 980 that will probably last less time.
Anyway, these are all good cards for a while yet at least.
What about normal games? Should we expect better performance in them? Will the PS4 perform as well as a higher end PC graphics card?
So, no one has entertained the very distinct possibility that this one developer was either using a poor implementation of async compute, or encountered a driver bug? Seems far more likely than Maxwell 2 not supporting a feature it claims to.

Nvidia itself specifically told the developer to turn the feature off for their card. If it were just a bug, they could just fix it instead.
First of all, there is no need for personal attacks.
A CPU is a processor that consists of a small number of big processing cores. A GPU is a processor that consists of a very large number of small processing cores. Therefore most home PCs have multiple processors. "APU" is a marketing term for a single processor that consists of different kinds of processing cores. In the case of the PS4, the APU has two Jaguar modules with four x86 cores each and 18 GCN compute units with 64 shader cores each. You can distinguish processors as follows: single core (like Intel Pentium), multi core (Intel Core i7 or any GPU), hetero core (APUs like the one in PS4) and cloud core (Microsoft Azure, for example).
If you take a look back, the evolution of computer technology was always about maximum integration. The reason for that is you want to minimize latency as much as possible. A couple of years ago, GPUs only had fixed-function hardware. That means that every core of the GPU was specialized for a certain task. That changed with the so-called unified shader model. Today, the shader cores of a modern GPU are freely programmable. Just think of them as extremely stupid CPU cores. The advantage of a freely programmable GPU, however, is that you have thousands of those cores. The PS4 has 1152 shader cores. That makes a GPU perfectly suited for tasks that benefit from mass parallelization, like graphics rendering. You can also utilize them for general purpose computations (GPGPU) which, in theory, opens up a whole new world of possibilities, since the brute force of a GPU is much higher than the computational power of a traditional CPU. In practice, however, the possibilities of GPGPU are limited by latency.
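To make the "extremely stupid CPU cores" bit concrete, here's the kind of tiny per-element program a shader core runs, written as plain C++ purely for illustration:

```cpp
#include <cstddef>
#include <vector>

// Each loop iteration is one independent "thread" of work. A CPU walks
// through them one after another; a GPU spreads them across its ~1152
// shader cores and runs them at the same time.
void scale_and_bias(std::vector<float>& data, float scale, float bias) {
    for (std::size_t i = 0; i < data.size(); ++i)
        data[i] = data[i] * scale + bias;
}
```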
If you want to do GPGPU on a traditional gaming PC, you have to copy your data from your RAM pool over the PCIe bus to your VRAM pool. The process of copying costs latency. A roundtrip from CPU -> GPU -> CPU usually takes so long that the performance gain from utilizing the thousands of shader cores gets immediately eaten up by the additional latency: even if the GPU is much faster at solving the task than the CPU, the process of copying the data back and forth will make the GPGPU approach slower than letting the CPU do it on its own. That's the reason why GPGPU today is only used for things that don't need to be sent back to the CPU. The possibilities on a traditional PC are very limited.
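Rough illustrative numbers (assuming ~12 GB/s of effective PCIe 3.0 x16 bandwidth, which is in the right ballpark): copying 120 MB to the GPU costs about 10 ms before a single shader core has done any work, and the result has to travel back the same way. With a 60 fps frame budget of 16.7 ms, the roundtrip alone can eat the whole frame even if the GPU solves the actual problem in under a millisecond.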
The next step in integration is the so-called hetero core processor. You integrate the CPU cores as well as the GPU shader cores on a single processor die and give them one unified RAM pool to work with. That allows you to get rid of that nasty copy overhead. To this day, the PS4 has the most powerful hetero core processor (2 TFLOPS @ 176 GB/s) available. Not only that: since the APU in PS4 was built for async compute (see Cerny interviews), it can do GPGPU without negatively affecting graphics rendering performance. It's a pretty awesome system architecture, if you want my opinion.
The only problem is that PC gamers don't have a unified system architecture. The developers of multiplatform engines have to consider that fact. 1st-party console devs can fully utilize the architecture, though.
So is this false advertising then?
https://forums.geforce.com/default/...s/about-nvidia-and-dx12/post/4656838/#4656838
No, because Nvidia supports it. Badly, but it is there.