Hitman (2016) PC performance thread

What are people using for AA?

I normally use 1440p or 4K DSR without any extra AA, but at 1440p the aliasing is pretty bad.

I've tried the in-game supersampling, which is better, and I can get a decent result at 1080p with 1.5 SS while maintaining 60fps, but should I try the actual AA options?

Just curious what everyone else is opting for.


Using a 980 Ti and a 4690K OC'd to 4.4GHz
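
For reference, a quick sketch of the pixel math behind those settings, assuming the in-game supersampling slider scales each axis by the chosen factor (an assumption on my part, not something confirmed by the game):

```cpp
#include <cstdio>

int main() {
    // Output resolution and the in-game SS factor (assumed to scale each axis).
    const int outW = 1920, outH = 1080;
    const double ss = 1.5;
    const int renderW = static_cast<int>(outW * ss);
    const int renderH = static_cast<int>(outH * ss);
    std::printf("1080p at %.1fx SS -> internal render of %dx%d (%.2fx the pixels)\n",
                ss, renderW, renderH,
                double(renderW) * renderH / (double(outW) * outH));
    // For comparison: native 1440p = 2560x1440, 4K DSR = 3840x2160.
    return 0;
}
```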
 
Two theories:
1. Different memory bandwidth may lead to a different load distribution across the GPU's units.
2. Concurrent async compute gains are variable in general because of how they are sliced into the main graphics pipeline.


No, this would be true for everyone. The code written to run across several engines doesn't differ much from the code which can run on one engine. The only thing which is different is the batching of compute and graphics jobs together for the single queue.
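
For readers following along, here is a minimal D3D12 sketch of the distinction being made (my own illustration, not code from any of the engines discussed): on the API side, "batching for the single queue" means all work is submitted to the one direct queue, while "async compute" means compute command lists go to an additional compute-type queue.

```cpp
#include <d3d12.h>
#include <wrl/client.h>
#pragma comment(lib, "d3d12.lib")

using Microsoft::WRL::ComPtr;

int main() {
    ComPtr<ID3D12Device> device;
    D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device));

    // Path A: one DIRECT queue. Graphics and compute command lists are all
    // submitted here, so the driver/hardware sees a single stream of work.
    D3D12_COMMAND_QUEUE_DESC directDesc = {};
    directDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    ComPtr<ID3D12CommandQueue> directQueue;
    device->CreateCommandQueue(&directDesc, IID_PPV_ARGS(&directQueue));

    // Path B: an additional COMPUTE queue. Work submitted here is what the
    // "async compute" discussion is about - whether it actually overlaps with
    // graphics on the direct queue depends on the hardware and driver.
    D3D12_COMMAND_QUEUE_DESC computeDesc = {};
    computeDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    ComPtr<ID3D12CommandQueue> computeQueue;
    device->CreateCommandQueue(&computeDesc, IID_PPV_ARGS(&computeQueue));

    // Command lists of the matching type would be recorded and passed to
    // ExecuteCommandLists() on the corresponding queue; that part is omitted.
    return 0;
}
```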


All DX12 GPUs support async compute.
All AMD GCN GPUs support concurrent async compute.
All GCN versions support it a bit differently.
And all GCN GPUs support it a bit differently because of the differences in caches and memory systems.

Maxwell can perform graphics and compute concurrently but the support for this isn't enabled yet in the driver. No matter how many times you say that it can't - the fact that it can won't change. This fact is written in Maxwell's whitepaper and has been confirmed by pretty much every party involved - with the only exception of AMD, who just likes to lie I guess.


GCN's implementation of async compute is definitely far from "proper".

Link to the white paper and the people confirming it. I've already given you loads of evidence showing that Maxwell cannot process graphics and compute threads in parallel.
 
Link to the white paper and the people confirming it. I've already given you loads of evidence showing that Maxwell cannot process graphics and compute threads in parallel.

I gave you two links already and as for the Maxwell arch whitepaper - I trust you can find it yourself.
 
I gave you two links already and as for the Maxwell arch whitepaper - I trust you can find it yourself.

The only links you have given are Oxide and Nvidia saying async isn't enabled in their drivers. There is no confirmation that Maxwell is capable of concurrent execution of 3D and compute.

http://international.download.nvidi...nal/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF

Is that the white paper you are referring to? If so, where exactly does it prove that Maxwell can do concurrent execution?
 
The only links you have given are Oxide and Nvidia saying async isn't enabled in their drivers. There is no confirmation that Maxwell is capable of concurrent execution of 3D and compute.

http://international.download.nvidi...nal/pdfs/GeForce_GTX_980_Whitepaper_FINAL.PDF

Is that the white paper you are referring to? If so, where exactly does it prove that Maxwell can do concurrent execution?

If something isn't enabled in the drivers, then that something is supported by the h/w - otherwise there would be no way of enabling or disabling it.

No, the whitepaper I'm referring to is the one which shows that Maxwell 2 can run 1 graphics and 31 compute queues in parallel. It was used by Ryan in Anandtech's first article on async compute, for example.
 
If something isn't enabled in the drivers, then that something is supported by the h/w - otherwise there would be no way of enabling or disabling it.

No, the whitepaper I'm referring to is the one which shows that Maxwell 2 can run 1 graphics and 31 compute queues in parallel. It was used by Ryan in Anandtech's first article on async compute, for example.

I've already debunked that misleading chart with several links. I don't know what else I can show to get it through to you.
 
I've already debunked that misleading chart with several links. I don't know what else I can show to get it through to you.

You've debunked the official information from NV with several links to gossip and AMD's statements. That's not what I'd call a debunking.
For now we know that Maxwell 2 can do graphics+compute concurrently, but this feature is disabled in the drivers we have right now. The question is: will they even enable the feature for Maxwell 2, considering that we may be less than a quarter away from the Pascal launch?
 
You've debunked the official information from NV with several links to gossip and AMD's statements. That's not what I'd call a debunking.
For now we know that Maxwell 2 can do graphics+compute concurrently, but this feature is disabled in the drivers we have right now. The question is: will they even enable the feature for Maxwell 2, considering that we may be less than a quarter away from the Pascal launch?

None of the links I posted are gossip. And yeah, Nvidia is just sitting on unused hardware capability because of drivers. I don't think you even bothered to read any of the links I have given, tbh.

[Image: Nvidia slide stating that finer-grained preemption is "a long way off"]


Finer-grained preemption, in Nvidia's own words, is "a long way off".
 
Why does it feel like every tech-related thread turns into a slap-off between dr_rus and icecold?







I'm running Hitman anywhere between 40 and 60 with both shadow settings on medium and everything else on max on a GTX 970. DX11 - I haven't even tried DX12 because the benchmark hangs, so why even bother with the game proper.
 
None of the links I posted are gossip. And yeah, Nvidia is just sitting on unused hardware capability because of drivers. I don't think you even bothered to read any of the links I have given, tbh.

[Image: Nvidia slide stating that finer-grained preemption is "a long way off"]


Finer-grained preemption, in Nvidia's own words, is "a long way off".

I'm rather sure that I've read more on this whole topic than you, so your links are unlikely to provide me with anything I don't know.

Finer-grained preemption isn't a requirement for running compute in parallel with graphics. It's what makes GCN's way of using concurrent compute to fill the gaps in the graphics pipeline meaningful, but this isn't as important for an architecture where there are no such gaps in the first place - which is specifically what Maxwell is.

As for NV sitting on something because of the drivers - yeah, this can easily be the case for Maxwell, as concurrent compute on NV's architecture can hurt performance much more easily than it improves it (the 270X results in the game this thread is about should demonstrate this to you; it's also what you've been saying in your own way). And for it to benefit the cards in general, NV has to do a lot of driver work, implementing heuristics that decide when and how to use it so that it doesn't reduce performance instead of increasing it.

The question I have is: will they release what they have now, or will they just sell Pascal as the cards with a better implementation of the feature? One can see how the second option would be to their benefit.
 
Is it time for a new CPU? With my 2500K and 390 I get so much stutter in the Paris map - it drops from 50-60 to 35-45 all the time :(
I'm also only using 1333MHz DDR3.
 
I'm very sad - it autodetected everything to low and it still stutters, especially when I go outside or enter crowded rooms :(
 
I would like to know which GPUs they are using. I think that might provide pretty good context for their debates ;)

980Ti, G1 version. But I've used loads of different cards from all vendors during my years as a h/w reviewer -) 7970 was the last AMD card I had though.
 
I would like to know which GPUs they are using. I think that might provide pretty good context for their debates ;)

980 Ti, not that it's relevant.

980Ti, G1 version. But I've used loads of different cards from all vendors during my years as a h/w reviewer -) 7970 was the last AMD card I had though.

Lol, backfire there. I'm solidly behind icecold on this though: I don't think simultaneous async is possible on NV hardware, but I also don't think the performance increases AMD is seeing are due to async - I just think poor DX11 driver overhead was holding their hardware back.
 
So assuming my 970 is not having the issue where it's running at half clock speed...

Why the hell am I getting such atrocious framerates on Paris on all LOW settings, and why can't I increase the settings past Low in the launcher? I watched a guy on YouTube with identical GPU/CPU to mine and he was getting 35-60 on High settings.
 
Performance is not good in this game, and I think something's wrong and needs to be patched. FPS was all over the place until I lowered shadow resolution to medium.
Now it's mostly at 60 FPS, but it dips to 46-51 when I enter a room with a mirror. Yet when I enter a room in the palace with a lot of people, like for instance the bar or the walkway, it stays at 60fps.
 
I'm rather sure that I've read more on this whole topic than you, so your links are unlikely to provide me with anything I don't know.

Finer-grained preemption isn't a requirement for running compute in parallel with graphics. It's what makes GCN's way of using concurrent compute to fill the gaps in the graphics pipeline meaningful, but this isn't as important for an architecture where there are no such gaps in the first place - which is specifically what Maxwell is.

As for NV sitting on something because of the drivers - yeah, this can easily be the case for Maxwell, as concurrent compute on NV's architecture can hurt performance much more easily than it improves it (the 270X results in the game this thread is about should demonstrate this to you; it's also what you've been saying in your own way). And for it to benefit the cards in general, NV has to do a lot of driver work, implementing heuristics that decide when and how to use it so that it doesn't reduce performance instead of increasing it.

The question I have is: will they release what they have now, or will they just sell Pascal as the cards with a better implementation of the feature? One can see how the second option would be to their benefit.

Without finer-grained preemption, how exactly do you propose async would work?
 
FYI to those with G-Sync monitors, I had to go into the Nvidia control panel and specify G-Sync as the monitor setting, and set V-Sync to off manually for the application to get it to take hold in the game. When I first booted it up and turned off V-Sync in game, I still got bad frame stutter and tearing.

The game is super smooth now that I was able to force those settings in the control panel, though.
 
I agree with everyone who says async is impossible on Nvidia cards.
Every test that was done on Beyond3D clearly showed there was no parallelism between the graphics and compute work queues.

What it showed, however, is that it was possible to get the same performance if it switched contexts with some logic (by grouping the workloads together).
 
I agree with everyone who says async is impossible on Nvidia cards.
Every test that was done on Beyond3D clearly showed there was no parallelism between the graphics and compute work queues.

What it showed, however, is that it was possible to get the same performance if it switched contexts with some logic (by grouping the workloads together).

The same performance as running them serially, as it already does in DX11? It's doing exactly that in DX12.
 
The same performance as running them serially, as it already does in DX11? It's doing exactly that in DX12.

I'm not sure if I'm right on this part, but is it normal for DX11 games to introduce compute workloads?

I also think that Maxwell cards have something akin to the SMT found in the GCN architecture. But apparently they are limited to 32 parallel workloads, and for some reason it doesn't work in parallel with the graphics work queue, despite there being nothing that should stop them from working together.
 
I agree with everyone who says async is impossible on Nvidia cards.
Every test that was done on Beyond3D clearly showed there was no parallelism between the graphics and compute work queues.

What it showed, however, is that it was possible to get the same performance if it switched contexts with some logic (by grouping the workloads together).

It would be rather strange if any test on a driver which doesn't allow async compute had shown it to be working.

Without finer-grained preemption, how exactly do you propose async would work?

You can run compute concurrently from within the graphics context.
You can run async compute on some SMs while others are doing graphics.
This is just off the top of my head; I'm sure there are other options possible without needing fine-grained preemption to run GCN-style queue interleaving.
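
To put the "separate pipeline" option in API terms, here is a minimal D3D12 sketch (my own illustration, assuming only a D3D12-capable device; it says nothing about whether any particular GPU actually overlaps the two streams underneath): compute work goes to its own queue, and a fence keeps the graphics queue from consuming the results too early.

```cpp
#include <d3d12.h>
#include <wrl/client.h>
#pragma comment(lib, "d3d12.lib")

using Microsoft::WRL::ComPtr;

// Minimal illustration: the graphics queue waits (on the GPU timeline) for the
// compute queue to reach a fence value before consuming its results.
int main() {
    ComPtr<ID3D12Device> device;
    D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device));

    D3D12_COMMAND_QUEUE_DESC gfxDesc = {};
    gfxDesc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    D3D12_COMMAND_QUEUE_DESC cmpDesc = {};
    cmpDesc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;

    ComPtr<ID3D12CommandQueue> gfxQueue, cmpQueue;
    device->CreateCommandQueue(&gfxDesc, IID_PPV_ARGS(&gfxQueue));
    device->CreateCommandQueue(&cmpDesc, IID_PPV_ARGS(&cmpQueue));

    ComPtr<ID3D12Fence> fence;
    device->CreateFence(0, D3D12_FENCE_FLAG_NONE, IID_PPV_ARGS(&fence));

    // ... compute command lists would be executed on cmpQueue here ...
    cmpQueue->Signal(fence.Get(), 1);   // compute queue marks its work done

    // The graphics queue stalls at this point of ITS stream until the fence
    // reaches 1; any graphics work submitted before the Wait is free to
    // overlap with the compute work above (if the hardware chooses to).
    gfxQueue->Wait(fence.Get(), 1);
    // ... graphics command lists that read the compute output go here ...
    return 0;
}
```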
 
I'm not sure if I'm right on this part, but is it normal for DX11 games to introduce compute workloads?

I also think that Maxwell cards have something akin to the SMT found in the GCN architecture. But apparently they are limited to 32 parallel workloads, and for some reason it doesn't work in parallel with the graphics work queue, despite there being nothing that should stop them from working together.

Compute works just fine under DX11. Compute is just work done (typically by the shader units only) that bypasses all the other steps in the rendering pipeline; that's why it's often faster than regular rendering for many algorithms. As it stands now, Maxwell has 1+31 queues for graphics and compute respectively. It processes them serially, one after the other, but they cannot be executed in parallel.
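
As a concrete illustration of "compute under DX11", here is a bare-bones sketch of my own (unrelated to what Hitman's renderer actually does): a compute shader is bound and dispatched on the immediate context, with no vertex/pixel shaders or render targets involved. Error handling is omitted.

```cpp
#include <d3d11.h>
#include <d3dcompiler.h>
#include <cstring>
#pragma comment(lib, "d3d11.lib")
#pragma comment(lib, "d3dcompiler.lib")

// Trivial compute shader: doubles each value in a structured buffer.
static const char* kCS = R"(
RWStructuredBuffer<float> data : register(u0);
[numthreads(64, 1, 1)]
void main(uint3 id : SV_DispatchThreadID) { data[id.x] *= 2.0f; }
)";

int main() {
    ID3D11Device* device = nullptr;
    ID3D11DeviceContext* ctx = nullptr;
    D3D11CreateDevice(nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr, 0,
                      nullptr, 0, D3D11_SDK_VERSION, &device, nullptr, &ctx);

    ID3DBlob* blob = nullptr;
    D3DCompile(kCS, strlen(kCS), nullptr, nullptr, nullptr,
               "main", "cs_5_0", 0, 0, &blob, nullptr);
    ID3D11ComputeShader* cs = nullptr;
    device->CreateComputeShader(blob->GetBufferPointer(), blob->GetBufferSize(),
                                nullptr, &cs);

    // A structured buffer with a UAV so the compute shader can write to it.
    D3D11_BUFFER_DESC bd = {};
    bd.ByteWidth = 1024 * sizeof(float);
    bd.BindFlags = D3D11_BIND_UNORDERED_ACCESS;
    bd.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
    bd.StructureByteStride = sizeof(float);
    ID3D11Buffer* buf = nullptr;
    device->CreateBuffer(&bd, nullptr, &buf);
    ID3D11UnorderedAccessView* uav = nullptr;
    device->CreateUnorderedAccessView(buf, nullptr, &uav);

    // No rasterizer state, no render targets: the work goes straight through
    // the compute path.
    ctx->CSSetShader(cs, nullptr, 0);
    ctx->CSSetUnorderedAccessViews(0, 1, &uav, nullptr);
    ctx->Dispatch(1024 / 64, 1, 1);
    return 0;
}
```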

It would be rather strange if any test of a driver which doesn't allow async compute would have shown it to be working.



You can run compute concurrently from within the graphics context.
You can run async compute on some SMs while others are doing graphics.
This is just off the top of my head; I'm sure there are other options possible without needing fine-grained preemption to run GCN-style queue interleaving.

That doesn't answer the question. How do you propose the GPU introduces parallel commands when it cannot preempt what's currently being worked on until all the work is finished? It's CONFIRMED that a context switch is required to change between graphics and compute, but Nvidia can only context switch at draw call boundaries.
 
Is there no borderless fullscreen? Fullscreen and exclusive fullscreen are the same for me, and I get massive tearing with FreeSync.
 
Is there no borderless fullscreen? Fullscreen and exclusive fullscreen are the same for me, and I get massive tearing with FreeSync.

Are you using DX11 or DX12?
I don't have FreeSync, but in the DX12 benchmark I saw a lot of tearing with borderless.
DX11 borderless seems to work as expected.
 
That doesn't answer the question. How do you propose the GPU introduces parallel commands when it cannot preempt what's currently being worked on until all the work is finished? It's CONFIRMED that a context switch is required to change between graphics and compute, but Nvidia can only context switch at draw call boundaries.
I did answer your question. A context switch is required when you need to run different contexts on the same pipeline. There is no need to perform a context switch if you're running compute warps within the graphics context, and there is no need for one if you're running compute warps on a separate pipeline (which is what GCN is doing and what Maxwell should be able to do as well once - if - they release a driver with concurrent async support).
 
I did answer your question. A context switch is required when you need to run different contexts on the same pipeline. There is no need to perform a context switch if you're running compute warps within the graphics context, and there is no need for one if you're running compute warps on a separate pipeline (which is what GCN is doing and what Maxwell should be able to do as well once - if - they release a driver with concurrent async support).

You are wrong. Compute and graphics share the same functional unit on Maxwell; that's why they can't be run concurrently. It's one or the other, with a context switch required any time you want to change between the two. Graphics and compute are different contexts, so if you are running compute warps, you already switched contexts to get there.
 
You are wrong. Compute and graphics share the same functional unit on Maxwell; that's why they can't be run concurrently. It's one or the other, with a context switch required any time you want to change between the two. Graphics and compute are different contexts, so if you are running compute warps, you already switched contexts to get there.

And what unit would that be? The command processor? It's more than powerful enough to handle them in parallel, as it handles 32 compute queues easily and is able to handle 1+31 according to the documentation we have.

I'll PM you because I'm getting tired of this.
 
Are you using DX11 or DX12?
I don't have FreeSync, but in the DX12 benchmark I saw a lot of tearing with borderless.
DX11 borderless seems to work as expected.
DX12. I don't really need borderless because FreeSync only works in fullscreen, but it's weird that the two fullscreen options are exactly the same and I get tearing.
 
Uh, came in here to see how Hitman performs on PC. Apparently things got somewhat off-topic. I'd rather this thread didn't get locked, guys - that discussion is probably best saved for an Nvidia/AMD thread.

Anyway - sad to see people struggling to get the game to run decently. Hoping my 980/i5-4690K isn't an issue at 1440p. I've already had to start toning down settings in most new releases to hit 60FPS there, and I have a feeling this game will continue the trend.
 
You are wrong. Compute and graphics share the same functional unit on Maxwell; that's why they can't be run concurrently. It's one or the other, with a context switch required any time you want to change between the two. Graphics and compute are different contexts, so if you are running compute warps, you already switched contexts to get there.

And what unit would that be? The command processor? It's more than powerful enough to handle them in parallel, as it handles 32 compute queues easily and is able to handle 1+31 according to the documentation we have.

I'll PM you because I'm getting tired of this.

I personally believe that dr_rus is right on the inner details of this discussion.

The fact that NVIDIA hardware does not allow powerful fine-grained preemption does not mean that different graphics/compute command buffers cannot run in parallel because of hardware hazards due to shared functional units.

Otherwise, two consecutive graphics command buffers stored in the same graphics queue (I mean this, not two consecutive draw calls inside the same command buffer) could never run simultaneously, because they obviously share the same functional units, and this is clearly not the case (to be exact, I mean the second one starting before the first one has completely exited the pipeline).

If this were the case, it would be a big inefficiency in NVIDIA hardware that would have the side effect of making useless all the new Vulkan/DX12 sync mechanisms that allow explicit sync/control between command buffers. In other words, it would mean that for a command buffer to start execution, the previous one would have to have completely exited the pipeline, which would imply a top-of-the-pipe flush before each command buffer is executed. This has not been the behavior of NVIDIA hardware for many generations.
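
For what it's worth, here is a minimal D3D12 sketch (my own illustration, not taken from any of the drivers or games discussed here) of the kind of explicit sync the post above refers to: within a single queue, submission order is guaranteed, but the GPU is allowed to overlap execution, and a barrier is how the application requests ordering when there is an actual hazard.

```cpp
#include <d3d12.h>
#include <wrl/client.h>
#pragma comment(lib, "d3d12.lib")

using Microsoft::WRL::ComPtr;

int main() {
    ComPtr<ID3D12Device> device;
    D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&device));

    ComPtr<ID3D12CommandAllocator> alloc;
    device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT,
                                   IID_PPV_ARGS(&alloc));
    ComPtr<ID3D12GraphicsCommandList> list;
    device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT, alloc.Get(),
                              nullptr, IID_PPV_ARGS(&list));

    // ... first batch of work (e.g. a dispatch writing through a UAV) ...

    // UAV barrier with a null resource = "all preceding UAV writes must be
    // visible before subsequent UAV accesses". Without it, the GPU is free to
    // let the two batches overlap in the pipeline.
    D3D12_RESOURCE_BARRIER barrier = {};
    barrier.Type = D3D12_RESOURCE_BARRIER_TYPE_UAV;
    barrier.UAV.pResource = nullptr;
    list->ResourceBarrier(1, &barrier);

    // ... second batch of work that consumes the first batch's output ...
    list->Close();
    return 0;
}
```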
 
Uh, came in here to see how Hitman performs on PC. Apparently things got somewhat off-topic. Rather this thread doesn't get locked guys - that discussion is probably best saved for an Nvidia/AMD thread.

Anyway - sad to see people struggling to get the game to run decently. Hoping my 980/i5-4690k isn't an issue at 1440p. Already had to start toning down settings in most new releases to hit 60FPS there and am feeling this game will continue the trend.

Yeah, sorry about that.

To get things back on topic: Hitman in DirectX 12 mode - GPU test (GameGPU)

GameGPU's tests are actually showing a drop in DX12 for all h/w involved.

[GameGPU benchmark charts: 1920x1080 and 2560x1440, DX11 vs DX12]
 
DX12 is working really well on my GTX 970 - latest driver and no crashes so far.

Only had to turn down both shadow resolution and shadow maps to medium; FPS is around the mid-70s mark.
 
Two theories:
1. Different memory bandwidth may lead to a different load distribution across the GPU's units.
2. Concurrent async compute gains are variable in general because of how they are sliced into the main graphics pipeline.


No, this would be true for everyone. The code written to run across several engines doesn't differ much from the code which can run on one engine. The only thing which is different is the batching of compute and graphics jobs together for the single queue.


All DX12 GPUs support async compute.
All AMD GCN GPUs support concurrent async compute.
All GCN versions support it a bit differently.
And all GCN GPUs support it a bit differently because of the differences in caches and memory systems.

Maxwell can perform graphics and compute concurrently but the support for this isn't enabled yet in the driver. No matter how many times you say that it can't - the fact that it can won't change. This fact is written in Maxwell's whitepaper and has been confirmed by pretty much every party involved - with the only exception of AMD, who just likes to lie I guess.


GCN's implementation of async compute is definitely far from "proper".



That's because the R9 290's base clock is 947MHz, while the R9 390's is 1GHz. Otherwise, it's the same chip.
 
Hi guys, I want to play this game, but I'm not sure if my CPU is too much of a bottleneck.
My PC:
CPU: Intel i3 540 (3.07 GHz)
GPU: GTX 750 Ti
RAM: 8GB DDR3

CPU is below the min requirements, but I'm not looking for fancy graphics. Just wondering if I'll be able to run it on low/med settings on, say, 1600x900
 
Hi guys, I want to play this game, but I'm not sure if my CPU is too much of a bottleneck.
My PC:
CPU: Intel i3 540 (3.07 GHz)
GPU: GTX 970 Ti
RAM: 8GB DDR3

CPU is below the min requirements, but I'm not looking for fancy graphics. Just wondering if I'll be able to run it on low/med settings on, say, 1600x900

970 Ti? I think you need to check that again.
 