GeForce GTX 970s seem to have an issue using all 4GB of VRAM, Nvidia looking into it

From the official bandwidthTest from the CUDA SDK.

I'm not sure what that's supposed to prove exactly. The lesser tests should be expected to result in lower bandwidth figures because you're moving such a small amount of data and your last test is 67MB -- not 67MB in addition to what you tested earlier but 67MB in total.
 
I'm not sure what that's supposed to prove exactly. The lesser tests should be expected to result in lower bandwidth figures because you're moving such a small amount of data and your last test is 67MB -- not 67MB in addition to what you tested earlier but 67MB in total.

There I allocated and transferred an entire 4GB chunk in under 0.003s.

Code:
bandwidthTest.exe --dtod --start=400000000 --end=400000000 --increment=3000000 --mode=range --csv
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: GeForce GTX 970
 Range Mode

bandwidthTest-D2D, Bandwidth = 143003.8 MB/s, Time = 0.00267 s, Size = 400000000 bytes, NumDevsUsed = 1
Result = PASS
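As a quick sanity check on those figures (assuming the tool reports "MB/s" in units of MiB/s, as the CUDA samples traditionally do), the bandwidth follows directly from the reported size and time:

```python
# Re-derive the reported D2D bandwidth from Size / Time.
size_bytes = 400_000_000
time_s = 0.00267

bandwidth_mib_s = size_bytes / (1024 * 1024) / time_s
print(f"{bandwidth_mib_s:.1f} MiB/s")  # ~142,873 MiB/s, close to the reported 143003.8
```

The small discrepancy versus 143003.8 comes from the tool printing the time rounded to three significant figures.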
 
I would assume that if they'd advertised it as 2GB, then it would be different hardware. I'd just be interested to know... especially since I ordered a 970 SLI setup only 9 days ago.

The block diagrams at AnandTech indicate it has exactly half the GPCs of the 980, which means it should work well with half the memory (2GB vs. 4GB).

http://www.anandtech.com/show/8923/nvidia-launches-geforce-gtx-960

[Image: GeForce GTX 960 block diagram]
 
I have Samsung Memory and showing at 4096 MB. Am I safe?
 
Ah, strange.

Actually not strange: if it's doing a bandwidth test that internally copies 2GB, it needs a 2GB destination and a 2GB source. After multiple runthroughs, I've noticed there's nothing consistent. It ranges from 40 GB/s to 89 GB/s for the same 2GB copy.
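A sketch of the allocation and timing math here (assuming the test copies one 2 GiB buffer to another, so both must reside in VRAM at once):

```python
copy_bytes = 2 * 1024**3          # one 2 GiB buffer
required_bytes = 2 * copy_bytes   # source + destination both live in VRAM: 4 GiB total

# At the observed bandwidth extremes, one 2 GiB copy takes:
for gb_per_s in (40, 89):
    t_ms = copy_bytes / (gb_per_s * 1e9) * 1000
    print(f"{gb_per_s} GB/s -> {t_ms:.0f} ms per copy")  # ~54 ms and ~24 ms
```

That factor-of-two spread for the same copy is consistent with part of the allocation sometimes landing in a slower region of memory.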
 
Can anyone explain why I have 29 chunks and 3712 MiB allocated while the rest of you seem to have 30 chunks and 3840 MiB?
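Assuming the tool allocates fixed 128 MiB chunks (which both totals in the question imply, since 3712 / 29 = 3840 / 30 = 128), the difference is exactly one chunk's worth of VRAM that was already in use when the benchmark ran:

```python
chunk_mib = 128

print(29 * chunk_mib)  # 3712 -> the 29-chunk total
print(30 * chunk_mib)  # 3840 -> the 30-chunk total

# The missing 128 MiB chunk is VRAM already held by the driver, the OS
# desktop composition, or other applications at allocation time.
```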
 
By the looks of it that test doesn't prove anything.

Anyway I've got an EVGA Superclocked with Samsung memory.
 
Oooerr will have to keep my eye on this, I haven't played anything that will even come close to 3.5GB VRAM usage yet, but this is good to know. I'll wait to hear more from Nvidia!
 
Single card and a single monitor.

Hmmm ok thought it would be related to that but I guess not.

I don't get the flashing screen, but my driver crashes too. Running the 347.25 drivers, for the record.

This whole thing leaves a bad taste in my mouth. I'm still waiting for MFAA support for SLI and the tessellation in AC:Unity, by the way. I thought installing the new drivers would fix the god-awful flashing water textures in that damn game, but nope!

One thing's for sure though: that's the last time I go SLI.
 
Has there been any test on any 4GB nVidia cards (any model) that passed this test?

I'm starting to think the benchmark tool, or the way it talks to the driver, is the issue.
 
Has there been any test on any 4GB nVidia cards (any model) that passed this test?

I'm starting to think the benchmark tool, or the way it talks to the driver, is the issue.

I ran it on my 2GB 670 and only the last two chunks reported a drop in memory bandwidth, which could very easily be attributed to OS overhead.
 
I have an MSI 970 and have had no problems, but I upgraded from a 470, so it's no wonder it feels like a massive performance boost regardless of whether it's 3.5 or 4 gigs.
 
Has there been any test on any 4GB nVidia cards (any model) that passed this test?

I'm starting to think the benchmark tool, or the way it talks to the driver, is the issue.

There's a picture of a 980 in the OP. The benches in the OP were run with Windows running on the IGP, not the graphics card. The 980 clears it with full performance for all chunks.
 
There's a picture of a 980 in the OP. The benches in the OP were run with Windows running on the IGP, not the graphics card. The 980 clears it with full performance for all chunks.

I used the IGP on my i5-2500 for years before I got the 970. Can I set Windows to run on the IGP and still take full advantage of the 970 in games? And would there be any benefit to doing so?
 
Huh, I thought this was odd in my FFXIV System Config earlier:
Code:
SYSTEM_GRAPHICS_VRAM	3072.000 MB
SYSTEM_GRAPHICS_SHARED_VRAM	1023.938 MB
Anyways,

Hynix; fails at about 3GB.
Gigabyte GeForce GTX 970 G1 Gaming GV-N970G1 GAMING-4GD
 
Just skimmed the thread so sorry if I missed it:
Are there any reports of people experiencing issues in games (meaning issues that start at ~3GB VRAM usage) that are possibly related to this? The benchmark looks strange but so far it seems like the only indicator that something is wrong.

EDIT: Just saw the case of games using different amounts of memory depending on whether a 970 or 980 is used in the exact same scene.
 
I used the IGP on my i5-2500 for years before I got the 970. Can I set Windows to run on the IGP and still take full advantage of the 970 in games? And would there be any benefit to doing so?

Not really. The IGP is mostly useful as an extra monitor port or if you need its features (like Quick Sync, but AMD/Nvidia GPUs all have their own version of that now). Generally games will use the GPU that monitor 1 is connected to, I think.

If you need 100% of the performance your GPU is capable of, run games in exclusive fullscreen. Running both integrated and dedicated graphics at the same time is just asking for bugs and drama.
 
I'll run the test when I get home, unless it's actually been 100% proven to be useless. I've got a pair of launch MSIs.
 
It's crazy how this is being discovered only now. Glad I waited. Hopefully they'll fix the issue, because I'm not really willing to put up $200+ more for a 980.
 
For anyone who is interested, here is the link to the source code of the benchmark (post 20 by Nai).

Looking at the source code, this is not purely testing bandwidth. Due to the computations inside of benchmarkdramkernel, it's also a measure of compute performance.

If you wanted to test just plain bandwidth, you would spend some time populating another area of memory (without timing it), then use cudaMemcpy to copy over the given chunk a number of times. As it stands, the ALU has to spend cycles on the arithmetic inside the benchmark functions.
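As a sketch of that copy-only methodology (host memory standing in for VRAM and a plain buffer copy standing in for cudaMemcpy, so the figure reflects system RAM rather than the GPU):

```python
import time

# Time repeated copies of a pre-populated buffer. No arithmetic is done
# on the data inside the timed region, so the result reflects copy
# bandwidth alone rather than compute throughput.
size = 64 * 1024 * 1024          # 64 MiB test chunk
src = bytearray(size)            # populate once, outside the timed region
dst = bytearray(size)

reps = 10
start = time.perf_counter()
for _ in range(reps):
    dst[:] = src                 # analogous to a device-to-device cudaMemcpy
elapsed = time.perf_counter() - start

mib_per_s = reps * size / (1024 * 1024) / elapsed
print(f"{mib_per_s:.0f} MiB/s")
```

A CUDA version would do the same thing with cudaMemcpy in the loop and cudaDeviceSynchronize before stopping the timer, since kernel and copy calls are asynchronous.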
 
I don't think there is anything Nvidia can do, it seems to be a structural hardware design issue.

I don't like this, at all. I can see myself switching to the red team because of things like this.
 
Thought I would chime in with some 980 results from that program. It seems to drop off for me too...

[Image: benchmark results]


I have a launch 980 (Palit) with Samsung memory :)
 
I've (probably) seen this problem in practice a couple of times with my 970 G1 Rev 1.0, most recently last week while playing the Evolve beta. Even at 1080p the game uses over 3.5-3.6GB of VRAM, and sometimes when I join or load a game I get 5-10fps for a while before it goes back to the solid 60fps I have the rest of the time. Most VRAM-heavy games seem to mysteriously hover at around 3.5GB, never going above it. Meanwhile, 980 users are reporting memory usage of well over 3.5GB with similar settings in those games.
 