980ti Issue?

Posted: Sun Feb 05, 2017 7:00 pm
by abravo
I am finally going to post regarding my issue, because I can't seem to solve it myself.

I am experiencing a "crash" (not sure what else to call it) with my EVGA 980ti Classified. After I start Folding@home, the computer (a dedicated folding rig) runs fine for a while. Then, at varying times after Windows 10 automatically turns the display off, the computer stops folding on the CPU and the display will not wake from its turned-off state. If I disable the "turn off display" setting in Windows 10, the computer does not "crash." That has led me to believe that the problem is related to the GPU. The only way to get the display back is to hit the reset button and reboot the computer.

I have tried the following Nvidia drivers: 362.00, 378.49, and 372.90. I am currently running 372.90 because it looked like that driver had the highest PPD for my 980ti.

Just for background, I have seven other computers that are Folding, and none of them experience this type of "crash" after their displays turn off. I do not have another 980ti, but I have three 960s, three 980s, and one 1080.

Does anyone have a theory as to why the 980ti rig is crashing?

Re: 980ti Issue?

Posted: Sun Feb 05, 2017 7:26 pm
by PS3EdOlkkola
Sounds like your system is going into power saving mode after a period of time. You may want to set your power profile to "performance", which prevents the system from dropping into a lower power state and stopping folding. You should still be able to let the system turn your monitor off and continue to fold normally.
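The power-profile suggestion above can also be applied from the command line with the Windows `powercfg` tool. The following is a minimal sketch, not a recommended configuration: the function names are hypothetical, the 10-minute monitor timeout is an arbitrary example value, and the commands are only built here, not executed.

```python
import subprocess


def high_performance_commands(monitor_timeout_min=10):
    """Build powercfg command lines (nothing is executed here).

    SCHEME_MIN is powercfg's alias for the "High performance" plan.
    The monitor timeout default is an illustrative assumption.
    """
    return [
        # Switch to the High performance power plan.
        ["powercfg", "/setactive", "SCHEME_MIN"],
        # Never put the machine to sleep on AC power (0 = never).
        ["powercfg", "/change", "standby-timeout-ac", "0"],
        # Still allow the monitor to turn off after N minutes.
        ["powercfg", "/change", "monitor-timeout-ac", str(monitor_timeout_min)],
    ]


def apply_commands(commands):
    """Run the commands on a Windows machine (hypothetical helper)."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

On the folding rig, `apply_commands(high_performance_commands())` would apply all three settings; `powercfg /getactivescheme` shows which plan is active afterwards.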

Re: 980ti Issue?

Posted: Sun Feb 05, 2017 8:17 pm
by abravo
Thanks for the fast reply.

Well, I guess what I said above is no longer current information. I just went down to check on my computer, which had been folding in Windows "High Performance" mode, and I had the display set to never turn off, and the computer had "crashed" just like before. The odd thing is that I have a Corsair H110i GT on my 4930K, and I have the RGB block color set to change based on CPU temperature, and the color was somewhere between where it is at idle state (blue), and where it is during folding (white). So it appears that the CPU is still doing something in the "crashed" state. The other strange thing to add is that the fans on the 980ti are still spinning, and there is a little bit of heat being generated from the card, but nowhere near what it would be at full folding. So it seems as though the GPU is still doing something in the "crashed" state and is not completely idle.

I am going to pause the 4930K from folding, just let the 980ti fold, and allow the display to turn off. I want to see what happens if I take the CPU out of the scenario.

I will report back.

Re: 980ti Issue?

Posted: Sun Feb 05, 2017 9:41 pm
by PS3EdOlkkola
Are you using a GPU utility such as Afterburner or GPU-Z to check the temperature of the 980ti? It could be that the 980ti is overheating and crashing the display driver. Sometimes the driver will recover on its own, but not always. Also, if the 980ti is overclocked, you might want to lower the GPU clock to stock speeds to see if that has an impact. Another consideration: if the GPU is not getting enough power, it can drop from 3D mode to 2D mode, lowering the GPU clock rate to just above idle (405MHz, if I recall correctly), so it will still fold, just a lot more slowly. The scenario could be this: the GPU overheats and/or lacks sufficient power to fold at 3D performance levels, the display driver crashes, recovers but can't resync the monitor, and the GPU continues to fold in 2D mode. That would explain the moderate amount of heat generated by the GPU and why your CPU still shows it's folding.
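Besides Afterburner or GPU-Z, the temperature, graphics clock, and power draw can be logged with `nvidia-smi`, which ships with the NVIDIA driver. This is a rough sketch under stated assumptions: the helper names are hypothetical, the 405 MHz figure is the recalled value from the post above, and the 50 MHz margin is an arbitrary choice.

```python
import subprocess

# Query fields understood by nvidia-smi's --query-gpu option.
QUERY_FIELDS = ["temperature.gpu", "clocks.gr", "power.draw"]


def query_command(fields=QUERY_FIELDS):
    """Build the nvidia-smi command line (not executed here)."""
    return ["nvidia-smi", "--query-gpu=" + ",".join(fields),
            "--format=csv,noheader,nounits"]


def _to_float(text):
    """Return a float, or None for values such as [N/A]."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def parse_sample(line):
    """Parse one CSV line: temperature (C), clock (MHz), power (W)."""
    temp, clock, power = (_to_float(part) for part in line.split(","))
    return {"temp_c": temp, "clock_mhz": clock, "power_w": power}


def looks_throttled(sample, idle_clock_mhz=405.0, margin_mhz=50.0):
    """True if the clock sits near the assumed 2D/idle clock."""
    clock = sample["clock_mhz"]
    return clock is not None and clock <= idle_clock_mhz + margin_mhz


def read_gpu():
    """Take one reading from the first GPU (needs the NVIDIA driver)."""
    out = subprocess.run(query_command(), capture_output=True,
                         text=True, check=True).stdout
    return parse_sample(out.splitlines()[0])
```

Calling `read_gpu()` every few seconds and writing the result to a file would show whether the clock falls toward the idle value just before the display stops responding.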

Re: 980ti Issue?

Posted: Sun Feb 05, 2017 10:03 pm
by artoar_11
Try this:
NVIDIA Control Panel -> Manage 3D settings -> Power management mode -> Prefer maximum performance

Re: 980ti Issue?

Posted: Mon Feb 06, 2017 7:03 pm
by bruce
Check the Windows Event Log. Is a GPU error noted at whatever time the monitor seems to hang?

Are all your systems using the same version of Windows and the same driver version (if applicable)?

Re: 980ti Issue?

Posted: Mon Feb 06, 2017 11:22 pm
by abravo
I am getting the following error every four seconds (for several minutes) until the crash:

System
  - Provider
      [ Name] Display
  - EventID 4101
      [ Qualifiers] 0
    Level 3
    Task 0
    Keywords 0x80000000000000
  - TimeCreated
      [ SystemTime] 2017-02-06T22:42:06.505990300Z
    EventRecordID 63486
    Channel System
    Computer DESKTOP-**********
    Security
  - EventData
    nvlddmkm
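To check how often this event recurs without clicking through Event Viewer, the System log can be exported with `wevtutil` and the timestamps compared. The sketch below assumes the XML output format of `wevtutil qe ... /f:xml`; the function names are hypothetical and the parsing only looks at the `SystemTime` attribute.

```python
import re
from datetime import datetime, timedelta

# XPath filter for Display events with ID 4101, as in the log entry above.
TDR_QUERY = "*[System[Provider[@Name='Display'] and (EventID=4101)]]"

_TIME_RE = re.compile(
    r"SystemTime=['\"](\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z['\"]"
)


def wevtutil_command(count=50):
    """Build a query for the newest `count` matching events (not executed)."""
    return ["wevtutil", "qe", "System", "/q:" + TDR_QUERY,
            "/c:" + str(count), "/rd:true", "/f:xml"]


def event_times(xml_text):
    """Extract SystemTime values, sorted oldest first."""
    times = []
    for base, frac in _TIME_RE.findall(xml_text):
        # Keep at most microsecond precision; the log may carry more digits.
        micros = int((frac + "000000")[:6])
        stamp = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
        times.append(stamp + timedelta(microseconds=micros))
    return sorted(times)


def intervals_seconds(times):
    """Seconds between consecutive events."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]
```

Feeding the output of the command into `event_times` and then `intervals_seconds` should show a run of roughly four-second gaps leading up to each hang, if the pattern described above holds.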

Re: 980ti Issue?

Posted: Tue Feb 07, 2017 2:25 pm
by abravo
artoar_11 wrote: Try this:
NVIDIA Control Panel -> Manage 3D settings -> Power management mode -> Prefer maximum performance
I tried this, but it did not solve the problem.

Re: 980ti Issue?

Posted: Tue Feb 07, 2017 8:05 pm
by bruce
Microsoft says this about EventID 4101:
Windows Vista and Windows Server 2008 can detect when the graphics hardware or device driver take longer than expected to complete an operation. When this happens, Windows attempts to preempt the operation, and restore the display system to a usable state by resetting the graphics adapter. Typically, the only noticeable effect from this is a flicker of the display due to the reset and subsequent screen redraw. For more information, see "Timeout Detection and Recovery of GPUs through WDDM" at http://go.microsoft.com/fwlink/?linkid=77531 on the Microsoft Web site.
In the past, there have been instances where increasing TDR in the Registry was recommended as a short-term solution. FAH took steps to avoid this timeout event in the cases reported.

In fact, this situation depends on the WU, the FAHCore, the characteristics of the GPU that is driving the Windows monitor, and any other software that might be using the GPU (i.e., it might be caused by non-FAH activities). With an issue this complex, they might have missed some combinations that apply to your system.

Gathering all the necessary information and submitting it to Development is one route we can take. I'm going to guess that setting that FAH slot to run with "idle" = "true" may work. Permanently increasing TDR is a possible work-around, too.
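For reference, the TDR timeout mentioned above is controlled by the `TdrDelay` value (a REG_DWORD, in seconds, default 2) under `HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`. The sketch below only builds the `reg add` command; the function name is hypothetical and the 8-second value is an illustrative assumption, not a tested recommendation. Running the command needs an elevated prompt and a reboot before it takes effect.

```python
import subprocess

GRAPHICS_DRIVERS_KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_delay_command(seconds=8):
    """Build a `reg add` command that sets TdrDelay (not executed here).

    The default of 8 seconds is an example value only.
    """
    if not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("TdrDelay must be a positive whole number of seconds")
    return ["reg", "add", GRAPHICS_DRIVERS_KEY,
            "/v", "TdrDelay", "/t", "REG_DWORD",
            "/d", str(seconds), "/f"]


def apply_tdr_delay(seconds=8):
    """Run the command from an elevated session (hypothetical helper)."""
    subprocess.run(tdr_delay_command(seconds), check=True)
```

Deleting the `TdrDelay` value afterwards restores the default behavior, which makes this easy to try as a temporary work-around.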

Re: 980ti Issue?

Posted: Wed Feb 08, 2017 9:35 pm
by foldy
I have a GPU which crashed the driver during folding and during games and left the PC with a black screen. Even though the GPU did not run too hot, the solution was to renew the thermal paste. Since then it never crashed again.

Re: 980ti Issue?

Posted: Wed Feb 08, 2017 10:22 pm
by bruce
foldy wrote: Even though the GPU did not run too hot, the solution was to renew the thermal paste.
Aha. It was running too hot but the thermal sensors didn't detect it.

It's time to design GPUs with thermal sensors on the chip itself.