GPU or WU issue?

If you think it might be a driver problem, see viewforum.php?f=79

Moderators: Site Moderators, FAHC Science Team

hiigaran
Posts: 134
Joined: Thu Nov 17, 2011 6:01 pm

GPU or WU issue?

Post by hiigaran »

A few minutes ago, I noticed my folding rig started making more noise every few seconds. A quick investigation showed that the GPU0 fan, which is set to run at 100%, has apparently been running much faster for brief periods of time, going from 4200 RPM at what is supposed to be 100%, to 6000 RPM. A quick inspection of the temperatures revealed that the card was hitting 95 degrees. Another card with the exact same make, model, airflow setup and fan speed settings showed 70 degrees on load. Naturally, I have paused the GPU until the cause is found.

WU PRCG at the time was 9415 (192, 0, 303). Anyone else been having temperature issues with similar WUs?

Also, does this mean I'm not truly running my fans at 100%, or is it actually possible to push the fans harder? If the latter, anyone know how on Linux (naturally, it would reduce the lifespan, but hey, folding cards don't last long anyway)?
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: GPU or WU issue?

Post by bruce »

It's difficult to diagnose without access to the actual hardware.

It's possible you're seeing the early stage of a fan failure. Fan bearing (lubrication) failures plague some of the least-expensive fans on the market. Whether your fan has a long or short life depends on the components your GPU's manufacturer used, but the first sign of this type of failure is an intermittent "strange" noise. Personally, I'd get an RMA for the GPU and let them figure it out ... or if that's not possible, get the fan replaced with a better-quality/newer fan.
hiigaran
Posts: 134
Joined: Thu Nov 17, 2011 6:01 pm

Re: GPU or WU issue?

Post by hiigaran »

The noise isn't strange in the sense that it sounds like it's dying. It was the kind of noise you would hear from faster fans and increased airflow. No scraping, no clinking, nothing of that kind.

Oddly enough, the temperature issue went away after I paused the slot, waited for the card to cool down, then resumed the same WU.

No idea what happened, but I can say with confidence that the fan does not show any symptoms of dying or underperforming. The other 7 cards in the cluster sound exactly the same.
SteveWillis
Posts: 409
Joined: Fri Apr 15, 2016 12:42 am
Hardware configuration: PC 1:
Linux Mint 17.3
three gtx 1080 GPUs One on a powered header
Motherboard = [MB-AM3-AS-SB-990FXR2] qty 1 Asus Sabertooth 990FX(+59.99)
CPU = [CPU-AM3-FX-8320BR] qty 1 AMD FX 8320 Eight Core 3.5GHz(+41.99)

PC2:
Linux Mint 18
Open air case
Motherboard: ASUS Crosshair V Formula-Z AM3+ AMD 990FX SATA 6Gb/s USB 3.0 ATX AMD
AMD FD6300WMHKBOX FX-6300 6-Core Processor Black Edition with Cooler Master Hyper 212 EVO - CPU Cooler with 120mm PWM Fan
three gtx 1080,
one gtx 1080 TI on a powered header

Re: GPU or WU issue?

Post by SteveWillis »

I have a GPU (out of 11) that intermittently spikes into the mid 90s for a few minutes, then drops back down to the low 80s. It may or may not be significant that it is GPU 0. I don't worry about it, since most of the time it is within the acceptable range, and I hope that if it does fail, it's within the warranty period. You might try taking the side panel off your case and directing a strong fan into the case. Even with good case fans, ventilation is kind of iffy. My best temperatures are with open-air cases with fans blowing across them.
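For anyone who wants to catch spikes like this without watching the rig, here is a minimal sketch that assumes NVIDIA cards with `nvidia-smi` on the PATH. The function names and the 90 degree threshold are made up for illustration; only the `nvidia-smi` query fields (`index`, `temperature.gpu`, `fan.speed`) come from the tool itself.

```python
import subprocess

# Asks nvidia-smi for one CSV line per GPU: index, temperature (C), fan speed (%).
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu,fan.speed",
    "--format=csv,noheader,nounits",
]


def _to_int(field):
    """Return the field as an int, or None for values such as '[N/A]'."""
    try:
        return int(field)
    except ValueError:
        return None


def parse_readings(text):
    """Turn nvidia-smi CSV output into (index, temp_c, fan_pct) tuples."""
    readings = []
    for line in text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            continue  # skip anything that is not a three-column row
        index, temp, fan = (_to_int(p) for p in parts)
        if index is None or temp is None:
            continue  # a row without an index or temperature is unusable
        readings.append((index, temp, fan))
    return readings


def hot_gpus(readings, limit_c=90):
    """Return the indices of GPUs at or above limit_c (hypothetical threshold)."""
    return [index for index, temp, _fan in readings if temp >= limit_c]


def read_gpus():
    """Run nvidia-smi once and return the parsed readings."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_readings(result.stdout)
```

Calling `hot_gpus(read_gpus())` from a cron job or a loop would list the cards that need attention; what to do next (pause the slot, log it, send a mail) is left open.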

1080 and 1080TI GPUs on Linux Mint
hiigaran
Posts: 134
Joined: Thu Nov 17, 2011 6:01 pm

Re: GPU or WU issue?

Post by hiigaran »

I, uhh...don't exactly have a side panel...

In any case, I haven't noticed any other issues since then from any of my cards. Still curious if setting fans past 100% is actually possible from the OS though.
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: GPU or WU issue?

Post by foldy »

I guess it is not possible, because the GPU driver exposes fan values of 0-100%, and the GPU BIOS maps each value to a fan RPM. You could edit the GPU BIOS and set a higher fan RPM for 100% if the fan supports it, but by default the 100% value in the BIOS already matches the highest RPM the fan can do. The fan spike to 6000 RPM you were hearing may have been initiated directly by the GPU BIOS because the temperature reached 95°C, which may be the hard limit in the BIOS for your GPU.
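As a rough illustration of that idea, here is a toy model, not actual firmware behavior. It assumes a linear mapping from the driver's percentage to RPM and a thermal override at the hard limit; the function name is hypothetical, and the numbers (4200 RPM at 100%, 6000 RPM, 95°C) are just the ones reported in this thread.

```python
def fan_rpm(percent, temp_c, rpm_at_100=4200, rpm_max=6000, temp_limit_c=95):
    """Toy model of a BIOS fan table with a thermal override.

    percent      -- fan value requested through the driver (0-100)
    temp_c       -- current GPU temperature in degrees Celsius
    rpm_at_100   -- RPM the BIOS assigns to a 100% request (assumed)
    rpm_max      -- RPM the fan reaches when the BIOS takes over (assumed)
    temp_limit_c -- temperature at which the BIOS ignores the request (assumed)
    """
    # At the hard limit the BIOS overrides whatever the driver asked for.
    if temp_c >= temp_limit_c:
        return rpm_max
    # The driver cannot request more than 100%, so clamp the value.
    percent = max(0, min(100, percent))
    # Assumed linear mapping from percentage to RPM.
    return rpm_at_100 * percent // 100
```

In this model, asking for more than 100% changes nothing, and the only path to 6000 RPM is the temperature override, which matches what was heard at 95°C.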
hiigaran
Posts: 134
Joined: Thu Nov 17, 2011 6:01 pm

Re: GPU or WU issue?

Post by hiigaran »

Hmm, so that would mean that even at the default 100%, the fan speed is still significantly limited by PWM. I wonder if it would be easier to plug the fans directly into a Molex adaptor/mod for pure 12 V. I would assume these fans are tested for continuous 12 V usage, which would mean that there wouldn't be as significant a drop in lifespan as I had originally thought.

Of course, that's a lot of assumptions...
jrweiss
Posts: 707
Joined: Tue Dec 04, 2007 6:56 am
Hardware configuration: Ryzen 7 5700G, 22.40.46 VGA driver; 32GB G-Skill Trident DDR4-3200; Samsung 860EVO 1TB Boot SSD; VelociRaptor 1TB; MSI GTX 1050ti, 551.23 studio driver; BeQuiet FM 550 PSU; Lian Li PC-9F; Win11Pro-64, F@H 8.3.5.

[Suspended] Ryzen 7 3700X, MSI X570MPG, 32GB G-Skill Trident Z DDR4-3600; Corsair MP600 M.2 PCIe Gen4 Boot, Samsung 840EVO-250 SSDs; VelociRaptor 1TB, Raptor 150; MSI GTX 1050ti, 526.98 driver; Kingwin Stryker 500 PSU; Lian Li PC-K7B. Win10Pro-64, F@H 8.3.5.
Location: @Home

Re: GPU or WU issue?

Post by jrweiss »

Since this is 1 of several identical units, you may have a failure of the thermal paste seal between your GPU and its heat sink. If you're adept, you could disassemble it, clean it, and renew the thermal paste.

If all your BIOS settings (computer and GPU, as applicable) are set for continuous 100%, there should be no PWM limiting. However, to test it, you could plug the GPU fan into a 3-pin (non-PWM) fan connector.
Ryzen 7 5700G, 22.40.46 VGA driver; MSI GTX 1050ti, 551.23 studio driver
Ryzen 7 3700X; MSI GTX 1050ti, 551.23 studio driver [Suspended]