511.23 a bad driver?

It seems that a lot of GPU problems revolve around specific versions of drivers. Though NVIDIA has its own support structure, you can often learn from information reported by others who fold.

Moderators: Site Moderators, FAHC Science Team

XanderF
Posts: 42
Joined: Thu Aug 11, 2011 12:25 am

511.23 a bad driver?

Post by XanderF »

As above - just released today, and I updated to it (from 496.49). Have only had 4 projects since switching, but...well, my PPD on all of them is drastically lower than the earlier driver (about half).

Curiously, although the GPU is showing the expected 100%-ish load, it's not getting hot - temperature also significantly lower than previously (indeed the fans are barely spinning).

Wondering if anyone has any different experience with this driver?
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: 511.23 a bad driver?

Post by Neil-B »

I'm seeing, if anything, a very small (sub-1%) upward shift in PPD from this latest driver update so far (on a sample of 4 WUs) ... RTX 3070, Win 11 (latest build)
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

(Green/Bold = Active)
gunnarre
Posts: 567
Joined: Sun May 24, 2020 7:23 pm
Location: Norway

Re: 511.23 a bad driver?

Post by gunnarre »

There are no Covid Moonshot WUs at the moment. Moonshot WUs have a higher PPD than other projects, especially on modest Nvidia GPUs, so I suspect that the lower PPD you're seeing might be because you're folding projects with less PPD on your card. Once more Moonshot sprints go live, you might see your PPD reach the same level.

Does your log say "Using CUDA" when it folds? If you got the driver from Windows rather than straight from Nvidia, you might have gotten a driver without CUDA support, so check that it's actually using CUDA in your log.

One other thing you might check is if your power limit was changed when you upgraded the driver and re-started.
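The "Using CUDA" check above can be scripted if you'd rather not eyeball the log. A minimal sketch, assuming you've read your FAH client log into a string (the sample log lines and the helper name are illustrative, not part of any FAH API):

```python
# Minimal sketch: scan FAH client log text for evidence that the GPU core
# initialised with CUDA rather than silently falling back to OpenCL.
def folding_uses_cuda(log_text: str) -> bool:
    """Return True if any log line reports the core is using CUDA."""
    return any("Using CUDA" in line for line in log_text.splitlines())

# Illustrative log excerpt -- the exact wording in your log may differ.
sample = (
    "21:03:45:WU00:FS00:0x22:Using CUDA and gpu 0\n"
    "21:03:46:WU00:FS00:0x22:Completed 0 out of 2500000 steps (0%)\n"
)
print(folding_uses_cuda(sample))  # True when a CUDA line is present
```

If this prints False while a GPU slot is folding, the driver may have been installed without CUDA support (e.g. via Windows Update), which matches the symptom described above.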
Online: GTX 1660 Super, GTX 1080, GTX 1050 Ti 4G OC, RX580 + occasional CPU folding in the cold.
Offline: Radeon HD 7770, GTX 960, GTX 950
toTOW
Site Moderator
Posts: 6296
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France
Contact:

Re: 511.23 a bad driver?

Post by toTOW »

The 511.23 drivers seem a little faster (about 1 to 3 seconds less per frame) on all my GPUs than the ones I had before (from November).

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
Veablo
Posts: 5
Joined: Sat Jan 15, 2022 2:19 pm

Re: 511.23 a bad driver?

Post by Veablo »

After updating to 511.23, GPU load is 100% (ish) but the temperature is around 40 °C, and the ETA on the current WU is over 4 hours.
After a reboot, the temperature at 100% load is 68 °C and the ETA on the current WU is down to 2 hours 25 min.
When the current WU is completed, the GPU goes cold at 100% load again. Another reboot fixes it...

Update:
It happens during folding also.
Went from TOTAL ESTIMATED PPD 3,579,108 to TOTAL ESTIMATED PPD 1,845,936 suddenly.

RTX-3060TI LHR
XanderF
Posts: 42
Joined: Thu Aug 11, 2011 12:25 am

Re: 511.23 a bad driver?

Post by XanderF »

Veablo wrote: Went from TOTAL ESTIMATED PPD 3,579,108 to TOTAL ESTIMATED PPD 1,845,936 suddenly.
Is that after the reboot, or before? Because that's the sort of thing I was seeing - MASSIVE drop. I'm on a regular 3060, though, so my numbers were more like 2 million PPD before and then down to under 750k. And same with the utilization at 100% but strangely low temperature.
gunnarre wrote:There are no Covid Moonshot WUs at the moment. Moonshot WUs have a higher PPD than other projects, especially on modest Nvidia GPUs, so I suspect that the lower PPD you're seeing might be because you're folding projects with less PPD on your card.
Did those run out, like, yesterday? Because looking at my PPD production over time - it's been fairly consistent at 2million until it dropped like a rock yesterday (a lot of that due to the bad WUs on project 13460, but even beyond those, performance has been much lower on newer drivers).
Veablo
Posts: 5
Joined: Sat Jan 15, 2022 2:19 pm

Re: 511.23 a bad driver?

Post by Veablo »

The drop from 3,579,108 to 1,845,936 happened while folding, no reboot.
XanderF
Posts: 42
Joined: Thu Aug 11, 2011 12:25 am

Re: 511.23 a bad driver?

Post by XanderF »

gunnarre wrote:Does your log say "Using CUDA" when it folds? If you got the driver from Windows rather than straight from Nvidia, you might have gotten a driver without CUDA support, so check that it's actually using CUDA in your log.
Log showed using CUDA, yes. I do notice that driver 511.23 introduces CUDA 11.6 - 511.09 (which seems to work okay for me) is CUDA 11.5 and the 496.49 driver I had been running up until yesterday and worked the best was CUDA 11.4. Wonder if there is an issue with 11.6...
Veablo
Posts: 5
Joined: Sat Jan 15, 2022 2:19 pm

Re: 511.23 a bad driver?

Post by Veablo »

Doing a clean reinstall seems to do the trick. Been stable for a good 30 min.

Update: Didn't last long. Back to high GPU usage and low temp. And TOTAL ESTIMATED PPD 1,695,519.
toTOW
Site Moderator
Posts: 6296
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France
Contact:

Re: 511.23 a bad driver?

Post by toTOW »

Is it related to another use of the GPU (web browser, video playback, ...)?

Are there any events in the Windows logs at the same time the slowdowns start?

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
Veablo
Posts: 5
Joined: Sat Jan 15, 2022 2:19 pm

Re: 511.23 a bad driver?

Post by Veablo »

I downgraded my driver to a non-DCH Game Ready driver and all is good now.
Version 472.12; it's a bit old, but it works.
gunnarre
Posts: 567
Joined: Sun May 24, 2020 7:23 pm
Location: Norway

Re: 511.23 a bad driver?

Post by gunnarre »

XanderF wrote: Did those run out, like, yesterday?
Yes, the researcher aborted the projects ahead of time due to the error in making the work units. Other information here seems to suggest that this wasn't the issue, though.
Online: GTX 1660 Super, GTX 1080, GTX 1050 Ti 4G OC, RX580 + occasional CPU folding in the cold.
Offline: Radeon HD 7770, GTX 960, GTX 950
toTOW
Site Moderator
Posts: 6296
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France
Contact:

Re: 511.23 a bad driver?

Post by toTOW »

Veablo, XanderF > Can we see some logs showing a part where PPD is normal and a part where PPD is reduced?

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: 511.23 a bad driver?

Post by Neil-B »

Could you also watch the clock speeds for the GPUs and report whether they change at all when this happens?
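One way to watch the clocks is to log them periodically with `nvidia-smi --query-gpu=clocks.sm,temperature.gpu,power.draw --format=csv -l 5` and then look for sudden drops. A minimal sketch of the parsing side, with invented sample readings standing in for real output (the helper name and drop threshold are my own choices, not anything from the thread):

```python
import csv
import io

def flag_clock_drops(csv_text: str, drop_ratio: float = 0.7):
    """Return the row indices where the SM clock falls below drop_ratio of
    the maximum observed clock -- a sign the GPU has downclocked mid-WU."""
    rows = list(csv.DictReader(io.StringIO(csv_text), skipinitialspace=True))
    clocks = [int(r["clocks.current.sm [MHz]"].split()[0]) for r in rows]
    peak = max(clocks)
    return [i for i, c in enumerate(clocks) if c < peak * drop_ratio]

# Invented sample in the CSV format produced by:
#   nvidia-smi --query-gpu=clocks.sm,temperature.gpu,power.draw --format=csv -l 5
sample = """clocks.current.sm [MHz], temperature.gpu, power.draw [W]
1905 MHz, 68, 215.3 W
1890 MHz, 67, 212.1 W
540 MHz, 41, 60.5 W
"""
print(flag_clock_drops(sample))  # [2]
```

A flagged row with low clocks, low temperature, and low power draw while the WU still reports ~100% utilization would match the symptom people are reporting here.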
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

(Green/Bold = Active)
tulanebarandgrill
Posts: 36
Joined: Thu Nov 16, 2017 2:57 pm

Re: 511.23 a bad driver?

Post by tulanebarandgrill »

I don't know what other people do, but I use the Studio drivers for Win 10, for a number of reasons. I have had a couple of runs with low PPD, but they ended up aborting (I posted this to a different thread, but the tail end of the log is below). I'd check for this in case it's the same thing: I noticed the same symptoms, clock rate really high, temp low, not much points action until it crashed. The next WU is working well.

15:38:24:WU00:FS00:0x22:Completed 175000 out of 2500000 steps (7%)
15:49:03:WU00:FS00:0x22:Watchdog triggered, requesting soft shutdown down
15:59:03:WU00:FS00:0x22:Watchdog shutdown failed, hard shutdown triggered
15:59:04:WARNING:WU00:FS00:FahCore returned an unknown error code which probably indicates that it crashed
15:59:04:WARNING:WU00:FS00:FahCore returned: WU_STALLED (127 = 0x7f)
15:59:04:WU00:FS00:Starting
- TulaneBaG
Post Reply