511.23 a bad driver?

It seems that a lot of GPU problems revolve around specific versions of drivers. Though NVidia has their own support structure, you can often learn from information reported by others who fold.


511.23 a bad driver?

Postby XanderF » Sat Jan 15, 2022 2:12 am

As above - just released today, and I updated to it (from 496.49). I've only had 4 projects since switching, but...well, my PPD on all of them is drastically lower than with the earlier driver (about half).

Curiously, although the GPU is showing the expected 100%-ish load, it's not getting hot - the temperature is also significantly lower than before (indeed, the fans are barely spinning).

Wondering if anyone has any different experience with this driver?
XanderF
 
Posts: 38
Joined: Thu Aug 11, 2011 1:25 am

Re: 511.23 a bad driver?

Postby Neil-B » Sat Jan 15, 2022 9:00 am

I'm seeing, if anything, a very small (sub-1%) upward shift in PPD from this latest driver update so far (on a sample of 4 WUs) ... RTX 3070, Win 11 (latest build)
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

Neil-B
 
Posts: 1980
Joined: Sun Mar 22, 2020 6:52 pm
Location: UK

Re: 511.23 a bad driver?

Postby gunnarre » Sat Jan 15, 2022 10:54 am

There are no Covid Moonshot WUs at the moment. Moonshot WUs have a higher PPD than other projects, especially on modest Nvidia GPUs, so I suspect that the lower PPD you're seeing might be because you're folding projects with less PPD on your card. Once more Moonshot sprints go live, you might see your PPD reach the same level.

Does your log say "Using CUDA" when it folds? If you got the driver from Windows rather than straight from Nvidia, you might have gotten a driver without CUDA support, so check that it's actually using CUDA in your log.

One other thing you might check is whether your power limit was changed when you upgraded the driver and restarted.
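If it helps, something like this checks both at once - a rough Python sketch, assuming the default FAHClient data directory on Windows and nvidia-smi on your PATH:

import os
import subprocess

# Assumed default log location on Windows; adjust if your data directory differs.
log_path = os.path.expandvars(r"%AppData%\FAHClient\log.txt")

# 1) Print every log line mentioning CUDA, so you can see whether the core
#    actually reports "Using CUDA" or has fallen back to something else.
with open(log_path, errors="replace") as f:
    for line in f:
        if "cuda" in line.lower():
            print(line.rstrip())

# 2) Compare the current power limit against the board default - a driver
#    upgrade can silently reset any limit you had set.
print(subprocess.run(
    ["nvidia-smi", "--query-gpu=power.limit,power.default_limit", "--format=csv"],
    capture_output=True, text=True).stdout)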
Online: GTX 1660 Super, GTX 1050 Ti, GTX 950 + occasional CPU folding in the cold.
Offline: GTX 960 (intermittent), Radeon HD 7770 (retired)
gunnarre
 
Posts: 502
Joined: Sun May 24, 2020 8:23 pm
Location: Norway

Re: 511.23 a bad driver?

Postby toTOW » Sat Jan 15, 2022 11:50 am

The 511.23 drivers seem a little faster (about 1 to 3 seconds less per frame) on all my GPUs than the drivers I had before (from November).
Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

FAH-Addict : latest news, tests and reviews about Folding@Home project.

User avatar
toTOW
Site Moderator
 
Posts: 5923
Joined: Sun Dec 02, 2007 11:38 am
Location: Bordeaux, France

Re: 511.23 a bad driver?

Postby Veablo » Sat Jan 15, 2022 3:24 pm

After updating to 511.23, GPU load is 100% (ish) but the temperature is around 40°C. ETA on the current WU is over 4 hours.
After a reboot, the temperature at 100% load is 68°C and the ETA on the current WU is down to 2 hours 25 min.
When the current WU is completed, the GPU goes cold at 100% load again. A new reboot fixes it...

Update:
It happens during folding also.
Went from TOTAL ESTIMATED PPD 3,579,108 to TOTAL ESTIMATED PPD 1,845,936 suddenly.

RTX-3060TI LHR
Veablo
 
Posts: 5
Joined: Sat Jan 15, 2022 3:19 pm

Re: 511.23 a bad driver?

Postby XanderF » Sat Jan 15, 2022 4:53 pm

Veablo wrote: Went from TOTAL ESTIMATED PPD 3,579,108 to TOTAL ESTIMATED PPD 1,845,936 suddenly.


Is that after the reboot, or before? Because that's the sort of thing I was seeing - MASSIVE drop. I'm on a regular 3060, though, so my numbers were more like 2 million PPD before and then down to under 750k. And same with the utilization at 100% but strangely low temperature.

gunnarre wrote: There are no Covid Moonshot WUs at the moment. Moonshot WUs have a higher PPD than other projects, especially on modest Nvidia GPUs, so I suspect that the lower PPD you're seeing might be because you're folding projects with less PPD on your card.


Did those run out, like, yesterday? Because looking at my PPD production over time - it had been fairly consistent at 2 million until it dropped like a rock yesterday (a lot of that was due to the bad WUs on project 13460, but even beyond those, performance has been much lower on the newer drivers).
XanderF
 
Posts: 38
Joined: Thu Aug 11, 2011 1:25 am

Re: 511.23 a bad driver?

Postby Veablo » Sat Jan 15, 2022 5:07 pm

The drop from 3,579,108 to 1,845,936 happened while folding, no reboot.
Veablo
 
Posts: 5
Joined: Sat Jan 15, 2022 3:19 pm

Re: 511.23 a bad driver?

Postby XanderF » Sat Jan 15, 2022 5:10 pm

gunnarre wrote: Does your log say "Using CUDA" when it folds? If you got the driver from Windows rather than straight from Nvidia, you might have gotten a driver without CUDA support, so check that it's actually using CUDA in your log.


Log showed using CUDA, yes. I do notice that driver 511.23 introduces CUDA 11.6 - 511.09 (which seems to work okay for me) is CUDA 11.5, and the 496.49 driver I had been running until yesterday, which worked best, was CUDA 11.4. I wonder if there's an issue with 11.6...
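If anyone else wants to confirm which driver/CUDA pairing they're on, a quick Python sketch like this should work - it just parses the nvidia-smi banner, which is where the bundled CUDA version is reported:

import re
import subprocess

# The driver and its bundled CUDA version both appear in nvidia-smi's header line.
out = subprocess.run(["nvidia-smi"], capture_output=True, text=True).stdout
m = re.search(r"Driver Version:\s*([\d.]+).*CUDA Version:\s*([\d.]+)", out)
if m:
    print(f"Driver {m.group(1)} -> CUDA {m.group(2)}")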
XanderF
 
Posts: 38
Joined: Thu Aug 11, 2011 1:25 am

Re: 511.23 a bad driver?

Postby Veablo » Sat Jan 15, 2022 6:03 pm

Doing a clean reinstall seems to do the trick. It's been stable for a good 30 min.

Update: Didn't last long. Back to high GPU usage and low temp, and TOTAL ESTIMATED PPD 1,695,519.
Veablo
 
Posts: 5
Joined: Sat Jan 15, 2022 3:19 pm

Re: 511.23 a bad driver?

Postby toTOW » Sat Jan 15, 2022 11:21 pm

Is it related to another use of the GPU (web browser, video playback, ...)?

Are there any events logged in the Windows logs at the time the slowdowns start?
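Something like this should pull the most recent display driver events - a rough Python sketch; wevtutil ships with Windows, and nvlddmkm is the Nvidia kernel-mode driver's provider name:

import subprocess

# Query the last 20 Windows System events raised by the Nvidia display driver,
# newest first, in plain text.
cmd = ["wevtutil", "qe", "System", "/c:20", "/rd:true", "/f:text",
       "/q:*[System[Provider[@Name='nvlddmkm']]]"]
print(subprocess.run(cmd, capture_output=True, text=True).stdout)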
User avatar
toTOW
Site Moderator
 
Posts: 5923
Joined: Sun Dec 02, 2007 11:38 am
Location: Bordeaux, France

Re: 511.23 a bad driver?

Postby Veablo » Sun Jan 16, 2022 1:10 pm

I downgraded my driver to a Game Ready non-DCH driver and all is good now.
Version 472.12 - it's a bit old, but it works.
Veablo
 
Posts: 5
Joined: Sat Jan 15, 2022 3:19 pm

Re: 511.23 a bad driver?

Postby gunnarre » Sun Jan 16, 2022 7:28 pm

XanderF wrote: Did those run out, like, yesterday?

Yes, the researcher aborted the projects ahead of time due to an error in making the work units. Other information here seems to suggest that this wasn't the issue, though.
gunnarre
 
Posts: 502
Joined: Sun May 24, 2020 8:23 pm
Location: Norway

Re: 511.23 a bad driver?

Postby toTOW » Mon Jan 17, 2022 10:32 am

Veablo, XanderF > Can we see some logs showing a part where PPD is normal and a part where PPD is reduced?
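If it's easier than posting whole logs, a small script like this should show where the frame times jump - a sketch that diffs the timestamps of consecutive "Completed ... steps" lines in log.txt (default Windows log location assumed):

import os
import re
from datetime import datetime, timedelta

log_path = os.path.expandvars(r"%AppData%\FAHClient\log.txt")  # adjust if needed
pat = re.compile(r"^(\d\d:\d\d:\d\d):(WU\d+).*Completed (\d+) out of (\d+) steps")

last = {}  # per-WU: (timestamp, step count)
with open(log_path, errors="replace") as f:
    for line in f:
        m = pat.match(line)
        if not m:
            continue
        t = datetime.strptime(m.group(1), "%H:%M:%S")
        wu, step, total = m.group(2), int(m.group(3)), int(m.group(4))
        if wu in last and step > last[wu][1]:
            dt = (t - last[wu][0]) % timedelta(days=1)  # survive midnight rollover
            frames = (step - last[wu][1]) / (total / 100)  # one frame = 1% of steps
            print(f"{m.group(1)} {wu}: {dt.total_seconds() / frames:.1f} s/frame")
        last[wu] = (t, step)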
User avatar
toTOW
Site Moderator
 
Posts: 5923
Joined: Sun Dec 02, 2007 11:38 am
Location: Bordeaux, France

Re: 511.23 a bad driver?

Postby Neil-B » Mon Jan 17, 2022 11:45 am

Could you also watch the clock speeds of the GPUs and report whether they change at all when this happens?
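nvidia-smi can log these alongside temperature and load by itself - a minimal sketch, polling every 5 seconds:

import subprocess

# Poll SM/memory clocks, temperature, load and power draw every 5 seconds; the
# failure mode above should show up as clocks dropping while load stays at 100%.
subprocess.run([
    "nvidia-smi",
    "--query-gpu=timestamp,clocks.sm,clocks.mem,temperature.gpu,utilization.gpu,power.draw",
    "--format=csv", "-l", "5",
])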
Neil-B
 
Posts: 1980
Joined: Sun Mar 22, 2020 6:52 pm
Location: UK

Re: 511.23 a bad driver?

Postby tulanebarandgrill » Mon Jan 17, 2022 5:28 pm

I don't know what other people do, but I use the Studio drivers for Win 10, for a number of reasons. I have had a couple of runs with low PPD, but they ended up aborting (I posted this to a different thread, but the tail end of the log is below). I'd check for this in case it's the same thing: I noticed the same symptoms - clock rate really high, temp low, not much happening - until it crashed. The next WU is working well.

15:38:24:WU00:FS00:0x22:Completed 175000 out of 2500000 steps (7%)
15:49:03:WU00:FS00:0x22:Watchdog triggered, requesting soft shutdown down
15:59:03:WU00:FS00:0x22:Watchdog shutdown failed, hard shutdown triggered
15:59:04:WARNING:WU00:FS00:FahCore returned an unknown error code which probably indicates that it crashed
15:59:04:WARNING:WU00:FS00:FahCore returned: WU_STALLED (127 = 0x7f)
15:59:04:WU00:FS00:Starting
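
If you want to check your own logs for this without scrolling through everything, a quick Python sketch like this flags the relevant lines (default Windows log location assumed):

import os

# Flag watchdog and stalled-WU events in the FAHClient log.
log_path = os.path.expandvars(r"%AppData%\FAHClient\log.txt")  # adjust if needed
markers = ("Watchdog", "WU_STALLED", "unknown error code")
with open(log_path, errors="replace") as f:
    for line in f:
        if any(marker in line for marker in markers):
            print(line.rstrip())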
- TulaneBaG
tulanebarandgrill
 
Posts: 36
Joined: Thu Nov 16, 2017 3:57 pm
