core22 0.0.10 released to full FAH!

If you think it might be a driver problem, see viewforum.php?f=79

Moderators: Site Moderators, FAHC Science Team

emf
Posts: 16
Joined: Tue Apr 21, 2020 1:35 pm

Re: core22 0.0.10 released to full FAH!

Post by emf »

Yeah, my box got killed a few days ago by the same thing. Ubuntu auto-upgraded the NVIDIA drivers from 440.82 to 440.110, so the userspace libraries no longer matched the driver running in the kernel, and it didn't work until a reboot.
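A quick way to confirm this kind of mismatch without rebooting first is to compare the version reported by the loaded kernel module with the version suffix on the installed userspace library. The sketch below is only an illustration: it assumes the kernel module reports itself on an "NVRM version:" line (as in /proc/driver/nvidia/version on Linux) and that the library file name ends in the driver version (for example libcuda.so.440.82). The function names are made up for this example.

```python
import re
from typing import Optional

# Dotted driver version such as 440.82 or 470.57.02 (assumed three-digit major).
_VERSION = re.compile(r"\b(\d{3}\.\d+(?:\.\d+)?)\b")


def kernel_module_version(proc_text: str) -> Optional[str]:
    """Return the version found on the 'NVRM version:' line, if any."""
    for line in proc_text.splitlines():
        if line.startswith("NVRM version:"):
            match = _VERSION.search(line)
            return match.group(1) if match else None
    return None


def library_version(filename: str) -> Optional[str]:
    """Return the version suffix of a name like libcuda.so.440.82, if any."""
    match = re.search(r"\.so\.(\d{3}\.\d+(?:\.\d+)?)$", filename)
    return match.group(1) if match else None


def versions_match(proc_text: str, filename: str) -> bool:
    """True only when both versions were found and are identical."""
    kernel = kernel_module_version(proc_text)
    user = library_version(filename)
    return kernel is not None and kernel == user
```

Feed versions_match the text of /proc/driver/nvidia/version and the name of the installed library file. A False result after an unattended upgrade would be consistent with the situation described above, where a reboot reloads the kernel module at the new version.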
Crunchtimer
Posts: 50
Joined: Tue May 05, 2020 5:34 am

Re: core22 0.0.10 released to full FAH!

Post by Crunchtimer »

tomc001 wrote:
ajm wrote (Sun Jun 28, 2020 10:28 am):

It should be automatic. You can check in your log whether it's been done, a bit after that line
Thanks. My log shows Core22 so it did indeed update with no intervention from me.

Interesting that the Version: is a mix of 0.0.10 and 0.0.11.
Yes, and I don't think the Core22 upgrades are causing any problems.
I got confused by what seems to be an Ubuntu auto-upgrade of the NVIDIA driver.

I believe I have been running only Core22 v0.0.11 since 2020-07-01, around noon UTC, and everything is working fine.
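To check which core version a log actually reports, it can help to group the "Version" lines by the banner they appear under, since the same log also carries a version for OpenMM. The following is a rough sketch, assuming the line layout seen in Core22 logs posted in this thread (a ":0x22:" tag, asterisk banners naming each section, and either "Version: x.y.z" or a bare "Version x.y.z" line). The function name and the "(body)" label are invented for the example.

```python
import re
from collections import defaultdict

# Banner such as '****** OpenMM ******' following the ':0x22:' core tag.
_BANNER = re.compile(r":0x22:\*+ (.+?) \*+$")
# A line of asterisks only, which closes the header blocks.
_RULE = re.compile(r":0x22:\*+$")
# 'Version: 0.0.18' in a header block, or the bare 'Version 0.0.18' line.
_VERSION = re.compile(r":0x22:\s*Version:? (\d+(?:\.\d+)+)$")


def versions_by_section(log_lines):
    """Map each banner section to the set of version strings seen under it."""
    found = defaultdict(set)
    section = "(body)"
    for line in log_lines:
        line = line.rstrip()
        banner = _BANNER.search(line)
        if banner:
            section = banner.group(1)
            continue
        if _RULE.search(line):
            section = "(body)"  # header blocks are over
            continue
        version = _VERSION.search(line)
        if version:
            found[section].add(version.group(1))
    return dict(found)
```

If the core header and the body line report different versions, the two sets will differ, which would show up as the kind of 0.0.10 / 0.0.11 mix mentioned in the quote.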
Ichbin3
Posts: 96
Joined: Thu May 28, 2020 8:06 am
Hardware configuration: MSI H81M, G3240, RTX 2080Ti_Rev-A@220W, Ubuntu 18.04
Location: Germany

Re: core22 0.0.10 released to full FAH!

Post by Ichbin3 »

JohnChodera wrote:Anyone else having this issue?
I got similar messages after waking from hibernation in Linux Mint.
Killing the core didn't do the job; I had to reboot.
It looks like a driver issue too.
MSI H81M, G3240, RTX 2080Ti_Rev-A@220W, Ubuntu 18.04
tulanebarandgrill
Posts: 36
Joined: Thu Nov 16, 2017 2:57 pm

Re: core22 0.0.10 released to full FAH!

Post by tulanebarandgrill »

Seeing the same here

During the period between the watchdog trigger and the hard shutdown, I noticed the GPU was running at a high clock rate but at a very low temperature, and the folds were completing in about 40 seconds. This is not normal: with a working WU, the clock rate is about 2 to 2.5% lower and the temperature is in the 70s (°C) rather than the 50s.

2080 Ti, 5960X CPU; the CPU runs at around 15% load during a well-behaved WU.

15:33:10:WU00:FS00:0x22:*********************** Log Started 2022-01-17T15:33:10Z ***********************
15:33:10:WU00:FS00:0x22:*************************** Core22 Folding@home Core ***************************
15:33:10:WU00:FS00:0x22:       Core: Core22
15:33:10:WU00:FS00:0x22:       Type: 0x22
15:33:10:WU00:FS00:0x22:    Version: 0.0.18
15:33:10:WU00:FS00:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:33:10:WU00:FS00:0x22:  Copyright: 2020 foldingathome.org
15:33:10:WU00:FS00:0x22:   Homepage: https://foldingathome.org/
15:33:10:WU00:FS00:0x22:       Date: Sep 28 2021
15:33:10:WU00:FS00:0x22:       Time: 05:55:05
15:33:10:WU00:FS00:0x22:   Revision: cfe3d7d990e8f456e371f8ce63b5fcc6daab2103
15:33:10:WU00:FS00:0x22:     Branch: HEAD
15:33:10:WU00:FS00:0x22:   Compiler: Visual C++
15:33:10:WU00:FS00:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
15:33:10:WU00:FS00:0x22:             -DOPENMM_VERSION="\"7.6.0\""
15:33:10:WU00:FS00:0x22:   Platform: win32 10
15:33:10:WU00:FS00:0x22:       Bits: 64
15:33:10:WU00:FS00:0x22:       Mode: Release
15:33:10:WU00:FS00:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
15:33:10:WU00:FS00:0x22:             <peastman@stanford.edu>
15:33:10:WU00:FS00:0x22:       Args: -dir 00 -suffix 01 -version 705 -lifeline 2940 -checkpoint 15
15:33:10:WU00:FS00:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
15:33:10:WU00:FS00:0x22:             0 -gpu 0 verbosity 9
15:33:10:WU00:FS00:0x22:************************************ libFAH ************************************
15:33:10:WU00:FS00:0x22:       Date: Sep 28 2021
15:33:10:WU00:FS00:0x22:       Time: 05:53:43
15:33:10:WU00:FS00:0x22:   Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
15:33:10:WU00:FS00:0x22:     Branch: HEAD
15:33:10:WU00:FS00:0x22:   Compiler: Visual C++
15:33:10:WU00:FS00:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
15:33:10:WU00:FS00:0x22:   Platform: win32 10
15:33:10:WU00:FS00:0x22:       Bits: 64
15:33:10:WU00:FS00:0x22:       Mode: Release
15:33:10:WU00:FS00:0x22:************************************ CBang *************************************
15:33:10:WU00:FS00:0x22:       Date: Sep 28 2021
15:33:10:WU00:FS00:0x22:       Time: 05:52:38
15:33:10:WU00:FS00:0x22:   Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
15:33:10:WU00:FS00:0x22:     Branch: HEAD
15:33:10:WU00:FS00:0x22:   Compiler: Visual C++
15:33:10:WU00:FS00:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
15:33:10:WU00:FS00:0x22:   Platform: win32 10
15:33:10:WU00:FS00:0x22:       Bits: 64
15:33:10:WU00:FS00:0x22:       Mode: Release
15:33:10:WU00:FS00:0x22:************************************ System ************************************
15:33:10:WU00:FS00:0x22:        CPU: Intel(R) Core(TM) i7-5960X CPU @ 3.00GHz
15:33:10:WU00:FS00:0x22:     CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
15:33:10:WU00:FS00:0x22:       CPUs: 16
15:33:10:WU00:FS00:0x22:     Memory: 63.91GiB
15:33:10:WU00:FS00:0x22:Free Memory: 54.94GiB
15:33:10:WU00:FS00:0x22:    Threads: WINDOWS_THREADS
15:33:10:WU00:FS00:0x22: OS Version: 6.2
15:33:10:WU00:FS00:0x22:Has Battery: false
15:33:10:WU00:FS00:0x22: On Battery: false
15:33:10:WU00:FS00:0x22: UTC Offset: -6
15:33:10:WU00:FS00:0x22:        PID: 14560
15:33:10:WU00:FS00:0x22:        CWD: C:\Users\pt\AppData\Roaming\FAHClient\work
15:33:10:WU00:FS00:0x22:************************************ OpenMM ************************************
15:33:10:WU00:FS00:0x22:    Version: 7.6.0
15:33:10:WU00:FS00:0x22:********************************************************************************
15:33:10:WU00:FS00:0x22:Project: 17804 (Run 41, Clone 89, Gen 821)
15:33:10:WU00:FS00:0x22:Unit: 0x00000000000000000000000000000000
15:33:10:WU00:FS00:0x22:Reading tar file core.xml
15:33:10:WU00:FS00:0x22:Reading tar file integrator.xml.bz2
15:33:10:WU00:FS00:0x22:Reading tar file state.xml.bz2
15:33:10:WU00:FS00:0x22:Reading tar file system.xml.bz2
15:33:10:WU00:FS00:0x22:Digital signatures verified
15:33:10:WU00:FS00:0x22:Folding@home GPU Core22 Folding@home Core
15:33:10:WU00:FS00:0x22:Version 0.0.18
15:33:10:WU00:FS00:0x22:  Checkpoint write interval: 125000 steps (5%) [20 total]
15:33:10:WU00:FS00:0x22:  JSON viewer frame write interval: 25000 steps (1%) [100 total]
15:33:10:WU00:FS00:0x22:  XTC frame write interval: 25000 steps (1%) [100 total]
15:33:10:WU00:FS00:0x22:  Global context and integrator variables write interval: disabled
15:33:10:WU00:FS00:0x22:There are 4 platforms available.
15:33:10:WU00:FS00:0x22:Platform 0: Reference
15:33:10:WU00:FS00:0x22:Platform 1: CPU
15:33:10:WU00:FS00:0x22:Platform 2: OpenCL
15:33:10:WU00:FS00:0x22:  opencl-device 0 specified
15:33:10:WU00:FS00:0x22:Platform 3: CUDA
15:33:10:WU00:FS00:0x22:  cuda-device 0 specified
15:33:15:WU01:FS00:Upload 55.88%
15:33:17:WU00:FS00:0x22:Attempting to create CUDA context:
15:33:17:WU00:FS00:0x22:  Configuring platform CUDA
15:33:23:WU00:FS00:0x22:  Using CUDA and gpu 0
15:33:23:WU00:FS00:0x22:Completed 0 out of 2500000 steps (0%)
15:33:23:WU01:FS00:Upload complete
15:33:23:WU01:FS00:Server responded WORK_ACK (400)
15:33:23:WU01:FS00:Final credit estimate, 241420.00 points
15:33:23:WU01:FS00:Cleaning up
15:33:23:WU00:FS00:0x22:Checkpoint completed at step 0
15:34:06:WU00:FS00:0x22:Completed 25000 out of 2500000 steps (1%)
15:34:49:WU00:FS00:0x22:Completed 50000 out of 2500000 steps (2%)
15:35:32:WU00:FS00:0x22:Completed 75000 out of 2500000 steps (3%)
15:36:15:WU00:FS00:0x22:Completed 100000 out of 2500000 steps (4%)
15:36:58:WU00:FS00:0x22:Completed 125000 out of 2500000 steps (5%)
15:36:59:WU00:FS00:0x22:Checkpoint completed at step 125000
15:37:41:WU00:FS00:0x22:Completed 150000 out of 2500000 steps (6%)
15:38:24:WU00:FS00:0x22:Completed 175000 out of 2500000 steps (7%)
15:49:03:WU00:FS00:0x22:Watchdog triggered, requesting soft shutdown down
15:59:03:WU00:FS00:0x22:Watchdog shutdown failed, hard shutdown triggered
15:59:04:WARNING:WU00:FS00:FahCore returned an unknown error code which probably indicates that it crashed
15:59:04:WARNING:WU00:FS00:FahCore returned: WU_STALLED (127 = 0x7f)
15:59:04:WU00:FS00:Starting
- TulaneBaG
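In the log above, progress lines arrive every 43 seconds up to 7% at 15:38:24, and the watchdog fires at 15:49:03, which is 639 seconds after the last progress line. The sketch below shows one way to pull those numbers out of a log automatically. It assumes a single work unit, the "HH:MM:SS:WUnn:FSnn:0x22:" prefix seen above, and at most one midnight rollover between consecutive lines. The function names are hypothetical.

```python
import re
from datetime import datetime, timedelta

_PROGRESS = re.compile(
    r"^(\d\d:\d\d:\d\d):WU\d+:FS\d+:0x22:Completed (\d+) out of (\d+) steps"
)
_WATCHDOG = re.compile(r"^(\d\d:\d\d:\d\d):WU\d+:FS\d+:0x22:Watchdog triggered")


def _clock(text):
    """Parse the HH:MM:SS prefix (the log lines carry no date)."""
    return datetime.strptime(text, "%H:%M:%S")


def _elapsed(start, end):
    """Seconds from start to end, allowing for one midnight rollover."""
    delta = end - start
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return int(delta.total_seconds())


def summarize(log_lines):
    """Return (gaps between progress lines, silence before the watchdog)."""
    intervals = []
    last_progress = None
    stall = None
    for line in log_lines:
        progress = _PROGRESS.match(line)
        if progress:
            now = _clock(progress.group(1))
            if last_progress is not None:
                intervals.append(_elapsed(last_progress, now))
            last_progress = now
            continue
        watchdog = _WATCHDOG.match(line)
        if watchdog and last_progress is not None:
            stall = _elapsed(last_progress, _clock(watchdog.group(1)))
    return intervals, stall
```

A stall value far larger than the typical interval, as here, suggests the core stopped making progress well before the watchdog noticed.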
XanderF
Posts: 42
Joined: Thu Aug 11, 2011 12:25 am

Re: core22 0.0.10 released to full FAH!

Post by XanderF »

Does this require a particular CUDA version for NVIDIA GPUs?
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: core22 0.0.10 released to full FAH!

Post by Neil-B »

tulanebarandgrill wrote:Seeing the same here

You have posted in a two-year-old thread about a core that hasn't been used for ages. Did you mean to?

A number of issues with the latest NVIDIA drivers appear to affect some setups. There are threads about these, and one of those may be a better place to post for assistance.
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070
