Crashes on FahCore_22


Familyman_19
Posts: 17
Joined: Sat Jul 18, 2020 2:20 am

Crashes on FahCore_22

Post by Familyman_19 »

I seem to be getting random crashes on the GPU side of things. It happens roughly every third day: I get a pop-up saying that FahCore_22 has crashed. The core doesn't restart until I close the pop-up, which in the most recent case was 12 hours later. Below is where the log shows the crash; it has been the same error each time. Is there anything I can do in these instances, or is it just something to live with?


23:46:25:WU00:FS01:0x22:An exception occurred at step 1053057: Error invoking kernel: CUDA_ERROR_ILLEGAL_ADDRESS (700)
23:46:25:WU00:FS01:0x22:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
23:46:25:WU00:FS01:0x22:Folding@home Core Shutdown: CORE_RESTART
******************************* Date: 2020-10-23 *******************************
******************************* Date: 2020-10-23 *******************************
12:09:17:WARNING:WU00:FS01:FahCore returned an unknown error code which probably indicates that it crashed
12:09:17:WARNING:WU00:FS01:FahCore returned: UNKNOWN_ENUM (-1073740791 = 0xc0000409)
12:09:17:WU00:FS01:Starting
12:09:17:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\Mike\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/win/64bit/22-0.0.13/Core_22.fah/FahCore_22.exe -dir 00 -suffix 01 -version 706 -lifeline 4304 -checkpoint 15 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
12:09:17:WU00:FS01:Started FahCore on PID 2976
12:09:17:WU00:FS01:Core PID:8804
12:09:17:WU00:FS01:FahCore 0x22 started
12:09:17:WU00:FS01:0x22:*********************** Log Started 2020-10-23T12:09:17Z ***********************
12:09:17:WU00:FS01:0x22:*************************** Core22 Folding@home Core ***************************
12:09:17:WU00:FS01:0x22:       Core: Core22
12:09:17:WU00:FS01:0x22:       Type: 0x22
12:09:17:WU00:FS01:0x22:    Version: 0.0.13
12:09:17:WU00:FS01:0x22:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
12:09:17:WU00:FS01:0x22:  Copyright: 2020 foldingathome.org
12:09:17:WU00:FS01:0x22:   Homepage: https://foldingathome.org/
12:09:17:WU00:FS01:0x22:       Date: Sep 19 2020
12:09:17:WU00:FS01:0x22:       Time: 02:35:58
12:09:17:WU00:FS01:0x22:   Revision: 571cf95de6de2c592c7c3ed48fcfb2e33e9ea7d3
12:09:17:WU00:FS01:0x22:     Branch: core22-0.0.13
12:09:17:WU00:FS01:0x22:   Compiler: Visual C++ 2015
12:09:17:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
12:09:17:WU00:FS01:0x22:             -DOPENMM_GIT_HASH="\"189320d0\""
12:09:17:WU00:FS01:0x22:   Platform: win32 10
12:09:17:WU00:FS01:0x22:       Bits: 64
12:09:17:WU00:FS01:0x22:       Mode: Release
12:09:17:WU00:FS01:0x22:Maintainers: John Chodera <john.chodera@choderalab.org> and Peter Eastman
12:09:17:WU00:FS01:0x22:             <peastman@stanford.edu>
12:09:17:WU00:FS01:0x22:       Args: -dir 00 -suffix 01 -version 706 -lifeline 2976 -checkpoint 15
12:09:17:WU00:FS01:0x22:             -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device
12:09:17:WU00:FS01:0x22:             0 -gpu 0
12:09:17:WU00:FS01:0x22:************************************ libFAH ************************************
12:09:17:WU00:FS01:0x22:       Date: Sep 7 2020
12:09:17:WU00:FS01:0x22:       Time: 19:09:56
12:09:17:WU00:FS01:0x22:   Revision: 44301ed97b996b63fe736bb8073f22209cb2b603
12:09:17:WU00:FS01:0x22:     Branch: HEAD
12:09:17:WU00:FS01:0x22:   Compiler: Visual C++ 2015
12:09:17:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
12:09:17:WU00:FS01:0x22:   Platform: win32 10
12:09:17:WU00:FS01:0x22:       Bits: 64
12:09:17:WU00:FS01:0x22:       Mode: Release
12:09:17:WU00:FS01:0x22:************************************ CBang *************************************
12:09:17:WU00:FS01:0x22:       Date: Sep 7 2020
12:09:17:WU00:FS01:0x22:       Time: 19:08:30
12:09:17:WU00:FS01:0x22:   Revision: 33fcfc2b3ed2195a423606a264718e31e6b3903f
12:09:17:WU00:FS01:0x22:     Branch: HEAD
12:09:17:WU00:FS01:0x22:   Compiler: Visual C++ 2015
12:09:17:WU00:FS01:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /O2 /Ob3 /Zc:throwingNew /MT
12:09:17:WU00:FS01:0x22:   Platform: win32 10
12:09:17:WU00:FS01:0x22:       Bits: 64
12:09:17:WU00:FS01:0x22:       Mode: Release
12:09:17:WU00:FS01:0x22:************************************ System ************************************
12:09:17:WU00:FS01:0x22:        CPU: Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
12:09:17:WU00:FS01:0x22:     CPU ID: GenuineIntel Family 6 Model 26 Stepping 4
12:09:17:WU00:FS01:0x22:       CPUs: 8
12:09:17:WU00:FS01:0x22:     Memory: 23.99GiB
12:09:17:WU00:FS01:0x22:Free Memory: 19.41GiB
12:09:17:WU00:FS01:0x22:    Threads: WINDOWS_THREADS
12:09:17:WU00:FS01:0x22: OS Version: 6.2
12:09:17:WU00:FS01:0x22:Has Battery: false
12:09:17:WU00:FS01:0x22: On Battery: false
12:09:17:WU00:FS01:0x22: UTC Offset: -4
12:09:17:WU00:FS01:0x22:        PID: 8804
12:09:17:WU00:FS01:0x22:        CWD: C:\Users\Mike\AppData\Roaming\FAHClient\work
12:09:17:WU00:FS01:0x22:************************************ OpenMM ************************************
12:09:17:WU00:FS01:0x22:   Revision: 189320d0
12:09:17:WU00:FS01:0x22:********************************************************************************
12:09:17:WU00:FS01:0x22:Project: 17309 (Run 0, Clone 6791, Gen 0)
12:09:17:WU00:FS01:0x22:Unit: 0x0000000012bc7d9a5f91cc5ca4ca0346
12:09:17:WU00:FS01:0x22:Digital signatures verified
12:09:17:WU00:FS01:0x22:Folding@home GPU Core22 Folding@home Core
12:09:17:WU00:FS01:0x22:Version 0.0.13
12:09:17:WU00:FS01:0x22:  Checkpoint write interval: 62500 steps (5%) [20 total]
12:09:17:WU00:FS01:0x22:  JSON viewer frame write interval: 12500 steps (1%) [100 total]
12:09:17:WU00:FS01:0x22:  XTC frame write interval: 125000 steps (10%) [10 total]
12:09:17:WU00:FS01:0x22:  Global context and integrator variables write interval: disabled
12:09:17:WU00:FS01:0x22:There are 4 platforms available.
12:09:17:WU00:FS01:0x22:Platform 0: Reference
12:09:17:WU00:FS01:0x22:Platform 1: CPU
12:09:17:WU00:FS01:0x22:Platform 2: OpenCL
12:09:17:WU00:FS01:0x22:  opencl-device 0 specified
12:09:17:WU00:FS01:0x22:Platform 3: CUDA
12:09:17:WU00:FS01:0x22:  cuda-device 0 specified
12:09:41:WU00:FS01:0x22:Attempting to create CUDA context:
12:09:41:WU00:FS01:0x22:  Configuring platform CUDA
12:09:48:WU00:FS01:0x22:  Using CUDA and gpu 0
12:09:48:WU00:FS01:0x22:Completed 1000000 out of 1250000 steps (80%)
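For anyone who wants to spot these events in a long log without scrolling, here is a minimal Python sketch. The two regular expressions are assumptions based only on the lines quoted above, and `scan_log` and `to_unsigned32` are hypothetical helper names, not part of FAHClient. It also shows how the negative exit code maps to the hex value printed next to it: -1073740791 reinterpreted as an unsigned 32-bit number is 0xC0000409, which Windows lists as the NTSTATUS value STATUS_STACK_BUFFER_OVERRUN.

```python
import re

# Patterns are assumptions derived from the two log lines quoted in the post.
EXCEPTION_RE = re.compile(r"An exception occurred at step (\d+): (.+)$")
RETURNED_RE = re.compile(r"FahCore returned: \w+ \((-?\d+) = (0x[0-9a-fA-F]+)\)")


def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned value."""
    return code & 0xFFFFFFFF


def scan_log(lines):
    """Collect core exceptions and crash exit codes from FAHClient log lines."""
    events = []
    for line in lines:
        match = EXCEPTION_RE.search(line)
        if match:
            events.append(("exception", int(match.group(1)), match.group(2).strip()))
            continue
        match = RETURNED_RE.search(line)
        if match:
            events.append(("exit_code", hex(to_unsigned32(int(match.group(1))))))
    return events


# The two relevant lines, copied from the log above.
sample = [
    "23:46:25:WU00:FS01:0x22:An exception occurred at step 1053057: "
    "Error invoking kernel: CUDA_ERROR_ILLEGAL_ADDRESS (700)",
    "12:09:17:WARNING:WU00:FS01:FahCore returned: UNKNOWN_ENUM "
    "(-1073740791 = 0xc0000409)",
]

events = scan_log(sample)
for event in events:
    print(event)
```

To run it against a real log, pass an open file instead of `sample`, for example `scan_log(open("log.txt"))`; the file name here is only a placeholder.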
Joe_H
Site Admin
Posts: 7870
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Crashes on FahCore_22

Post by Joe_H »

If you haven't cleaned dust out of your system recently, start with that. If your GPU is overclocked, and this includes factory overclocking, try reducing the clock a bit or running at reference speeds for your card.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: Crashes on FahCore_22

Post by foldy »

FahCore_22 recently switched to CUDA, so you can try going back to OpenCL by adding this extra core option: -disable-cuda
PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: Crashes on FahCore_22

Post by PantherX »

Also, do you have sufficient VRAM and are any other applications using VRAM on your system?
Familyman_19
Posts: 17
Joined: Sat Jul 18, 2020 2:20 am

Re: Crashes on FahCore_22

Post by Familyman_19 »

Joe_H wrote: If you haven't cleaned dust out of your system recently, start with that. If your GPU is overclocked, and this includes factory overclocking, try reducing the clock a bit or running at reference speeds for your card.
I do have an overclock on the GPU. I dropped it by 25 MHz, so we'll see if that helps. It didn't show any issues until the switch to CUDA, but maybe that was enough to push it over the edge.
Familyman_19
Posts: 17
Joined: Sat Jul 18, 2020 2:20 am

Re: Crashes on FahCore_22

Post by Familyman_19 »

PantherX wrote:Also, do you have sufficient VRAM and are any other applications using VRAM on your system?
I have the 6GB 1060, and during folding I am currently sitting at around 10% utilization. It peaked at 30% in the last 15 hours, but I did some gaming last night. I typically pause FAH while I game. The crashes have typically occurred when I'm not using the PC for anything other than folding.
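If it helps to check VRAM numbers outside of a monitoring tool, here is a rough Python sketch that reads used and total memory from `nvidia-smi` and turns them into a percentage. It assumes `nvidia-smi` is on the PATH; `parse_vram` and `read_vram` are hypothetical helper names, and the sample figures are made up to resemble roughly 10% usage on a 6 GB card rather than measured on this system.

```python
import subprocess

# nvidia-smi query that prints one "used, total" row (in MiB) per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_vram(csv_text):
    """Parse 'used, total' rows into (used_mib, total_mib, percent_used) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total, round(100.0 * used / total, 1)))
    return rows


def read_vram():
    """Run nvidia-smi and return parsed VRAM usage (requires an NVIDIA driver)."""
    output = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_vram(output)


# Hypothetical sample row, not taken from the poster's machine.
example = parse_vram("614, 6144\n")
print(example)
```

Calling `read_vram()` every few minutes while folding, and again with other applications open, would show whether anything else is competing for VRAM.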
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Crashes on FahCore_22

Post by bruce »

CUDA processes more work per unit time, so it can push an overclock "over the line." It should be noted that FAH tends to exceed the utilization rates that you get when you run conventional overclocking benchmarks. 100% means different things to different portions of your CPU, which is why FAH officially does not support overclocking. You're entirely on your own when new software comes out and we manage to increase the throughput.
AnClar
Posts: 11
Joined: Sun Mar 29, 2020 3:59 am

Re: Crashes on FahCore_22

Post by AnClar »

I've had one random FahCore_22 crash since the switchover from OpenCL to CUDA. So far it hasn't recurred, but I'll keep an eye out for any more. My folding system is a Gen1 Core i7 with an Nvidia GTX 970 graphics card. I've been folding with the same kit for eight months now, error-free. This may have just been a random glitch related to a particular WU. We'll see.
PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am

Re: Crashes on FahCore_22

Post by PantherX »

Occasionally, you may get a bad WU and there's nothing that you can do about it: viewtopic.php?f=19&t=16526