Won't start folding

Moderators: Site Moderators, FAHC Science Team

xXCapAwesomeXx
Posts: 5
Joined: Mon Mar 23, 2020 9:42 pm

Won't start folding

Post by xXCapAwesomeXx »

I am having problems with my folding client. When I launch FAH Control, it says that my client has a WU and is running/ready, but it has sat at 0% with 24 hrs to complete for the past day and a half. I looked through the log and couldn't figure out exactly what was wrong. I was going to try a reinstall, but I have a different error that prevents me from uninstalling anything; that is a matter for a different forum. Any advice is much appreciated.

Here is the Log

Code: Select all

23:45:46:WU01:FS00:Starting
23:45:46:WU01:FS00:Removing old file './work/01/logfile_01-20200406-231345.txt'
23:45:46:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_a7.fah/FahCore_a7 -dir 01 -suffix 01 -version 705 -lifeline 1985 -checkpoint 15 -np 24
23:45:46:WU01:FS00:Started FahCore on PID 13643
23:45:46:WU01:FS00:Core PID:13647
23:45:46:WU01:FS00:FahCore 0xa7 started
23:45:47:WU01:FS00:0xa7:*********************** Log Started 2020-04-06T23:45:46Z ***********************
23:45:47:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
23:45:47:WU01:FS00:0xa7:       Type: 0xa7
23:45:47:WU01:FS00:0xa7:       Core: Gromacs
23:45:47:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 705 -lifeline 13643 -checkpoint 15 -np
23:45:47:WU01:FS00:0xa7:             24
23:45:47:WU01:FS00:0xa7:************************************ CBang *************************************
23:45:47:WU01:FS00:0xa7:       Date: Nov 5 2019
23:45:47:WU01:FS00:0xa7:       Time: 05:57:01
23:45:47:WU01:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
23:45:47:WU01:FS00:0xa7:     Branch: master
23:45:47:WU01:FS00:0xa7:   Compiler: GNU 8.3.0
23:45:47:WU01:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
23:45:47:WU01:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
23:45:47:WU01:FS00:0xa7:       Bits: 64
23:45:47:WU01:FS00:0xa7:       Mode: Release
23:45:47:WU01:FS00:0xa7:************************************ System ************************************
23:45:47:WU01:FS00:0xa7:        CPU: Intel(R) Xeon(R) CPU X5670 @ 2.93GHz
23:45:47:WU01:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 44 Stepping 2
23:45:47:WU01:FS00:0xa7:       CPUs: 24
23:45:47:WU01:FS00:0xa7:     Memory: 94.40GiB
23:45:47:WU01:FS00:0xa7:Free Memory: 68.38GiB
23:45:47:WU01:FS00:0xa7:    Threads: POSIX_THREADS
23:45:47:WU01:FS00:0xa7: OS Version: 5.3
23:45:47:WU01:FS00:0xa7:Has Battery: false
23:45:47:WU01:FS00:0xa7: On Battery: false
23:45:47:WU01:FS00:0xa7: UTC Offset: -5
23:45:47:WU01:FS00:0xa7:        PID: 13647
23:45:47:WU01:FS00:0xa7:        CWD: /var/lib/fahclient/work
23:45:47:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
23:45:47:WU01:FS00:0xa7:    Version: 0.0.18
23:45:47:WU01:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
23:45:47:WU01:FS00:0xa7:  Copyright: 2019 foldingathome.org
23:45:47:WU01:FS00:0xa7:   Homepage: https://foldingathome.org/
23:45:47:WU01:FS00:0xa7:       Date: Nov 5 2019
23:45:47:WU01:FS00:0xa7:       Time: 06:13:26
23:45:47:WU01:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
23:45:47:WU01:FS00:0xa7:     Branch: master
23:45:47:WU01:FS00:0xa7:   Compiler: GNU 8.3.0
23:45:47:WU01:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
23:45:47:WU01:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
23:45:47:WU01:FS00:0xa7:       Bits: 64
23:45:47:WU01:FS00:0xa7:       Mode: Release
23:45:47:WU01:FS00:0xa7:************************************ Build *************************************
23:45:47:WU01:FS00:0xa7:       SIMD: sse2
23:45:47:WU01:FS00:0xa7:********************************************************************************
23:45:47:WU01:FS00:0xa7:Project: 16417 (Run 2541, Clone 2, Gen 1)
23:45:47:WU01:FS00:0xa7:Unit: 0x0000000196880e6e5e8a5fe643c23d89
23:45:47:WU01:FS00:0xa7:Reading tar file core.xml
23:45:47:WU01:FS00:0xa7:Reading tar file frame1.tpr
23:45:47:WU01:FS00:0xa7:Digital signatures verified
23:45:47:WU01:FS00:0xa7:Calling: mdrun -s frame1.tpr -o frame1.trr -x frame1.xtc -cpt 15 -nt 24
23:45:47:WU01:FS00:0xa7:Steps: first=250000 total=250000
23:45:47:WU01:FS00:0xa7:ERROR:
23:45:47:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
23:45:47:WU01:FS00:0xa7:ERROR:Program GROMACS, VERSION 5.0.4-20191026-456f0d636-unknown
23:45:47:WU01:FS00:0xa7:ERROR:Source code file: /host/debian-stable-64bit-core-a7-sse-release/gromacs-core/build/gromacs/src/gromacs/mdlib/domdec.c, line: 6902
23:45:47:WU01:FS00:0xa7:ERROR:
23:45:47:WU01:FS00:0xa7:ERROR:Fatal error:
23:45:47:WU01:FS00:0xa7:ERROR:There is no domain decomposition for 20 ranks that is compatible with the given box and a minimum cell size of 1.4227 nm
23:45:47:WU01:FS00:0xa7:ERROR:Change the number of ranks or mdrun option -rcon or -dds or your LINCS settings
23:45:47:WU01:FS00:0xa7:ERROR:Look in the log file for details on the domain decomposition
23:45:47:WU01:FS00:0xa7:ERROR:For more information and tips for troubleshooting, please check the GROMACS
23:45:47:WU01:FS00:0xa7:ERROR:website at http://www.gromacs.org/Documentation/Errors
23:45:47:WU01:FS00:0xa7:ERROR:-------------------------------------------------------
23:45:51:WU01:FS00:0xa7:WARNING:Unexpected exit() call
23:45:51:WU01:FS00:0xa7:WARNING:Unexpected exit from science code
23:45:51:WU01:FS00:0xa7:Saving result file ../logfile_01.txt
23:45:51:WU01:FS00:0xa7:Saving result file md.log
23:45:51:WU01:FS00:0xa7:Saving result file science.log
23:45:52:WU01:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
kostuek
Posts: 32
Joined: Tue Mar 17, 2020 11:03 am

Re: Won't start folding

Post by kostuek »

It is probably due to having 24 cores in one slot. Try reducing this number, but take care to choose a new number that is a multiple of 2 or 3. For example, instead of 1 slot with 24 cores, you could try running 2 slots with 12 cores each.
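
As a rough illustration of the suggestion above, here is a minimal Python sketch that lists thread counts built only from the factors 2 and 3. Reading "a multiple of 2 or 3" as "no prime factors other than 2 and 3" is an assumption on my part, and the function names are hypothetical helpers, not part of the FAH client.

```python
def has_only_small_factors(n, primes=(2, 3)):
    """Return True if n has no prime factors other than those in `primes`."""
    if n < 1:
        return False
    for p in primes:
        # Divide out each allowed prime completely.
        while n % p == 0:
            n //= p
    # Anything left over is a factor outside the allowed set.
    return n == 1


def candidate_thread_counts(max_threads, primes=(2, 3)):
    """List candidate thread counts up to max_threads, largest first."""
    return [n for n in range(max_threads, 0, -1)
            if has_only_small_factors(n, primes)]


print(candidate_thread_counts(24))
```

Under this reading, 20 is excluded because it contains the factor 5, while 18, 16, and 12 remain candidates.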
Joe_H
Site Admin
Posts: 7868
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Won't start folding

Post by Joe_H »

The issue is not the 24 cores, but that the WU downloaded is forcing it to run at 20. This problem has been reported to the researcher based on another report involving Project 16417. He is looking into it.

In the meantime, you can try pausing the WU and trying to get it to run at a setting for fewer CPU threads. 18, 16 or 12 should be usable.
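
For anyone comparing their own log, the mismatch shows up as `-np 24` on the "Running FahCore" line versus "20 ranks" in the GROMACS error. The following is a minimal Python sketch that pulls both numbers out of log text. The regular expressions are written against the two lines quoted in this thread and are an assumption about the log format in general; `find_thread_mismatch` is a hypothetical helper name.

```python
import re

# Patterns based on the two log lines quoted in this thread (assumed format).
NP_RE = re.compile(r"Running FahCore:.*\s-np (\d+)")
RANKS_RE = re.compile(r"no domain decomposition for (\d+) ranks")


def find_thread_mismatch(log_text):
    """Return (requested, used); either value is None if not found."""
    requested = used = None
    for line in log_text.splitlines():
        m = NP_RE.search(line)
        if m:
            requested = int(m.group(1))
        m = RANKS_RE.search(line)
        if m:
            used = int(m.group(1))
    return requested, used


sample = (
    "23:45:46:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper "
    "/var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/"
    "Core_a7.fah/FahCore_a7 -dir 01 -suffix 01 -version 705 "
    "-lifeline 1985 -checkpoint 15 -np 24\n"
    "23:45:47:WU01:FS00:0xa7:ERROR:There is no domain decomposition for "
    "20 ranks that is compatible with the given box and a minimum cell "
    "size of 1.4227 nm\n"
)

requested, used = find_thread_mismatch(sample)
print(f"requested={requested} used={used}")
```

If the two numbers differ, the thread count was changed after the client launched the core, which is the situation described above.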
xXCapAwesomeXx
Posts: 5
Joined: Mon Mar 23, 2020 9:42 pm

Re: Won't start folding

Post by xXCapAwesomeXx »

Looks like it is working with 12 cores. I'll let it run overnight and see what happens in the morning. Thanks so much for the info!
northcup
Posts: 3
Joined: Thu Mar 26, 2020 9:32 am

Re: Won't start folding

Post by northcup »

For 3 days I have had the problem that the Nvidia GPU aborts the calculation immediately. The client works on another equivalent computer.

Code: Select all

05:24:07:WU01:FS00:Connecting to 65.254.110.245:8080
05:24:08:WU01:FS00:Assigned to work server 40.114.52.201
05:24:08:WU01:FS00:Requesting new work unit for slot 00: READY gpu:0:GP107 [GeForce GTX 1050 Ti]  2138 from 40.114.52.201
05:24:08:WU01:FS00:Connecting to 40.114.52.201:8080
05:24:25:WU01:FS00:Downloading 29.59MiB
05:24:31:WU01:FS00:Download 30.63%
05:24:37:WU01:FS00:Download 64.42%
05:24:43:WU01:FS00:Download 97.59%
05:24:43:WU01:FS00:Download complete
05:24:43:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11778 run:0 clone:12519 gen:19 core:0x22 unit:0x00000023287234c95e7432171900f6fb
05:24:43:WU01:FS00:Starting
05:24:43:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version 705 -lifeline 12706 -checkpoint 15 -gpu-vendor nvidia -opencl-device 0 -gpu 0
05:24:43:WU01:FS00:Started FahCore on PID 13336
05:24:43:WU01:FS00:Core PID:13340
05:24:43:WU01:FS00:FahCore 0x22 started
05:24:44:WARNING:WU01:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
05:24:44:WU01:FS00:Sending unit results: id:01 state:SEND error:FAULTY project:11778 run:0 clone:12519 gen:19 core:0x22 unit:0x00000023287234c95e7432171900f6fb
05:24:44:WU01:FS00:Uploading 7.00KiB to 40.114.52.201
05:24:44:WU01:FS00:Connecting to 40.114.52.201:8080
05:24:44:WU02:FS00:Connecting to 65.254.110.245:8080
05:24:45:WU02:FS00:Assigned to work server 40.114.52.201
05:24:45:WU02:FS00:Requesting new work unit for slot 00: READY gpu:0:GP107 [GeForce GTX 1050 Ti]  2138 from 40.114.52.201
05:24:45:WU02:FS00:Connecting to 40.114.52.201:8080
05:25:16:WU01:FS00:Upload 100.00%
northcup
Posts: 3
Joined: Thu Mar 26, 2020 9:32 am

Re: Won't start folding

Post by northcup »

Here is an addition to my previous post.
Thanks for the help!

Code: Select all

06:00:36:WU02:FS00:0x22:Project: 11759 (Run 0, Clone 3702, Gen 17)
06:00:36:WU02:FS00:0x22:Unit: 0x0000002680fccb0a5e6d7cb0041e6603
06:00:36:WU02:FS00:0x22:Reading tar file core.xml
06:00:36:WU02:FS00:0x22:Reading tar file integrator.xml
06:00:36:WU02:FS00:0x22:Reading tar file state.xml
06:00:36:WU02:FS00:0x22:Reading tar file system.xml
06:00:36:WU02:FS00:0x22:Digital signatures verified
06:00:36:WU02:FS00:0x22:Folding@home GPU Core22 Folding@home Core
06:00:36:WU02:FS00:0x22:Version 0.0.2
06:00:36:WU02:FS00:0x22:ERROR:126: Bad platformId size.
06:00:36:WU02:FS00:0x22:Saving result file ../logfile_01.txt
06:00:36:WU02:FS00:0x22:Saving result file science.log
06:00:36:WU02:FS00:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
06:00:36:WARNING:WU02:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:00:36:WU02:FS00:Sending unit results: id:02 state:SEND error:FAULTY project:11759 run:0 clone:3702 gen:17 core:0x22 unit:0x0000002680fccb0a5e6d7cb0041e6603
06:00:36:WU02:FS00:Uploading 7.00KiB to 128.252.203.10
06:00:36:WU02:FS00:Connecting to 128.252.203.10:8080
06:00:36:WU01:FS00:Connecting to 65.254.110.245:8080
06:00:37:WU01:FS00:Assigned to work server 140.163.4.231
06:00:37:WU01:FS00:Requesting new work unit for slot 00: READY gpu:0:GP107 [GeForce GTX 1050 Ti]  2138 from 140.163.4.231
06:00:37:WU01:FS00:Connecting to 140.163.4.231:8080
Mod Edit: Fixed Code Tags - PantherX
PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud
Contact:

Re: Won't start folding

Post by PantherX »

Welcome to the F@H Forum northcup,

Can you please post your log file? Ensure that you have copied the system configuration, which is present at the start of the log file (viewtopic.php?f=2&t=26036). Can you also tell us what driver version you have installed on your system?
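
One possible way to look up the driver version on a machine with the NVIDIA tools installed is to ask `nvidia-smi`. The sketch below wraps that call in Python and returns `None` when the tool is missing or fails. It assumes `nvidia-smi` is on the PATH; `nvidia_driver_version` is a hypothetical helper name.

```python
import shutil
import subprocess


def nvidia_driver_version():
    """Return the driver version reported by nvidia-smi, or None."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # Tool not installed or not on PATH.
        return None
    try:
        result = subprocess.run(
            [exe, "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # Non-zero exit, timeout, or failure to launch.
        return None
    lines = result.stdout.strip().splitlines()
    # One line per GPU; report the first.
    return lines[0].strip() if lines else None


print(nvidia_driver_version())
```

A `None` result only says that the query failed; it does not say why.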
northcup
Posts: 3
Joined: Thu Mar 26, 2020 9:32 am

Re: Won't start folding

Post by northcup »

I solved the problem, thanks. A kernel update a few days ago disabled the graphics driver's CUDA functionality. I just reinstalled the NVIDIA drivers, and Folding@home is running on the GPU again! Sorry for the trouble.
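
For others who see GPU work units fail right after a kernel update, a quick thing to check on Linux is whether the `nvidia` kernel module is loaded at all. This is a minimal Python sketch that parses `/proc/modules`-style text. It assumes the proprietary driver, whose kernel module is named `nvidia`, and it is only a hint, not a diagnosis of the exact failure described above. The function names are hypothetical.

```python
from pathlib import Path


def loaded_modules(modules_text):
    """Parse /proc/modules-style text into a set of module names."""
    return {line.split()[0]
            for line in modules_text.splitlines() if line.strip()}


def nvidia_module_loaded(modules_text):
    """Return True if a module named exactly 'nvidia' is listed."""
    return "nvidia" in loaded_modules(modules_text)


if __name__ == "__main__":
    path = Path("/proc/modules")  # Linux only
    if path.exists():
        print("nvidia module loaded:",
              nvidia_module_loaded(path.read_text()))
```

If the module is missing after a kernel update, reinstalling the driver, as described above, is one way to get it back.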