16926 - Some sort of loops with this CPU WU

Moderators: Site Moderators, FAHC Science Team

VAcharonD1
Posts: 14
Joined: Tue Mar 24, 2020 2:46 am

Re: 16926 - Some sort of loops with this CPU WU

Post by VAcharonD1 »

Some more logs, this time from Linux with a 12-thread slot (apparently lowered to 3 threads). I've also seen it happen on a regular 3-thread slot.

Code:

*********************** Log Started 2020-12-01T07:03:43Z ***********************
07:03:43:FS00:Initialized folding slot 00: cpu:12
07:03:43:WU02:FS00:Starting
07:03:43:WARNING:WU02:FS00:AS lowered CPUs from 12 to 3
07:03:43:WU02:FS00:Removing old file 'work/02/logfile_01-20201201-043902.txt'
07:03:43:WU02:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.9/Core_a8.fah/FahCore_a8 -dir 02 -suffix 01 -version 706 -lifeline 104733 -checkpoint 15 -np 3
07:03:43:WU02:FS00:Started FahCore on PID 104758
07:03:43:WU02:FS00:Core PID:104762
07:03:43:WU02:FS00:FahCore 0xa8 started
07:03:44:WU02:FS00:0xa8:*********************** Log Started 2020-12-01T07:03:43Z ***********************
07:03:44:WU02:FS00:0xa8:************************** Gromacs Folding@home Core ***************************
07:03:44:WU02:FS00:0xa8:       Core: Gromacs
07:03:44:WU02:FS00:0xa8:       Type: 0xa8
07:03:44:WU02:FS00:0xa8:    Version: 0.0.9
07:03:44:WU02:FS00:0xa8:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
07:03:44:WU02:FS00:0xa8:  Copyright: 2020 foldingathome.org
07:03:44:WU02:FS00:0xa8:   Homepage: https://foldingathome.org/
07:03:44:WU02:FS00:0xa8:       Date: Oct 28 2020
07:03:44:WU02:FS00:0xa8:       Time: 22:15:07
07:03:44:WU02:FS00:0xa8:   Compiler: GNU 8.3.0
07:03:44:WU02:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
07:03:44:WU02:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie
07:03:44:WU02:FS00:0xa8:   Platform: linux2 4.15.0-108-generic
07:03:44:WU02:FS00:0xa8:       Bits: 64
07:03:44:WU02:FS00:0xa8:       Mode: Release
07:03:44:WU02:FS00:0xa8:       SIMD: avx2_256
07:03:44:WU02:FS00:0xa8:     OpenMP: ON
07:03:44:WU02:FS00:0xa8:       CUDA: OFF
07:03:44:WU02:FS00:0xa8:       Args: -dir 02 -suffix 01 -version 706 -lifeline 104758 -checkpoint 15 -np
07:03:44:WU02:FS00:0xa8:             3
07:03:44:WU02:FS00:0xa8:************************************ libFAH ************************************
07:03:44:WU02:FS00:0xa8:       Date: Oct 28 2020
07:03:44:WU02:FS00:0xa8:       Time: 22:12:00
07:03:44:WU02:FS00:0xa8:   Compiler: GNU 8.3.0
07:03:44:WU02:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
07:03:44:WU02:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie
07:03:44:WU02:FS00:0xa8:   Platform: linux2 4.15.0-108-generic
07:03:44:WU02:FS00:0xa8:       Bits: 64
07:03:44:WU02:FS00:0xa8:       Mode: Release
07:03:44:WU02:FS00:0xa8:************************************ CBang *************************************
07:03:44:WU02:FS00:0xa8:       Date: Oct 28 2020
07:03:44:WU02:FS00:0xa8:       Time: 22:11:46
07:03:44:WU02:FS00:0xa8:   Compiler: GNU 8.3.0
07:03:44:WU02:FS00:0xa8:    Options: -faligned-new -std=c++14 -fsigned-char -ffunction-sections
07:03:44:WU02:FS00:0xa8:             -fdata-sections -O3 -funroll-loops -fno-pie -fPIC
07:03:44:WU02:FS00:0xa8:   Platform: linux2 4.15.0-108-generic
07:03:44:WU02:FS00:0xa8:       Bits: 64
07:03:44:WU02:FS00:0xa8:       Mode: Release
07:03:44:WU02:FS00:0xa8:************************************ System ************************************
07:03:44:WU02:FS00:0xa8:        CPU: Intel(R) Xeon(R) E-2286M CPU @ 2.40GHz
07:03:44:WU02:FS00:0xa8:     CPU ID: GenuineIntel Family 6 Model 158 Stepping 13
07:03:44:WU02:FS00:0xa8:       CPUs: 16
07:03:44:WU02:FS00:0xa8:     Memory: 15.48GiB
07:03:44:WU02:FS00:0xa8:Free Memory: 7.62GiB
07:03:44:WU02:FS00:0xa8:    Threads: POSIX_THREADS
07:03:44:WU02:FS00:0xa8: OS Version: 5.9
07:03:44:WU02:FS00:0xa8:Has Battery: true
07:03:44:WU02:FS00:0xa8: On Battery: false
07:03:44:WU02:FS00:0xa8: UTC Offset: -6
07:03:44:WU02:FS00:0xa8:        PID: 104762
07:03:44:WU02:FS00:0xa8:        CWD: /var/lib/fahclient/work
07:03:44:WU02:FS00:0xa8:********************************************************************************
07:03:44:WU02:FS00:0xa8:Project: 16926 (Run 84, Clone 453, Gen 1)
07:03:44:WU02:FS00:0xa8:Unit: 0x000000068120d1cc5fbd3725dfba0f2c
07:03:44:WU02:FS00:0xa8:Reading tar file core.xml
07:03:44:WU02:FS00:0xa8:Reading tar file frame1.tpr
07:03:44:WU02:FS00:0xa8:Digital signatures verified
07:03:44:WU02:FS00:0xa8:Calling: mdrun -c frame1.gro -s frame1.tpr -x frame1.xtc -cpt 15 -nt 3 -ntmpi 1
07:03:44:WU02:FS00:0xa8:Steps: first=0 total=0
07:03:45:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
07:03:45:WU02:FS00:Starting
07:03:45:WARNING:WU02:FS00:AS lowered CPUs from 12 to 3
07:03:45:WU02:FS00:Removing old file 'work/02/logfile_01-20201201-044002.txt'
07:03:45:WU02:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/lin/64bit-avx2-256/a8-0.0.9/Core_a8.fah/FahCore_a8 -dir 02 -suffix 01 -version 706 -lifeline 104733 -checkpoint 15 -np 3
07:03:45:WU02:FS00:Started FahCore on PID 104778
07:03:45:WU02:FS00:Core PID:104782
07:03:45:WU02:FS00:FahCore 0xa8 started
07:03:46:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
Mod Edit: Changed Quote Tags To Code Tags - PantherX
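
The restart loop in the log above shows up as repeated "FahCore returned: INTERRUPTED" lines for the same work unit and slot within a second or two of each other. As a rough way to spot this in a longer log, here is a minimal Python sketch. It assumes the line format is exactly as in the excerpt above; the function name `count_interrupted` and the regular expression are illustrative and not part of the Folding@home client.

```python
import re

# Matches lines such as:
# 07:03:45:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
# The pattern is an assumption based on the log excerpt in this post.
RETURN_RE = re.compile(
    r"^\d{2}:\d{2}:\d{2}:(WU\d+):(FS\d+):FahCore returned: (\w+) "
    r"\((\d+) = 0x[0-9a-fA-F]+\)"
)


def count_interrupted(log_text):
    """Count INTERRUPTED core returns per (work unit, slot) pair."""
    counts = {}
    for line in log_text.splitlines():
        match = RETURN_RE.match(line.strip())
        if match and match.group(3) == "INTERRUPTED":
            key = (match.group(1), match.group(2))
            counts[key] = counts.get(key, 0) + 1
    return counts


# A few lines copied from the log above.
sample = """\
07:03:44:WU02:FS00:0xa8:Steps: first=0 total=0
07:03:45:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
07:03:45:WU02:FS00:Starting
07:03:46:WU02:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
"""

print(count_interrupted(sample))
```

A count that keeps climbing for one work unit while "Steps: first=0 total=0" never advances would match the looping behaviour reported in this thread.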
Lockheed_Tvr
Posts: 14
Joined: Thu Aug 03, 2017 12:23 pm

Re: 16926 - Some sort of loops with this CPU WU

Post by Lockheed_Tvr »

Linux noob here. Can someone tell me how to dump a work unit in Linux? I'm on Ubuntu 18.04. My googling did not return anything helpful.

Do I delete the core or the work?

How do I get around not having permission to rm?
Yeroon
Posts: 25
Joined: Tue Jul 07, 2020 11:09 pm

Re: 16926 - Some sort of loops with this CPU WU

Post by Yeroon »

You should be able to do it through the GUI file manager. I don't recall any permissions issues, but there is always sudo if needed.
WU locations on a default installation (at least how mine is) are at /var/lib/fahclient/work/

Find your offending WU number, delete it or move it to another location, then restart that slot.
Maddog
Posts: 15
Joined: Wed Sep 30, 2020 2:06 pm

Re: 16926 - Some sort of loops with this CPU WU

Post by Maddog »

Do not use sudo with a GUI application; it can mess up permissions.
In the file manager, navigate to /var/lib/fahclient/work/. Right-click on the work folder and it should give you a dropdown menu.
Choose "Open as administrator". There should then be no problem deleting the WU folder.
Lockheed_Tvr
Posts: 14
Joined: Thu Aug 03, 2017 12:23 pm

Re: 16926 - Some sort of loops with this CPU WU

Post by Lockheed_Tvr »

Sort of odd. Half the boxes let me delete the work folder from the GUI; the others needed it removed from the terminal. There was no "Open as administrator" option in the right-click menu. Running "sudo rm -r 02" and entering my password worked from the command line. Linux continues to confound me.
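
For anyone who would rather move the stuck WU folder aside than delete it outright (the "move it to another location" option mentioned earlier in the thread), here is a minimal Python sketch. The function name `set_aside_work_unit` and the backup location are hypothetical; the only path taken from this thread is /var/lib/fahclient/work, and the demo below runs against a temporary directory so it does not touch a real installation.

```python
import shutil
import tempfile
from pathlib import Path


def set_aside_work_unit(work_dir, wu_id, backup_dir):
    """Move work_dir/wu_id into backup_dir and return the new path."""
    src = Path(work_dir) / wu_id
    if not src.is_dir():
        raise FileNotFoundError(f"no work unit folder at {src}")
    dest_root = Path(backup_dir)
    dest_root.mkdir(parents=True, exist_ok=True)
    dest = dest_root / wu_id
    if dest.exists():
        # Refuse to overwrite a folder that was set aside earlier.
        raise FileExistsError(f"{dest} already exists")
    shutil.move(str(src), str(dest))
    return dest


# Demo on a throwaway directory tree that mimics work/02.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "work"
    (work / "02").mkdir(parents=True)
    (work / "02" / "frame1.tpr").write_text("placeholder")
    moved = set_aside_work_unit(work, "02", Path(tmp) / "dumped")
    print(moved.name, (work / "02").exists())
```

On a real machine the call would presumably look like set_aside_work_unit("/var/lib/fahclient/work", "02", some_backup_dir), run with enough privileges to write to that directory (for example under sudo), followed by restarting the slot as described above.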
gunnarre
Posts: 567
Joined: Sun May 24, 2020 7:23 pm
Location: Norway

Re: 16926 - Some sort of loops with this CPU WU

Post by gunnarre »

16926 (78,49,1) dumped on Linux Mint 20, 8-thread CPU.
Online: GTX 1660 Super, GTX 1080, GTX 1050 Ti 4G OC, RX580 + occasional CPU folding in the cold.
Offline: Radeon HD 7770, GTX 960, GTX 950
kjk
Posts: 6
Joined: Tue Mar 25, 2008 6:12 pm

Re: 16926 - Some sort of loops with this CPU WU

Post by kjk »

Yeah, seems I've also been stuck with 16926 (90, 634, 6) for a while. Just noticed today that the CPU was idling. My CPU slot is set to 8. I'm on 7.6.21 and Fedora 32 (linux2 5.8.0-1-amd64).
Thus dumping it.