Core seems stuck after start WU

If you think it might be a driver problem, see viewforum.php?f=79

Moderators: Site Moderators, FAHC Science Team

IkkeDus
Posts: 14
Joined: Wed Jun 18, 2008 10:42 am
Hardware configuration: Q9550 @ 2.8 GHz
WIN10 x64
2x Radeon R9 280X-3GB
1x Radeon R9 7950-3GB
Location: Amsterdam, The Netherlands

Core seems stuck after start WU

Post by IkkeDus »

AMD R9 280X 3GB (ID: 6798 SUB: 3001)
Identified as: TAHITI XT [RADEON R9 200/HD 7900/8970]

Project: 11761

I seem to be encountering a strange phenomenon where a WU is downloaded but the work never starts. It resembles viewtopic.php?f=74&t=32991&p=319598#p319598 but it is not the same.

The log sits at ".. 0x22:Version 0.0.2" line.
Pause/unpause has no effect (not in this log).
When I reboot the work starts.
Apart from the manual entry to indicate the reboot, the timeline is as is.

Code:

08:38:12:WU04:FS03:Received Unit: id:04 state:DOWNLOAD error:NO_ERROR project:11761 run:0 clone:6262 gen:13 core:0x22 unit:0x0000001780fccb0a5e6fcf7f5e2f0ebf
08:38:12:WU04:FS03:Starting
08:38:12:WU04:FS03:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 04 -suffix 01 -version 705 -lifeline 8624 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
08:38:12:WU04:FS03:Started FahCore on PID 956
08:38:12:WU04:FS03:Core PID:8156
08:38:12:WU04:FS03:FahCore 0x22 started
08:38:13:WU04:FS03:0x22:*********************** Log Started 2020-03-28T08:38:13Z ***********************
08:38:13:WU04:FS03:0x22:*************************** Core22 Folding@home Core ***************************
08:38:13:WU04:FS03:0x22:       Type: 0x22
08:38:13:WU04:FS03:0x22:       Core: Core22
08:38:13:WU04:FS03:0x22:    Website: https://foldingathome.org/
08:38:13:WU04:FS03:0x22:  Copyright: (c) 2009-2018 foldingathome.org
08:38:13:WU04:FS03:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
08:38:13:WU04:FS03:0x22:             <rafal.wiewiora@choderalab.org>
08:38:13:WU04:FS03:0x22:       Args: -dir 04 -suffix 01 -version 705 -lifeline 956 -checkpoint 15
08:38:13:WU04:FS03:0x22:             -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
08:38:13:WU04:FS03:0x22:     Config: <none>
08:38:13:WU04:FS03:0x22:************************************ Build *************************************
08:38:13:WU04:FS03:0x22:    Version: 0.0.2
08:38:13:WU04:FS03:0x22:       Date: Dec 6 2019
08:38:13:WU04:FS03:0x22:       Time: 21:30:31
08:38:13:WU04:FS03:0x22: Repository: Git
08:38:13:WU04:FS03:0x22:   Revision: abeb39247cc72df5af0f63723edafadb23d5dfbe
08:38:13:WU04:FS03:0x22:     Branch: HEAD
08:38:13:WU04:FS03:0x22:   Compiler: Visual C++ 2008
08:38:13:WU04:FS03:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
08:38:13:WU04:FS03:0x22:   Platform: win32 10
08:38:13:WU04:FS03:0x22:       Bits: 64
08:38:13:WU04:FS03:0x22:       Mode: Release
08:38:13:WU04:FS03:0x22:************************************ System ************************************
08:38:13:WU04:FS03:0x22:        CPU: Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz
08:38:13:WU04:FS03:0x22:     CPU ID: GenuineIntel Family 6 Model 23 Stepping 10
08:38:13:WU04:FS03:0x22:       CPUs: 4
08:38:13:WU04:FS03:0x22:     Memory: 4.00GiB
08:38:13:WU04:FS03:0x22:Free Memory: 2.22GiB
08:38:13:WU04:FS03:0x22:    Threads: WINDOWS_THREADS
08:38:13:WU04:FS03:0x22: OS Version: 6.2
08:38:13:WU04:FS03:0x22:Has Battery: false
08:38:13:WU04:FS03:0x22: On Battery: false
08:38:13:WU04:FS03:0x22: UTC Offset: 1
08:38:13:WU04:FS03:0x22:        PID: 8156
08:38:13:WU04:FS03:0x22:        CWD: C:\Users\\AppData\Roaming\FAHClient\work
08:38:13:WU04:FS03:0x22:         OS: Windows 10 Pro
08:38:13:WU04:FS03:0x22:    OS Arch: AMD64
08:38:13:WU04:FS03:0x22:********************************************************************************
08:38:13:WU04:FS03:0x22:Project: 11761 (Run 0, Clone 6262, Gen 13)
08:38:13:WU04:FS03:0x22:Unit: 0x0000001780fccb0a5e6fcf7f5e2f0ebf
08:38:13:WU04:FS03:0x22:Reading tar file core.xml
08:38:13:WU04:FS03:0x22:Reading tar file integrator.xml
08:38:13:WU04:FS03:0x22:Reading tar file state.xml
08:38:13:WU04:FS03:0x22:Reading tar file system.xml
08:38:14:WU04:FS03:0x22:Digital signatures verified
08:38:14:WU04:FS03:0x22:Folding@home GPU Core22 Folding@home Core
08:38:14:WU04:FS03:0x22:Version 0.0.2

<<< MANUAL REBOOT>>>

*********************** Log Started 2020-03-28T10:39:32Z ***********************
10:39:32:WU04:FS03:Starting
10:39:32:WU04:FS03:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/v7/win/64bit/Core_22.fah/FahCore_22.exe -dir 04 -suffix 01 -version 705 -lifeline 7792 -checkpoint 15 -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
10:39:32:WU04:FS03:Started FahCore on PID 7304
10:39:32:WU04:FS03:Core PID:7332
10:39:32:WU04:FS03:FahCore 0x22 started
10:39:33:WU04:FS03:0x22:*********************** Log Started 2020-03-28T10:39:32Z ***********************
10:39:33:WU04:FS03:0x22:*************************** Core22 Folding@home Core ***************************
10:39:33:WU04:FS03:0x22:       Type: 0x22
10:39:33:WU04:FS03:0x22:       Core: Core22
10:39:33:WU04:FS03:0x22:    Website: https://foldingathome.org/
10:39:33:WU04:FS03:0x22:  Copyright: (c) 2009-2018 foldingathome.org
10:39:33:WU04:FS03:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
10:39:33:WU04:FS03:0x22:             <rafal.wiewiora@choderalab.org>
10:39:33:WU04:FS03:0x22:       Args: -dir 04 -suffix 01 -version 705 -lifeline 7304 -checkpoint 15
10:39:33:WU04:FS03:0x22:             -gpu-vendor amd -opencl-platform 0 -opencl-device 0 -gpu 0
10:39:33:WU04:FS03:0x22:     Config: <none>
10:39:33:WU04:FS03:0x22:************************************ Build *************************************
10:39:33:WU04:FS03:0x22:    Version: 0.0.2
10:39:33:WU04:FS03:0x22:       Date: Dec 6 2019
10:39:33:WU04:FS03:0x22:       Time: 21:30:31
10:39:33:WU04:FS03:0x22: Repository: Git
10:39:33:WU04:FS03:0x22:   Revision: abeb39247cc72df5af0f63723edafadb23d5dfbe
10:39:33:WU04:FS03:0x22:     Branch: HEAD
10:39:33:WU04:FS03:0x22:   Compiler: Visual C++ 2008
10:39:33:WU04:FS03:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
10:39:33:WU04:FS03:0x22:   Platform: win32 10
10:39:33:WU04:FS03:0x22:       Bits: 64
10:39:33:WU04:FS03:0x22:       Mode: Release
10:39:33:WU04:FS03:0x22:************************************ System ************************************
10:39:33:WU04:FS03:0x22:        CPU: Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz
10:39:33:WU04:FS03:0x22:     CPU ID: GenuineIntel Family 6 Model 23 Stepping 10
10:39:33:WU04:FS03:0x22:       CPUs: 4
10:39:33:WU04:FS03:0x22:     Memory: 4.00GiB
10:39:33:WU04:FS03:0x22:Free Memory: 2.29GiB
10:39:33:WU04:FS03:0x22:    Threads: WINDOWS_THREADS
10:39:33:WU04:FS03:0x22: OS Version: 6.2
10:39:33:WU04:FS03:0x22:Has Battery: false
10:39:33:WU04:FS03:0x22: On Battery: false
10:39:33:WU04:FS03:0x22: UTC Offset: 1
10:39:33:WU04:FS03:0x22:        PID: 7332
10:39:33:WU04:FS03:0x22:        CWD: C:\Users\\AppData\Roaming\FAHClient\work
10:39:33:WU04:FS03:0x22:         OS: Windows 10 Pro
10:39:33:WU04:FS03:0x22:    OS Arch: AMD64
10:39:33:WU04:FS03:0x22:********************************************************************************
10:39:33:WU04:FS03:0x22:Project: 11761 (Run 0, Clone 6262, Gen 13)
10:39:33:WU04:FS03:0x22:Unit: 0x0000001780fccb0a5e6fcf7f5e2f0ebf
10:39:33:WU04:FS03:0x22:Reading tar file core.xml
10:39:33:WU04:FS03:0x22:Reading tar file integrator.xml
10:39:33:WU04:FS03:0x22:Reading tar file state.xml
10:39:33:WU04:FS03:0x22:Reading tar file system.xml
10:39:34:WU04:FS03:0x22:Digital signatures verified
10:39:34:WU04:FS03:0x22:Folding@home GPU Core22 Folding@home Core
10:39:34:WU04:FS03:0x22:Version 0.0.2
10:39:56:WU04:FS03:0x22:Completed 0 out of 2000000 steps (0%)
10:39:56:WU04:FS03:0x22:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
Q9550 @ 2.8 GHz | 2x R9 280X-3GB | HD 7950-3GB | Win10 x64
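
The symptom in the first half of the log above (core output stops at "Version 0.0.2" and no "Completed ..." line ever appears) can be checked for mechanically. Below is a minimal Python sketch, assuming the `hh:mm:ss:WUnn:FSnn:` line layout seen in this log. The function names and the 30-minute threshold are illustrative choices, not part of FAHClient, and timestamps are treated as belonging to the same day.

```python
import re
from datetime import datetime, timedelta

# Matches client log lines such as "08:38:14:WU04:FS03:0x22:Version 0.0.2".
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):(.*)$")

def last_activity(log_text):
    """Map (wu, slot) -> (time of last line, last message, progress seen?)."""
    state = {}
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # banner lines, manual notes, etc.
        stamp, wu, slot, msg = m.groups()
        t = datetime.strptime(stamp, "%H:%M:%S")
        seen = state.get((wu, slot), (None, None, False))[2]
        state[(wu, slot)] = (t, msg, seen or "Completed " in msg)
    return state

def find_stalls(log_text, now, limit=timedelta(minutes=30)):
    """Return WU/slot pairs that never logged progress and have been idle > limit.

    `now` is an "HH:MM:SS" string on the same day as the log (assumption:
    no midnight wrap-around is handled).
    """
    now_t = datetime.strptime(now, "%H:%M:%S")
    stalls = []
    for (wu, slot), (t, msg, seen) in last_activity(log_text).items():
        idle = now_t - t
        if not seen and idle > limit:
            stalls.append((wu, slot, msg, idle))
    return stalls

# Three lines copied from the pre-reboot part of the log above.
SAMPLE = """\
08:38:12:WU04:FS03:FahCore 0x22 started
08:38:14:WU04:FS03:0x22:Digital signatures verified
08:38:14:WU04:FS03:0x22:Version 0.0.2
"""

for wu, slot, msg, idle in find_stalls(SAMPLE, "10:38:14"):
    print(wu, slot, "idle for", idle, "after:", msg)
```

Run against the saved log.txt, something like this would flag the slot long before the two hours that passed here, but it only reports the stall; it says nothing about the cause.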

davidcoton
Posts: 1102
Joined: Wed Nov 05, 2008 3:19 pm
Location: Cambridge, UK

Re: Core seems stuck after start WU

Post by davidcoton »

Can you post the full log, including the system details? In Advanced Control, click Refresh before Copy.
IkkeDus
Posts: 14
Joined: Wed Jun 18, 2008 10:42 am
Hardware configuration: Q9550 @ 2.8 GHz
WIN10 x64
2x Radeon R9 280X-3GB
1x Radeon R9 7950-3GB
Location: Amsterdam, The Netherlands

Re: Core seems stuck after start WU

Post by IkkeDus »

Can you elaborate on what you mean by 'Advanced Control'?
The log is posted as is. I run 7.5.1 on Windows 10.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Core seems stuck after start WU

Post by bruce »

I have to guess, but maybe you're concerned about the time between the message
10:39:56:WU04:FS03:0x22:Completed 0 out of 2000000 steps (0%)
and another line that says:
hh:mm:ss:WU04:FS03:0x22:Completed 20000 out of 2000000 steps (1%)

Depending both on your actual hardware and on the complexity of your protein, it can take quite a while to finish that first 1% ... and probably an equal time to complete the next 1%. Between those status updates, there's really nothing to say except "I'm working on it." To get to 100% of a calculation that takes a few days, you don't get updates every few minutes.
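
To put a number on "quite a while", the spacing between two "Completed N out of M steps" lines can be extrapolated to the whole WU. This is a rough Python sketch, assuming the line layout shown in this thread. The helper names are made up for illustration, the second sample timestamp is hypothetical (the thread only shows `hh:mm:ss` for it), and the estimate assumes a constant step rate with both lines on the same day.

```python
import re
from datetime import datetime

# Matches e.g. "10:39:56:WU04:FS03:0x22:Completed 0 out of 2000000 steps (0%)".
PROGRESS_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:FS\d+:0x[0-9a-fA-F]+:"
    r"Completed (\d+) out of (\d+) steps"
)

def progress_points(log_text):
    """Return a list of (time, steps_done, steps_total) from progress lines."""
    points = []
    for line in log_text.splitlines():
        m = PROGRESS_RE.match(line.strip())
        if m:
            t = datetime.strptime(m.group(1), "%H:%M:%S")
            points.append((t, int(m.group(2)), int(m.group(3))))
    return points

def estimate_remaining(points):
    """Extrapolate remaining time from the first and last progress lines.

    Returns a timedelta, or None if there are not yet two distinct points.
    """
    if len(points) < 2:
        return None
    (t0, s0, _), (t1, s1, total) = points[0], points[-1]
    if s1 <= s0:
        return None
    per_step = (t1 - t0) / (s1 - s0)   # timedelta per simulation step
    return per_step * (total - s1)

SAMPLE = """\
10:39:56:WU04:FS03:0x22:Completed 0 out of 2000000 steps (0%)
10:43:26:WU04:FS03:0x22:Completed 20000 out of 2000000 steps (1%)
"""  # second line: hypothetical timestamp, for illustration only

print(estimate_remaining(progress_points(SAMPLE)))
```

With only the 0% line in the log, the function returns None, which matches the point above: until the first 1% line appears there is nothing to extrapolate from.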
IkkeDus
Posts: 14
Joined: Wed Jun 18, 2008 10:42 am
Hardware configuration: Q9550 @ 2.8 GHz
WIN10 x64
2x Radeon R9 280X-3GB
1x Radeon R9 7950-3GB
Location: Amsterdam, The Netherlands

Re: Core seems stuck after start WU

Post by IkkeDus »

No, that is not my concern; that line is only there to show that the calculation seems to have started.
What I observe is that in several cases the calculation does not seem to run at all, and it only starts after I reboot. The log shows that: it had been sitting for 2 hours when I noticed that it was still at 0%. After the reboot the log shows progress (I wasn't focusing on the timing of the updates once it started, so I didn't include the next line).
Typically a WU takes around 4-6 hours on my R9 280X and around 10-12 hours on the 7950 I have running.
I currently have a WU from the same project (11761) running, which took around 40 s to show the first completed cycle. The updates are approx. 3.5 minutes apart, which is what I observe in most cases. So having no updates for hours seems weird...
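
As a quick sanity check on those numbers, here is a small Python sketch of the arithmetic. It assumes one progress line per 1%, as in the example lines quoted earlier in the thread; the variable names are only for illustration.

```python
from datetime import timedelta

UPDATE_INTERVAL = timedelta(minutes=3.5)  # reported spacing between updates
UPDATES_PER_WU = 100                      # assumption: one update per percent

# Projected run time for a whole WU at that pace.
full_wu = UPDATE_INTERVAL * UPDATES_PER_WU

# How many updates should have appeared during the 2 hours of silence.
silence = timedelta(hours=2)
missed_updates = silence / UPDATE_INTERVAL

print("projected WU time:", full_wu)
print("updates missed in 2 h: about", int(missed_updates))
```

The projected run time lands inside the usual 4-6 hour range for the R9 280X, while two hours without output corresponds to dozens of missed updates, so the silence is far outside normal variation.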
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Core seems stuck after start WU

Post by bruce »

Start one of the classic performance monitors that can tell you what your hardware is doing. Be sure you're using one that reports the load on the GPU and its temperature.
IkkeDus
Posts: 14
Joined: Wed Jun 18, 2008 10:42 am
Hardware configuration: Q9550 @ 2.8 GHz
WIN10 x64
2x Radeon R9 280X-3GB
1x Radeon R9 7950-3GB
Location: Amsterdam, The Netherlands

Re: Core seems stuck after start WU

Post by IkkeDus »

I have that (apart from the AMD overview, I use GPU-Z). All is normal. Temps around 60-65C. Load around 93%.
The setup is a new Win10 install. No OC, just stock. Cards are on risers with separate power supplies. In short: I don't see anything strange there.
Currently everything is running nicely. But once in a while this hiccup seems to occur.