A4 Unit Stuck on Windows 10


Nert
Posts: 162
Joined: Wed Mar 26, 2014 7:46 pm

A4 Unit Stuck on Windows 10

Post by Nert »

I'm not sure what's going on, but I seem to have a stuck work unit. Advanced Control shows my a4 unit with an ETA of 38.23 days, which seems a tad long for an i7-4790K :shock: The system runs Windows 10 and folds on both CPU and GPU. The GPU is a GTX 750 Ti, and it seems to be working fine. Here's the log following a restart:


*********************** Log Started 2015-09-14T22:26:28Z ***********************
22:26:28:************************* Folding@home Client *************************
22:26:28:      Website: http://folding.stanford.edu/
22:26:28:    Copyright: (c) 2009-2014 Stanford University
22:26:28:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
22:26:28:         Args: 
22:26:28:       Config: C:/Users/roger/AppData/Roaming/FAHClient/config.xml
22:26:28:******************************** Build ********************************
22:26:28:      Version: 7.4.4
22:26:28:         Date: Mar 4 2014
22:26:28:         Time: 20:26:54
22:26:28:      SVN Rev: 4130
22:26:28:       Branch: fah/trunk/client
22:26:28:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
22:26:28:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
22:26:28:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
22:26:28:     Platform: win32 XP
22:26:28:         Bits: 32
22:26:28:         Mode: Release
22:26:28:******************************* System ********************************
22:26:28:          CPU: Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
22:26:28:       CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
22:26:28:         CPUs: 8
22:26:28:       Memory: 15.94GiB
22:26:28:  Free Memory: 14.43GiB
22:26:28:      Threads: WINDOWS_THREADS
22:26:28:   OS Version: 6.2
22:26:28:  Has Battery: false
22:26:28:   On Battery: false
22:26:28:   UTC Offset: -5
22:26:28:          PID: 6496
22:26:28:          CWD: C:/Users/roger/AppData/Roaming/FAHClient
22:26:28:           OS: Windows 10 Home
22:26:28:      OS Arch: AMD64
22:26:28:         GPUs: 1
22:26:28:        GPU 0: NVIDIA:4 GM107 [GeForce GTX 750 Ti]
22:26:28:         CUDA: 5.0
22:26:28:  CUDA Driver: 7050
22:26:28:Win32 Service: false
22:26:28:***********************************************************************
22:26:28:<config>
22:26:28:  <!-- Folding Slot Configuration -->
22:26:28:  <cause v='ALZHEIMERS'/>
22:26:28:
22:26:28:  <!-- Network -->
22:26:28:  <proxy v=':8080'/>
22:26:28:
22:26:28:  <!-- Slot Control -->
22:26:28:  <power v='FULL'/>
22:26:28:
22:26:28:  <!-- User Information -->
22:26:28:  <passkey v='********************************'/>
22:26:28:  <team v='165780'/>
22:26:28:  <user v='nert'/>
22:26:28:
22:26:28:  <!-- Folding Slots -->
22:26:28:  <slot id='0' type='CPU'/>
22:26:28:  <slot id='1' type='GPU'/>
22:26:28:</config>
22:26:28:Trying to access database...
22:26:28:Successfully acquired database lock
22:26:28:Enabled folding slot 00: READY cpu:7
22:26:28:Enabled folding slot 01: READY gpu:0:GM107 [GeForce GTX 750 Ti]
22:26:28:WU02:FS00:Starting
22:26:28:WU02:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/roger/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 02 -suffix 01 -version 704 -lifeline 6496 -checkpoint 15 -np 7
22:26:29:WU02:FS00:Started FahCore on PID 6760
22:26:29:WU02:FS00:Core PID:6780
22:26:29:WU02:FS00:FahCore 0xa4 started
22:26:29:WU00:FS01:Starting
22:26:29:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/roger/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_18.fah/FahCore_18.exe -dir 00 -suffix 01 -version 704 -lifeline 6496 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
22:26:29:WU00:FS01:Started FahCore on PID 6180
22:26:29:WU02:FS00:0xa4:
22:26:29:WU02:FS00:0xa4:*------------------------------*
22:26:29:WU02:FS00:0xa4:Folding@Home Gromacs GB Core
22:26:29:WU02:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
22:26:29:WU02:FS00:0xa4:
22:26:29:WU02:FS00:0xa4:Preparing to commence simulation
22:26:29:WU02:FS00:0xa4:- Ensuring status. Please wait.
22:26:30:WU00:FS01:Core PID:884
22:26:30:WU00:FS01:FahCore 0x18 started
22:26:31:WU00:FS01:0x18:*********************** Log Started 2015-09-14T22:26:31Z ***********************
22:26:31:WU00:FS01:0x18:Project: 9125 (Run 20, Clone 0, Gen 44)
22:26:31:WU00:FS01:0x18:Unit: 0x0000003d0a3b1e805543ed94be02d460
22:26:31:WU00:FS01:0x18:CPU: 0x00000000000000000000000000000000
22:26:31:WU00:FS01:0x18:Machine: 1
22:26:31:WU00:FS01:0x18:Digital signatures verified
22:26:31:WU00:FS01:0x18:Folding@home GPU core18
22:26:31:WU00:FS01:0x18:Version 0.0.4
22:26:31:WU00:FS01:0x18:  Found a checkpoint file
22:26:38:WU02:FS00:0xa4:- Looking at optimizations...
22:26:38:WU02:FS00:0xa4:- Working with standard loops on this execution.
22:26:38:WU02:FS00:0xa4:- Previous termination of core was improper.
22:26:38:WU02:FS00:0xa4:- Files status OK
22:26:39:WU02:FS00:0xa4:- Expanded 1865009 -> 3294344 (decompressed 176.6 percent)
22:26:39:WU02:FS00:0xa4:Called DecompressByteArray: compressed_data_size=1865009 data_size=3294344, decompressed_data_size=3294344 diff=0
22:26:39:WU02:FS00:0xa4:- Digital signature verified
22:26:39:WU02:FS00:0xa4:
22:26:39:WU02:FS00:0xa4:Project: 7520 (Run 9, Clone 38, Gen 108)
22:26:39:WU02:FS00:0xa4:
22:26:39:WU02:FS00:0xa4:Entering M.D.
22:26:45:WU02:FS00:0xa4:Using Gromacs checkpoints
22:26:45:WU02:FS00:0xa4:Mapping NT from 7 to 7 
22:26:45:WU02:FS00:0xa4:Resuming from checkpoint
22:26:45:WU02:FS00:0xa4:Verified 02/wudata_01.log
22:26:46:WU02:FS00:0xa4:Verified 02/wudata_01.trr
22:26:46:WU02:FS00:0xa4:Verified 02/wudata_01.xtc
22:26:46:WU02:FS00:0xa4:Verified 02/wudata_01.edr
22:26:46:WU02:FS00:0xa4:Completed 1385490 out of 55000000 steps  (2%)
22:27:03:WU00:FS01:0x18:Completed 800000 out of 2500000 steps (32%)
22:27:03:WU00:FS01:0x18:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
22:33:06:WU00:FS01:0x18:Completed 825000 out of 2500000 steps (33%)
22:39:00:WU00:FS01:0x18:Completed 850000 out of 2500000 steps (34%)
22:44:51:WU00:FS01:0x18:Completed 875000 out of 2500000 steps (35%)

I did have a lockup prior to this, but I don't know if it's related. I couldn't figure out why my PPD was dropping, and now I think I know: something is definitely wrong with this work unit. How do I cancel the stuck work unit, and how do I prevent this from happening again?

Let me know what else I can provide.
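
For anyone checking a similar log, here is a minimal Python sketch of how a rough ETA can be derived from the "Completed N out of M steps" lines above. The function name `estimate_eta` and the `LOG` sample are illustrative assumptions, not part of the FAH client; the sample lines are copied from the log in this post. The CPU slot (FS00) has only one progress line after the restart, so the sketch can only estimate the GPU slot (FS01).

```python
import re
from datetime import datetime, timedelta

# Matches client log lines such as
# "22:33:06:WU00:FS01:0x18:Completed 825000 out of 2500000 steps (33%)"
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):\w+:Completed (\d+) out of (\d+) steps"
)

def estimate_eta(lines, slot):
    """Return (steps_per_second, remaining_time) for one folding slot.

    Returns None when the slot has fewer than two usable progress lines.
    Timestamps carry no date, so a log that crosses midnight is not handled.
    """
    samples = []
    for line in lines:
        m = PROGRESS.match(line)
        if m and m.group(3) == slot:
            stamp = datetime.strptime(m.group(1), "%H:%M:%S")
            samples.append((stamp, int(m.group(4)), int(m.group(5))))
    if len(samples) < 2:
        return None
    (t0, done0, _), (t1, done1, total) = samples[0], samples[-1]
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0 or done1 <= done0:
        return None
    rate = (done1 - done0) / elapsed
    return rate, timedelta(seconds=(total - done1) / rate)

# Progress lines copied from the log in this post.
LOG = [
    "22:26:46:WU02:FS00:0xa4:Completed 1385490 out of 55000000 steps  (2%)",
    "22:27:03:WU00:FS01:0x18:Completed 800000 out of 2500000 steps (32%)",
    "22:33:06:WU00:FS01:0x18:Completed 825000 out of 2500000 steps (33%)",
    "22:39:00:WU00:FS01:0x18:Completed 850000 out of 2500000 steps (34%)",
    "22:44:51:WU00:FS01:0x18:Completed 875000 out of 2500000 steps (35%)",
]

gpu = estimate_eta(LOG, "FS01")   # roughly 70 steps/s for the GPU slot
cpu = estimate_eta(LOG, "FS00")   # None: only one progress line so far
```

This is only a linear extrapolation between the first and last sample; the client's own ETA may be computed differently.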
Joe_H
Site Admin
Posts: 7854
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: A4 Unit Stuck on Windows 10

Post by Joe_H »

The Project 7520 WU is reporting the wrong number of steps. It should be 500,000 steps if I recall correctly, not 55M steps. I will report it as bad.
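
The mismatch described above could be spotted with a check along these lines. This is a minimal Python sketch, not anything the client does itself: the `EXPECTED_STEPS` table and the function name `find_step_mismatches` are hypothetical, and the 500,000 figure for Project 7520 is only the recollection quoted in this post, not an official value.

```python
import re

# Hypothetical lookup table. The 500,000 figure for project 7520 is only
# the recollection quoted in this thread, not an official value.
EXPECTED_STEPS = {7520: 500_000}

PROJECT = re.compile(r":(WU\d+):FS\d+:\w+:Project: (\d+) ")
PROGRESS = re.compile(r":(WU\d+):FS\d+:\w+:Completed \d+ out of (\d+) steps")

def find_step_mismatches(lines):
    """Return (wu, project, reported_total, expected_total) for suspect WUs."""
    projects = {}      # WU id -> project number seen earlier in the log
    mismatches = []
    for line in lines:
        m = PROJECT.search(line)
        if m:
            projects[m.group(1)] = int(m.group(2))
            continue
        m = PROGRESS.search(line)
        if m:
            wu, total = m.group(1), int(m.group(2))
            expected = EXPECTED_STEPS.get(projects.get(wu))
            if expected is not None and total != expected:
                mismatches.append((wu, projects[wu], total, expected))
    return mismatches

# Lines copied from the log earlier in this thread.
LOG = [
    "22:26:31:WU00:FS01:0x18:Project: 9125 (Run 20, Clone 0, Gen 44)",
    "22:26:39:WU02:FS00:0xa4:Project: 7520 (Run 9, Clone 38, Gen 108)",
    "22:26:46:WU02:FS00:0xa4:Completed 1385490 out of 55000000 steps  (2%)",
    "22:27:03:WU00:FS01:0x18:Completed 800000 out of 2500000 steps (32%)",
]

suspects = find_step_mismatches(LOG)
```

Projects missing from the table are skipped rather than flagged, so the check only reports cases where an expected step count is actually known.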

To get rid of it, first pause the CPU folding slot. Then, using the Configure function in FAHControl, delete the slot and save the configuration change. After a short time the client will delete the WU because there is no slot left to process it. Then recreate the CPU slot.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Nert
Posts: 162
Joined: Wed Mar 26, 2014 7:46 pm

Re: A4 Unit Stuck on Windows 10

Post by Nert »

Thanks Joe_H. Looks like it's working now. I guess I'll have to pay more attention going forward. I think I lost a couple of days of folding science.