GPU wu's hang at 99%

Posted: Sun Jan 26, 2014 3:18 pm
by Kent Irwin
I've been folding for a while now with the same hardware: 4 different machines running Win 8.1. Two of the machines are i7s with R9 270 GPUs. The other two are C2D machines with 7700-series GPUs. In the last week almost all of the GPU WUs (project 8900) have been hanging at 99% reported progress. All have the latest CCC drivers.

I have uninstalled the client software and deleted all folders. After reinstalling, I see the same behavior. Any clues?

Re: GPU wu's hang at 99%

Posted: Sun Jan 26, 2014 6:42 pm
by 7im
Driver source and version for each PC?

Re: GPU wu's hang at 99%

Posted: Sun Jan 26, 2014 7:03 pm
by Kent Irwin
13.251.0.0 for the R9s and 9.12.0.0 for the 7700s. All were installed via CCC.

Re: GPU wu's hang at 99%

Posted: Mon Jan 27, 2014 12:58 am
by PantherX
Welcome to the F@H Forum Kent Irwin,

Could you please post the log files from each of your systems which include the initial section of the log file as it contains the system configuration and F@H Settings (viewtopic.php?p=225958#p225958). Are your GPUs operating within normal temperature range or not? You could use GPU-Z (http://www.techpowerup.com/downloads/SysInfo/GPU-Z/) to help you find additional information about your GPUs.

Re: GPU wu's hang at 99%

Posted: Mon Jan 27, 2014 7:50 am
by cw6591
Hi,
Just to let you know I have the same issue, also with WUs for project 8900. The machine in question is a new build based around a 990FX/FX-8320 with an R9 270 GPU. The driver is 13.251.0.0 via CCC. This has happened twice. The first time I thought it might be an issue with the build, so I checked driver revisions, patches, etc., but all seemed OK. A reboot dropped the percentage complete from 99% to 83%; the WU processed from there and completed normally. The second happened last night, with the WU hitting 99%/3 seconds to completion and staying that way from early evening until now (07:30).

Re: GPU wu's hang at 99%

Posted: Mon Jan 27, 2014 10:06 am
by cw6591
I tried rebooting, and the WU still showed 99%/3 seconds remaining. An hour later it's now running a new WU for project 8900. However, it seems that the previous WU has vanished...
Log file

*********************** Log Started 2014-01-27T08:31:19Z ***********************
08:31:19:************************* Folding@home Client *************************
08:31:19:      Website: http://folding.stanford.edu/
08:31:19:    Copyright: (c) 2009-2013 Stanford University
08:31:19:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
08:31:19:         Args: 
08:31:19:       Config: C:/ProgramData/FAHClient/config.xml
08:31:19:******************************** Build ********************************
08:31:19:      Version: 7.3.6
08:31:19:         Date: Feb 18 2013
08:31:19:         Time: 15:25:17
08:31:19:      SVN Rev: 3923
08:31:19:       Branch: fah/trunk/client
08:31:19:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
08:31:19:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
08:31:19:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
08:31:19:     Platform: win32 XP
08:31:19:         Bits: 32
08:31:19:         Mode: Release
08:31:19:******************************* System ********************************
08:31:19:          CPU: AMD FX(tm)-8320 Eight-Core Processor
08:31:19:       CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
08:31:19:         CPUs: 8
08:31:19:       Memory: 7.96GiB
08:31:19:  Free Memory: 6.47GiB
08:31:19:      Threads: WINDOWS_THREADS
08:31:19:  Has Battery: false
08:31:19:   On Battery: false
08:31:19:   UTC offset: 0
08:31:19:          PID: 3020
08:31:19:          CWD: C:/ProgramData/FAHClient
08:31:19:           OS: Windows 7 Home Premium
08:31:19:      OS Arch: AMD64
08:31:19:         GPUs: 1
08:31:19:        GPU 0: ATI:5 Hawaii [Radeon R9 200 Series]
08:31:19:         CUDA: Not detected
08:31:19:Win32 Service: false
08:31:19:***********************************************************************
08:31:19:<config>
08:31:19:  <!-- Folding Slot Configuration -->
08:31:19:  <power v='full'/>
08:31:19:
08:31:19:  <!-- Network -->
08:31:19:  <proxy v=':8080'/>
08:31:19:
08:31:19:  <!-- User Information -->
08:31:19:  <passkey v='********************************'/>
08:31:19:  <team v='33597'/>
08:31:19:  <user v='dMrItchy'/>
08:31:19:
08:31:19:  <!-- Folding Slots -->
08:31:19:  <slot id='0' type='GPU'/>
08:31:19:  <slot id='1' type='CPU'>
08:31:19:    <cpus v='-1'/>
08:31:19:  </slot>
08:31:19:</config>

Re: GPU wu's hang at 99%

Posted: Mon Jan 27, 2014 12:41 pm
by rhavern
Is that all of the log file? This portion of the log tells what you are running, but not what happened. If you could post your entire log file in code tags, we could possibly help more.

Re: GPU wu's hang at 99%

Posted: Mon Jan 27, 2014 12:59 pm
by cw6591
as requested

*********************** Log Started 2014-01-27T08:31:19Z ***********************
08:31:19:************************* Folding@home Client *************************
08:31:19:      Website: http://folding.stanford.edu/
08:31:19:    Copyright: (c) 2009-2013 Stanford University
08:31:19:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
08:31:19:         Args: 
08:31:19:       Config: C:/ProgramData/FAHClient/config.xml
08:31:19:******************************** Build ********************************
08:31:19:      Version: 7.3.6
08:31:19:         Date: Feb 18 2013
08:31:19:         Time: 15:25:17
08:31:19:      SVN Rev: 3923
08:31:19:       Branch: fah/trunk/client
08:31:19:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
08:31:19:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
08:31:19:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
08:31:19:     Platform: win32 XP
08:31:19:         Bits: 32
08:31:19:         Mode: Release
08:31:19:******************************* System ********************************
08:31:19:          CPU: AMD FX(tm)-8320 Eight-Core Processor
08:31:19:       CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
08:31:19:         CPUs: 8
08:31:19:       Memory: 7.96GiB
08:31:19:  Free Memory: 6.47GiB
08:31:19:      Threads: WINDOWS_THREADS
08:31:19:  Has Battery: false
08:31:19:   On Battery: false
08:31:19:   UTC offset: 0
08:31:19:          PID: 3020
08:31:19:          CWD: C:/ProgramData/FAHClient
08:31:19:           OS: Windows 7 Home Premium
08:31:19:      OS Arch: AMD64
08:31:19:         GPUs: 1
08:31:19:        GPU 0: ATI:5 Hawaii [Radeon R9 200 Series]
08:31:19:         CUDA: Not detected
08:31:19:Win32 Service: false
08:31:19:***********************************************************************
08:31:19:<config>
08:31:19:  <!-- Folding Slot Configuration -->
08:31:19:  <power v='full'/>
08:31:19:
08:31:19:  <!-- Network -->
08:31:19:  <proxy v=':8080'/>
08:31:19:
08:31:19:  <!-- User Information -->
08:31:19:  <passkey v='********************************'/>
08:31:19:  <team v='33597'/>
08:31:19:  <user v='dMrItchy'/>
08:31:19:
08:31:19:  <!-- Folding Slots -->
08:31:19:  <slot id='0' type='GPU'/>
08:31:19:  <slot id='1' type='CPU'>
08:31:19:    <cpus v='-1'/>
08:31:19:  </slot>
08:31:19:</config>
08:31:19:Trying to access database...
08:31:19:Successfully acquired database lock
08:31:19:Enabled folding slot 00: READY gpu:0:Hawaii [Radeon R9 200 Series]
08:31:19:Enabled folding slot 01: READY cpu:7
08:31:19:WU00:FS00:Starting
08:31:19:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 703 -lifeline 3020 -checkpoint 15 -gpu 0 -gpu-vendor ati
08:31:19:WU00:FS00:Started FahCore on PID 4072
08:31:26:WU00:FS00:Core PID:3160
08:31:26:WU00:FS00:FahCore 0x17 started
08:31:27:WU01:FS01:Starting
08:31:27:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 01 -suffix 01 -version 703 -lifeline 3020 -checkpoint 15 -np 7
08:31:28:WU00:FS00:0x17:*********************** Log Started 2014-01-27T08:31:28Z ***********************
08:31:28:WU00:FS00:0x17:Project: 8900 (Run 706, Clone 2, Gen 79)
08:31:28:WU00:FS00:0x17:Unit: 0x00000080028c126651a6c3b297475409
08:31:28:WU00:FS00:0x17:CPU: 0x00000000000000000000000000000000
08:31:28:WU00:FS00:0x17:Machine: 0
08:31:28:WU00:FS00:0x17:Digital signatures verified
08:31:28:WU00:FS00:0x17:Folding@home GPU core17
08:31:28:WU00:FS00:0x17:Version 0.0.52
08:31:29:WU01:FS01:Started FahCore on PID 3412
08:31:30:WU01:FS01:Core PID:4140
08:31:30:WU01:FS01:FahCore 0xa4 started
08:31:31:WU01:FS01:0xa4:
08:31:31:WU01:FS01:0xa4:*------------------------------*
08:31:31:WU01:FS01:0xa4:Folding@Home Gromacs GB Core
08:31:31:WU01:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
08:31:31:WU01:FS01:0xa4:
08:31:31:WU01:FS01:0xa4:Preparing to commence simulation
08:31:31:WU01:FS01:0xa4:- Ensuring status. Please wait.
08:31:40:WU01:FS01:0xa4:- Looking at optimizations...
08:31:40:WU01:FS01:0xa4:- Working with standard loops on this execution.
08:31:40:WU01:FS01:0xa4:- Previous termination of core was improper.
08:31:40:WU01:FS01:0xa4:- Files status OK
08:31:40:WU01:FS01:0xa4:- Expanded 547445 -> 846876 (decompressed 154.6 percent)
08:31:40:WU01:FS01:0xa4:Called DecompressByteArray: compressed_data_size=547445 data_size=846876, decompressed_data_size=846876 diff=0
08:31:40:WU01:FS01:0xa4:- Digital signature verified
08:31:40:WU01:FS01:0xa4:
08:31:40:WU01:FS01:0xa4:Project: 7647 (Run 19, Clone 0, Gen 202)
08:31:40:WU01:FS01:0xa4:
08:31:40:WU01:FS01:0xa4:Entering M.D.
08:31:46:WU01:FS01:0xa4:Using Gromacs checkpoints
08:31:46:WU01:FS01:0xa4:Mapping NT from 7 to 7 
08:31:46:WU01:FS01:0xa4:Resuming from checkpoint
08:31:46:WU01:FS01:0xa4:Verified 01/wudata_01.log
08:31:48:WU01:FS01:0xa4:Verified 01/wudata_01.trr
08:31:48:WU01:FS01:0xa4:Verified 01/wudata_01.xtc
08:31:48:WU01:FS01:0xa4:Verified 01/wudata_01.edr
08:31:49:WU01:FS01:0xa4:Completed 1423500 out of 2500000 steps  (56%)
08:33:47:WU01:FS01:0xa4:Completed 1425000 out of 2500000 steps  (57%)
08:35:23:WU00:FS00:0x17:Completed 0 out of 2500000 steps (0%)
08:35:23:WU00:FS00:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
08:41:13:WU00:FS00:0x17:Completed 25000 out of 2500000 steps (1%)
08:47:03:WU00:FS00:0x17:Completed 50000 out of 2500000 steps (2%)
08:52:57:WU00:FS00:0x17:Completed 75000 out of 2500000 steps (3%)
08:58:29:WU00:FS00:0x17:Completed 100000 out of 2500000 steps (4%)
09:00:20:WU01:FS01:0xa4:Completed 1450000 out of 2500000 steps  (58%)
09:04:22:WU00:FS00:0x17:Completed 125000 out of 2500000 steps (5%)
09:09:53:WU00:FS00:0x17:Completed 150000 out of 2500000 steps (6%)
09:15:46:WU00:FS00:0x17:Completed 175000 out of 2500000 steps (7%)
09:21:17:WU00:FS00:0x17:Completed 200000 out of 2500000 steps (8%)
09:27:10:WU00:FS00:0x17:Completed 225000 out of 2500000 steps (9%)
09:27:11:WU01:FS01:0xa4:Completed 1475000 out of 2500000 steps  (59%)
09:32:43:WU00:FS00:0x17:Completed 250000 out of 2500000 steps (10%)
09:38:48:WU00:FS00:0x17:Completed 275000 out of 2500000 steps (11%)
09:44:22:WU00:FS00:0x17:Completed 300000 out of 2500000 steps (12%)
09:50:15:WU00:FS00:0x17:Completed 325000 out of 2500000 steps (13%)
09:54:16:WU01:FS01:0xa4:Completed 1500000 out of 2500000 steps  (60%)
09:55:47:WU00:FS00:0x17:Completed 350000 out of 2500000 steps (14%)
10:01:40:WU00:FS00:0x17:Completed 375000 out of 2500000 steps (15%)
10:07:14:WU00:FS00:0x17:Completed 400000 out of 2500000 steps (16%)
10:10:19:FS00:Shutting core down
10:10:19:FS01:Shutting core down
10:10:20:WU00:FS00:0x17:WARNING:Console control signal 1 on PID 3160
10:10:20:WU00:FS00:0x17:Exiting, please wait. . .
10:10:20:WU00:FS00:0x17:Lost lifeline PID 4072, exiting
10:10:20:WU00:FS00:0x17:ERROR:103: Lost client lifeline
10:10:20:WU00:FS00:0x17:Folding@home Core Shutdown: CLIENT_DIED
10:10:20:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
10:10:27:WU01:FS01:0xa4:Client no longer detected. Shutting down core 
10:10:27:WU01:FS01:0xa4:
10:10:27:WU01:FS01:0xa4:Folding@home Core Shutdown: CLIENT_DIED
10:10:28:WU01:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
10:11:20:Removing old file 'configs/config-20140123-214449.xml'
10:11:20:Saving configuration to config.xml
10:11:20:<config>
10:11:20:  <!-- Folding Slot Configuration -->
10:11:20:  <power v='idle'/>
10:11:20:
10:11:20:  <!-- Network -->
10:11:20:  <proxy v=':8080'/>
10:11:20:
10:11:20:  <!-- User Information -->
10:11:20:  <passkey v='********************************'/>
10:11:20:  <team v='33597'/>
10:11:20:  <user v='dMrItchy'/>
10:11:20:
10:11:20:  <!-- Folding Slots -->
10:11:20:  <slot id='0' type='GPU'/>
10:11:20:  <slot id='1' type='CPU'>
10:11:20:    <cpus v='-1'/>
10:11:20:  </slot>
10:11:20:</config>
10:24:02:WU01:FS01:Starting
10:24:02:WARNING:WU01:FS01:Changed SMP threads from 7 to 4 this can cause some work units to fail
10:24:02:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 01 -suffix 01 -version 703 -lifeline 3020 -checkpoint 15 -np 4
10:24:03:WU01:FS01:Started FahCore on PID 7052
10:24:03:WU01:FS01:Core PID:7068
10:24:03:WU01:FS01:FahCore 0xa4 started
10:24:04:WU01:FS01:0xa4:
10:24:04:WU01:FS01:0xa4:*------------------------------*
10:24:04:WU01:FS01:0xa4:Folding@Home Gromacs GB Core
10:24:04:WU01:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
10:24:04:WU01:FS01:0xa4:
10:24:04:WU01:FS01:0xa4:Preparing to commence simulation
10:24:04:WU01:FS01:0xa4:- Looking at optimizations...
10:24:04:WU01:FS01:0xa4:- Files status OK
10:24:04:WU01:FS01:0xa4:- Expanded 547445 -> 846876 (decompressed 154.6 percent)
10:24:04:WU01:FS01:0xa4:Called DecompressByteArray: compressed_data_size=547445 data_size=846876, decompressed_data_size=846876 diff=0
10:24:04:WU01:FS01:0xa4:- Digital signature verified
10:24:04:WU01:FS01:0xa4:
10:24:04:WU01:FS01:0xa4:Project: 7647 (Run 19, Clone 0, Gen 202)
10:24:04:WU01:FS01:0xa4:
10:24:04:WU01:FS01:0xa4:Assembly optimizations on if available.
10:24:04:WU01:FS01:0xa4:Entering M.D.
10:24:10:WU01:FS01:0xa4:Using Gromacs checkpoints
10:24:10:WU01:FS01:0xa4:Mapping NT from 4 to 4 
10:24:10:WU01:FS01:0xa4:Resuming from checkpoint
10:24:10:WU01:FS01:0xa4:Verified 01/wudata_01.log
10:24:10:WU01:FS01:0xa4:Verified 01/wudata_01.trr
10:24:10:WU01:FS01:0xa4:Verified 01/wudata_01.xtc
10:24:10:WU01:FS01:0xa4:Verified 01/wudata_01.edr
10:24:10:WU01:FS01:0xa4:Completed 1507140 out of 2500000 steps  (60%)
10:25:03:Removing old file 'configs/config-20140123-215808.xml'
10:25:03:Saving configuration to config.xml
10:25:03:<config>
10:25:03:  <!-- Folding Slot Configuration -->
10:25:03:  <power v='light'/>
10:25:03:
10:25:03:  <!-- Network -->
10:25:03:  <proxy v=':8080'/>
10:25:03:
10:25:03:  <!-- User Information -->
10:25:03:  <passkey v='********************************'/>
10:25:03:  <team v='33597'/>
10:25:03:  <user v='dMrItchy'/>
10:25:03:
10:25:03:  <!-- Folding Slots -->
10:25:03:  <slot id='0' type='GPU'/>
10:25:03:  <slot id='1' type='CPU'>
10:25:03:    <cpus v='-1'/>
10:25:03:  </slot>
10:25:03:</config>
10:39:48:FS01:Shutting core down
10:39:55:WU01:FS01:0xa4:Client no longer detected. Shutting down core 
10:39:55:WU01:FS01:0xa4:
10:39:55:WU01:FS01:0xa4:Folding@home Core Shutdown: CLIENT_DIED
10:39:55:WU01:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
10:40:49:Removing old file 'configs/config-20140124-065544.xml'
10:40:49:Saving configuration to config.xml
10:40:49:<config>
10:40:49:  <!-- Folding Slot Configuration -->
10:40:49:  <power v='idle'/>
10:40:49:
10:40:49:  <!-- Network -->
10:40:49:  <proxy v=':8080'/>
10:40:49:
10:40:49:  <!-- User Information -->
10:40:49:  <passkey v='********************************'/>
10:40:49:  <team v='33597'/>
10:40:49:  <user v='dMrItchy'/>
10:40:49:
10:40:49:  <!-- Folding Slots -->
10:40:49:  <slot id='0' type='GPU'/>
10:40:49:  <slot id='1' type='CPU'>
10:40:49:    <cpus v='-1'/>
10:40:49:  </slot>
10:40:49:</config>
10:42:54:WU01:FS01:Starting
10:42:54:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 01 -suffix 01 -version 703 -lifeline 3020 -checkpoint 15 -np 4
10:42:54:WU01:FS01:Started FahCore on PID 4464
10:42:54:WU01:FS01:Core PID:6412
10:42:54:WU01:FS01:FahCore 0xa4 started
10:42:54:WU01:FS01:0xa4:
10:42:54:WU01:FS01:0xa4:*------------------------------*
10:42:54:WU01:FS01:0xa4:Folding@Home Gromacs GB Core
10:42:54:WU01:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
10:42:54:WU01:FS01:0xa4:
10:42:54:WU01:FS01:0xa4:Preparing to commence simulation
10:42:54:WU01:FS01:0xa4:- Looking at optimizations...
10:42:54:WU01:FS01:0xa4:- Files status OK
10:42:54:WU01:FS01:0xa4:- Expanded 547445 -> 846876 (decompressed 154.6 percent)
10:42:54:WU01:FS01:0xa4:Called DecompressByteArray: compressed_data_size=547445 data_size=846876, decompressed_data_size=846876 diff=0
10:42:54:WU01:FS01:0xa4:- Digital signature verified
10:42:54:WU01:FS01:0xa4:
10:42:54:WU01:FS01:0xa4:Project: 7647 (Run 19, Clone 0, Gen 202)
10:42:54:WU01:FS01:0xa4:
10:42:54:WU01:FS01:0xa4:Assembly optimizations on if available.
10:42:54:WU01:FS01:0xa4:Entering M.D.
10:43:00:WU01:FS01:0xa4:Using Gromacs checkpoints
10:43:00:WU01:FS01:0xa4:Mapping NT from 4 to 4 
10:43:00:WU01:FS01:0xa4:Resuming from checkpoint
10:43:00:WU01:FS01:0xa4:Verified 01/wudata_01.log
10:43:00:WU01:FS01:0xa4:Verified 01/wudata_01.trr
10:43:00:WU01:FS01:0xa4:Verified 01/wudata_01.xtc
10:43:00:WU01:FS01:0xa4:Verified 01/wudata_01.edr
10:43:01:WU01:FS01:0xa4:Completed 1517320 out of 2500000 steps  (60%)
10:43:55:Removing old file 'configs/config-20140124-084136.xml'
10:43:55:Saving configuration to config.xml
10:43:55:<config>
10:43:55:  <!-- Folding Slot Configuration -->
10:43:55:  <power v='light'/>
10:43:55:
10:43:55:  <!-- Network -->
10:43:55:  <proxy v=':8080'/>
10:43:55:
10:43:55:  <!-- User Information -->
10:43:55:  <passkey v='********************************'/>
10:43:55:  <team v='33597'/>
10:43:55:  <user v='dMrItchy'/>
10:43:55:
10:43:55:  <!-- Folding Slots -->
10:43:55:  <slot id='0' type='GPU'/>
10:43:55:  <slot id='1' type='CPU'>
10:43:55:    <cpus v='-1'/>
10:43:55:  </slot>
10:43:55:</config>
10:44:08:FS01:Shutting core down
10:44:08:WU00:FS00:Starting
10:44:08:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 703 -lifeline 3020 -checkpoint 15 -gpu 0 -gpu-vendor ati
10:44:08:WU00:FS00:Started FahCore on PID 6376
10:44:09:WU00:FS00:Core PID:7120
10:44:09:WU00:FS00:FahCore 0x17 started
10:44:10:WU00:FS00:0x17:*********************** Log Started 2014-01-27T10:44:10Z ***********************
10:44:10:WU00:FS00:0x17:Project: 8900 (Run 706, Clone 2, Gen 79)
10:44:10:WU00:FS00:0x17:Unit: 0x00000080028c126651a6c3b297475409
10:44:10:WU00:FS00:0x17:CPU: 0x00000000000000000000000000000000
10:44:10:WU00:FS00:0x17:Machine: 0
10:44:10:WU00:FS00:0x17:Digital signatures verified
10:44:10:WU00:FS00:0x17:Folding@home GPU core17
10:44:10:WU00:FS00:0x17:Version 0.0.52
10:44:10:WU00:FS00:0x17:  Found a checkpoint file
10:44:14:WU01:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
10:44:15:WU01:FS01:Starting
10:44:15:WARNING:WU01:FS01:Changed SMP threads from 4 to 7 this can cause some work units to fail
10:44:15:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 01 -suffix 01 -version 703 -lifeline 3020 -checkpoint 15 -np 7
10:44:15:WU01:FS01:Started FahCore on PID 620
10:44:15:WU01:FS01:Core PID:7008
10:44:15:WU01:FS01:FahCore 0xa4 started
10:44:15:WU01:FS01:0xa4:
10:44:15:WU01:FS01:0xa4:*------------------------------*
10:44:15:WU01:FS01:0xa4:Folding@Home Gromacs GB Core
10:44:15:WU01:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
10:44:15:WU01:FS01:0xa4:
10:44:15:WU01:FS01:0xa4:Preparing to commence simulation
10:44:15:WU01:FS01:0xa4:- Looking at optimizations...
10:44:15:WU01:FS01:0xa4:- Files status OK
10:44:15:WU01:FS01:0xa4:- Expanded 547445 -> 846876 (decompressed 154.6 percent)
10:44:15:WU01:FS01:0xa4:Called DecompressByteArray: compressed_data_size=547445 data_size=846876, decompressed_data_size=846876 diff=0
10:44:15:WU01:FS01:0xa4:- Digital signature verified
10:44:15:WU01:FS01:0xa4:
10:44:15:WU01:FS01:0xa4:Project: 7647 (Run 19, Clone 0, Gen 202)
10:44:15:WU01:FS01:0xa4:
10:44:15:WU01:FS01:0xa4:Assembly optimizations on if available.
10:44:15:WU01:FS01:0xa4:Entering M.D.
10:44:21:WU01:FS01:0xa4:Using Gromacs checkpoints
10:44:21:WU01:FS01:0xa4:Mapping NT from 7 to 7 
10:44:21:WU01:FS01:0xa4:Resuming from checkpoint
10:44:21:WU01:FS01:0xa4:Verified 01/wudata_01.log
10:44:21:WU01:FS01:0xa4:Verified 01/wudata_01.trr
10:44:21:WU01:FS01:0xa4:Verified 01/wudata_01.xtc
10:44:21:WU01:FS01:0xa4:Verified 01/wudata_01.edr
10:44:21:WU01:FS01:0xa4:Completed 1517320 out of 2500000 steps  (60%)
10:45:09:Removing old file 'configs/config-20140124-095452.xml'
10:45:09:Saving configuration to config.xml
10:45:09:<config>
10:45:09:  <!-- Folding Slot Configuration -->
10:45:09:  <power v='full'/>
10:45:09:
10:45:09:  <!-- Network -->
10:45:09:  <proxy v=':8080'/>
10:45:09:
10:45:09:  <!-- User Information -->
10:45:09:  <passkey v='********************************'/>
10:45:09:  <team v='33597'/>
10:45:09:  <user v='dMrItchy'/>
10:45:09:
10:45:09:  <!-- Folding Slots -->
10:45:09:  <slot id='0' type='GPU'/>
10:45:09:  <slot id='1' type='CPU'>
10:45:09:    <cpus v='-1'/>
10:45:09:  </slot>
10:45:09:</config>
10:48:04:WU00:FS00:0x17:Completed 400000 out of 2500000 steps (16%)
10:48:04:WU00:FS00:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
10:52:29:WU01:FS01:0xa4:Completed 1525000 out of 2500000 steps  (61%)
10:53:56:WU00:FS00:0x17:Completed 425000 out of 2500000 steps (17%)
10:59:37:WU00:FS00:0x17:Completed 450000 out of 2500000 steps (18%)
11:05:35:WU00:FS00:0x17:Completed 475000 out of 2500000 steps (19%)
11:11:14:WU00:FS00:0x17:Completed 500000 out of 2500000 steps (20%)
11:17:23:WU00:FS00:0x17:Completed 525000 out of 2500000 steps (21%)
11:21:27:WU01:FS01:0xa4:Completed 1550000 out of 2500000 steps  (62%)
11:22:59:WU00:FS00:0x17:Completed 550000 out of 2500000 steps (22%)
11:28:55:WU00:FS00:0x17:Completed 575000 out of 2500000 steps (23%)
11:34:28:WU00:FS00:0x17:Completed 600000 out of 2500000 steps (24%)
11:40:22:WU00:FS00:0x17:Completed 625000 out of 2500000 steps (25%)
11:45:55:WU00:FS00:0x17:Completed 650000 out of 2500000 steps (26%)
11:49:04:WU01:FS01:0xa4:Completed 1575000 out of 2500000 steps  (63%)
11:52:08:WU00:FS00:0x17:Completed 675000 out of 2500000 steps (27%)
11:57:53:WU00:FS00:0x17:Completed 700000 out of 2500000 steps (28%)
12:03:51:WU00:FS00:0x17:Completed 725000 out of 2500000 steps (29%)
12:09:24:WU00:FS00:0x17:Completed 750000 out of 2500000 steps (30%)
12:15:23:WU00:FS00:0x17:Completed 775000 out of 2500000 steps (31%)
12:18:31:WU01:FS01:0xa4:Completed 1600000 out of 2500000 steps  (64%)
12:21:04:WU00:FS00:0x17:Completed 800000 out of 2500000 steps (32%)
12:27:07:WU00:FS00:0x17:Completed 825000 out of 2500000 steps (33%)
12:32:46:WU00:FS00:0x17:Completed 850000 out of 2500000 steps (34%)
12:38:49:WU00:FS00:0x17:Completed 875000 out of 2500000 steps (35%)
12:44:28:WU00:FS00:0x17:Completed 900000 out of 2500000 steps (36%)
12:47:15:WU01:FS01:0xa4:Completed 1625000 out of 2500000 steps  (65%)
12:50:30:WU00:FS00:0x17:Completed 925000 out of 2500000 steps (37%)
12:56:11:WU00:FS00:0x17:Completed 950000 out of 2500000 steps (38%)

Re: GPU wu's hang at 99%

Posted: Mon Jan 27, 2014 1:42 pm
by Kent Irwin
PantherX wrote: Welcome to the F@H Forum Kent Irwin,

Could you please post the log files from each of your systems which include the initial section of the log file as it contains the system configuration and F@H Settings (viewtopic.php?p=225958#p225958). Are your GPUs operating within normal temperature range or not? You could use GPU-Z (http://www.techpowerup.com/downloads/SysInfo/GPU-Z/) to help you find additional information about your GPUs.
I'm at work currently, so I can't post log info. All GPU fans are controlled manually, with temps no higher than ~51 °C.

Re: GPU wu's hang at 99%

Posted: Mon Jan 27, 2014 8:51 pm
by bruce
Both the Web Control program and the Advanced Control program show a hang at 99%, but it's very unlikely that your title is correct. When a GPU error causes the GPU to be reset, the WU hangs at whatever percentage it had reached at that time. Due to a limitation in how they're written (aka a bug), both Control programs continue to show progress until they reach 99%+.

The primary issue is determining what kind of error the GPU had so it can be prevented in the future. By default, the current beta does not detect GPU stalls but it's likely that another beta will soon detect stalls and dump the WU. GPU resets may happen because of overclocking or because you connect to your screen remotely with a utility that doesn't support using the GPU. Only by avoiding those GPU resets will you truly solve the problem.
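To illustrate the display behavior described above, here is a minimal sketch. It assumes the Control programs estimate progress by extrapolating from the last frame the core actually reported and then cap the result at 99%. The function name, its parameters, and the extrapolation rule are hypothetical and are not taken from the client's source.

```python
def displayed_percent(last_reported_percent, seconds_since_report, seconds_per_percent):
    """Hypothetical estimate of what a control program shows.

    last_reported_percent: last percentage the core really logged
    seconds_since_report:  time elapsed since that log line
    seconds_per_percent:   typical time the WU needed per 1% so far
    """
    # Extrapolate linearly, as if the core were still making progress.
    extrapolated = last_reported_percent + seconds_since_report / seconds_per_percent
    # Never claim completion without a real report from the core.
    return min(extrapolated, 99.0)


# A core that really stopped at 83%, with roughly 350 s per 1%
# (about the frame time seen in the log earlier in this thread):
print(displayed_percent(83, 0, 350))      # right after the last real frame
print(displayed_percent(83, 7200, 350))   # two hours later
```

Under these assumptions the display climbs to 99% and sits there even though the core stopped at 83%, which would match the report above of a reboot dropping the WU from 99% back to 83%.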

Re: GPU wu's hang at 99%

Posted: Tue Jan 28, 2014 8:32 am
by cw6591
Interesting points there, but I think none apply to my situation. All clock speeds are stock, I'm not using any method of remote access for this machine, and I'm not running the beta version of the software.

Re: GPU wu's hang at 99%

Posted: Tue Jan 28, 2014 2:30 pm
by codysluder
Did Windows report an "event" around the time it stopped working?

Re: GPU wu's hang at 99%

Posted: Wed Jan 29, 2014 12:18 am
by Kent Irwin
Again, I have 4 boxes folding with AMD GPUs that all started experiencing the hang about a week ago. None reported any coincident Windows events. All are adequately cooled. According to the logs, they simply stop processing work. I just now had to restart a client because it had stalled according to the log. Once restarted, it resumed processing the GPU WU (8900). 4 different machines: 2 i7s and 2 C2Ds. All running Win 8.1. All using AMD GPUs and CCC drivers. A/V is not scanning the FAH folders. All started to experience the problem at about the same point in time. The logs report no errors other than forced shutdowns when I have to stop/restart the client to get past the hang.

If it were one machine that would be one thing. Four machines doing the same thing after weeks of normal performance makes me wonder.
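Since the percentage shown by the Control programs can't be trusted here, one way to spot a stall is to look at the "Completed N out of M steps" lines in the log itself. The following is a rough sketch, not a client feature: the function names, the median-gap heuristic, and the factor of 3 are all assumptions chosen for illustration. It only relies on the log line format shown earlier in this thread.

```python
import re

# Matches lines such as:
# 08:41:13:WU00:FS00:0x17:Completed 25000 out of 2500000 steps (1%)
PROGRESS = re.compile(
    r"^(\d\d):(\d\d):(\d\d):WU\d+:FS(\d+):0x\w+:"
    r"Completed (\d+) out of (\d+) steps\s+\((\d+)%\)"
)


def progress_times(log_lines, slot):
    """Seconds-of-day of every progress line logged by one folding slot."""
    times = []
    for line in log_lines:
        m = PROGRESS.match(line)
        if m and int(m.group(4)) == slot:
            hours, minutes, seconds = (int(m.group(i)) for i in (1, 2, 3))
            times.append(hours * 3600 + minutes * 60 + seconds)
    return times


def looks_stalled(log_lines, slot, now_seconds, factor=3):
    """True if the slot has been silent much longer than its usual frame time."""
    times = progress_times(log_lines, slot)
    if len(times) < 2:
        return False  # not enough frames to know what "usual" is
    # Log timestamps carry no date, so wrap differences around midnight.
    gaps = [(later - earlier) % 86400 for earlier, later in zip(times, times[1:])]
    typical = sorted(gaps)[len(gaps) // 2]  # median-style gap ignores restart pauses
    idle = (now_seconds - times[-1]) % 86400
    return idle > factor * typical


SAMPLE = [
    "08:41:13:WU00:FS00:0x17:Completed 25000 out of 2500000 steps (1%)",
    "08:47:03:WU00:FS00:0x17:Completed 50000 out of 2500000 steps (2%)",
    "08:52:57:WU00:FS00:0x17:Completed 75000 out of 2500000 steps (3%)",
    "09:00:20:WU01:FS01:0xa4:Completed 1450000 out of 2500000 steps  (58%)",
]

print(looks_stalled(SAMPLE, slot=0, now_seconds=9 * 3600))   # checked at 09:00:00
print(looks_stalled(SAMPLE, slot=0, now_seconds=10 * 3600))  # checked at 10:00:00
```

With the sample lines, the GPU slot is about 7 minutes past its last frame at 09:00:00, which is within the normal range, but more than an hour past it at 10:00:00, which the heuristic flags. A real check would read the client's log file and use the current UTC time, since the log timestamps are in UTC.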

Re: GPU wu's hang at 99%

Posted: Wed Jan 29, 2014 4:10 am
by PantherX
It seems that the common factor between all of these issues is Windows 8.1, right? I assume that everyone facing this issue is using the latest WHQL Drivers from AMD's site.

Re: GPU wu's hang at 99%

Posted: Wed Jan 29, 2014 4:54 am
by bruce
I have little doubt that Win8/8.1 has made it worse, but I've had it happen twice on Win7 in the last several years.