GPU wu's hang at 99%

It seems that a lot of GPU problems revolve around specific versions of drivers. Though AMD has their own support structure, you can often learn from information reported by others who fold.

Kent Irwin
Posts: 22
Joined: Sun Jan 26, 2014 1:44 pm

GPU wu's hang at 99%

Post by Kent Irwin »

I've been folding for a while now with the same hardware: 4 different machines running Win 8.1. Two of the machines are i7s with R9 270 GPUs. The other two are C2D machines with 7700 series GPUs. In the last week almost all of the GPU WUs (project 8900) have been hanging at 99% reported progress. All have the latest CCC drivers.

I have uninstalled the client software and deleted all folders. After reinstalling, the behavior is the same. Any clues?
7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: GPU wu's hang at 99%

Post by 7im »

Driver source and version for each PC?
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Kent Irwin
Posts: 22
Joined: Sun Jan 26, 2014 1:44 pm

Re: GPU wu's hang at 99%

Post by Kent Irwin »

13.251.0.0 for the R9s and 9.12.0.0 for the 7700s. All were installed via CCC.
PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: GPU wu's hang at 99%

Post by PantherX »

Welcome to the F@H Forum Kent Irwin,

Could you please post the log files from each of your systems, including the initial section of each log, since it contains the system configuration and F@H settings (viewtopic.php?p=225958#p225958)? Are your GPUs operating within their normal temperature range? You could use GPU-Z (http://www.techpowerup.com/downloads/SysInfo/GPU-Z/) to find additional information about your GPUs.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
cw6591
Posts: 5
Joined: Mon Jan 27, 2014 7:06 am

Re: GPU wu's hang at 99%

Post by cw6591 »

Hi,
Just to let you know that I have the same issue, also with WUs for project 8900. The machine in question is a new build based around a 990FX/FX-8320 with an R9 270 GPU. The driver is 13.251.0.0 via CCC. This has happened twice. The first time I thought it might be an issue with the build, so I checked driver revisions, patches, etc., but all seemed OK. A reboot took the percentage complete from 99% back to 83%, and the WU processed from there and completed normally. The second time was last night, with the WU hitting 99% / 3 seconds to completion and staying that way from early evening until now (07:30).
cw6591
Posts: 5
Joined: Mon Jan 27, 2014 7:06 am

Re: GPU wu's hang at 99%

Post by cw6591 »

I tried rebooting, and the WU still showed 99% / 3 seconds remaining. An hour later it's now running a new WU for project 8900. However, it seems that the previous WU has vanished...
Log file:

*********************** Log Started 2014-01-27T08:31:19Z ***********************
08:31:19:************************* Folding@home Client *************************
08:31:19:      Website: http://folding.stanford.edu/
08:31:19:    Copyright: (c) 2009-2013 Stanford University
08:31:19:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
08:31:19:         Args: 
08:31:19:       Config: C:/ProgramData/FAHClient/config.xml
08:31:19:******************************** Build ********************************
08:31:19:      Version: 7.3.6
08:31:19:         Date: Feb 18 2013
08:31:19:         Time: 15:25:17
08:31:19:      SVN Rev: 3923
08:31:19:       Branch: fah/trunk/client
08:31:19:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
08:31:19:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
08:31:19:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
08:31:19:     Platform: win32 XP
08:31:19:         Bits: 32
08:31:19:         Mode: Release
08:31:19:******************************* System ********************************
08:31:19:          CPU: AMD FX(tm)-8320 Eight-Core Processor
08:31:19:       CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
08:31:19:         CPUs: 8
08:31:19:       Memory: 7.96GiB
08:31:19:  Free Memory: 6.47GiB
08:31:19:      Threads: WINDOWS_THREADS
08:31:19:  Has Battery: false
08:31:19:   On Battery: false
08:31:19:   UTC offset: 0
08:31:19:          PID: 3020
08:31:19:          CWD: C:/ProgramData/FAHClient
08:31:19:           OS: Windows 7 Home Premium
08:31:19:      OS Arch: AMD64
08:31:19:         GPUs: 1
08:31:19:        GPU 0: ATI:5 Hawaii [Radeon R9 200 Series]
08:31:19:         CUDA: Not detected
08:31:19:Win32 Service: false
08:31:19:***********************************************************************
08:31:19:<config>
08:31:19:  <!-- Folding Slot Configuration -->
08:31:19:  <power v='full'/>
08:31:19:
08:31:19:  <!-- Network -->
08:31:19:  <proxy v=':8080'/>
08:31:19:
08:31:19:  <!-- User Information -->
08:31:19:  <passkey v='********************************'/>
08:31:19:  <team v='33597'/>
08:31:19:  <user v='dMrItchy'/>
08:31:19:
08:31:19:  <!-- Folding Slots -->
08:31:19:  <slot id='0' type='GPU'/>
08:31:19:  <slot id='1' type='CPU'>
08:31:19:    <cpus v='-1'/>
08:31:19:  </slot>
08:31:19:</config>
rhavern
Posts: 425
Joined: Mon Dec 03, 2007 8:45 am
Location: UK

Re: GPU wu's hang at 99%

Post by rhavern »

Is that all of the log file? This portion of the log tells what you are running, but not what happened. If you could post your entire log file in code tags, we could possibly help more.
Folding since 1 WU=1 point
cw6591
Posts: 5
Joined: Mon Jan 27, 2014 7:06 am

Re: GPU wu's hang at 99%

Post by cw6591 »

As requested:

*********************** Log Started 2014-01-27T08:31:19Z ***********************
08:31:19:************************* Folding@home Client *************************
08:31:19:      Website: http://folding.stanford.edu/
08:31:19:    Copyright: (c) 2009-2013 Stanford University
08:31:19:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
08:31:19:         Args: 
08:31:19:       Config: C:/ProgramData/FAHClient/config.xml
08:31:19:******************************** Build ********************************
08:31:19:      Version: 7.3.6
08:31:19:         Date: Feb 18 2013
08:31:19:         Time: 15:25:17
08:31:19:      SVN Rev: 3923
08:31:19:       Branch: fah/trunk/client
08:31:19:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
08:31:19:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
08:31:19:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
08:31:19:     Platform: win32 XP
08:31:19:         Bits: 32
08:31:19:         Mode: Release
08:31:19:******************************* System ********************************
08:31:19:          CPU: AMD FX(tm)-8320 Eight-Core Processor
08:31:19:       CPU ID: AuthenticAMD Family 21 Model 2 Stepping 0
08:31:19:         CPUs: 8
08:31:19:       Memory: 7.96GiB
08:31:19:  Free Memory: 6.47GiB
08:31:19:      Threads: WINDOWS_THREADS
08:31:19:  Has Battery: false
08:31:19:   On Battery: false
08:31:19:   UTC offset: 0
08:31:19:          PID: 3020
08:31:19:          CWD: C:/ProgramData/FAHClient
08:31:19:           OS: Windows 7 Home Premium
08:31:19:      OS Arch: AMD64
08:31:19:         GPUs: 1
08:31:19:        GPU 0: ATI:5 Hawaii [Radeon R9 200 Series]
08:31:19:         CUDA: Not detected
08:31:19:Win32 Service: false
08:31:19:***********************************************************************
08:31:19:<config>
08:31:19:  <!-- Folding Slot Configuration -->
08:31:19:  <power v='full'/>
08:31:19:
08:31:19:  <!-- Network -->
08:31:19:  <proxy v=':8080'/>
08:31:19:
08:31:19:  <!-- User Information -->
08:31:19:  <passkey v='********************************'/>
08:31:19:  <team v='33597'/>
08:31:19:  <user v='dMrItchy'/>
08:31:19:
08:31:19:  <!-- Folding Slots -->
08:31:19:  <slot id='0' type='GPU'/>
08:31:19:  <slot id='1' type='CPU'>
08:31:19:    <cpus v='-1'/>
08:31:19:  </slot>
08:31:19:</config>
08:31:19:Trying to access database...
08:31:19:Successfully acquired database lock
08:31:19:Enabled folding slot 00: READY gpu:0:Hawaii [Radeon R9 200 Series]
08:31:19:Enabled folding slot 01: READY cpu:7
08:31:19:WU00:FS00:Starting
08:31:19:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 703 -lifeline 3020 -checkpoint 15 -gpu 0 -gpu-vendor ati
08:31:19:WU00:FS00:Started FahCore on PID 4072
08:31:26:WU00:FS00:Core PID:3160
08:31:26:WU00:FS00:FahCore 0x17 started
08:31:27:WU01:FS01:Starting
08:31:27:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 01 -suffix 01 -version 703 -lifeline 3020 -checkpoint 15 -np 7
08:31:28:WU00:FS00:0x17:*********************** Log Started 2014-01-27T08:31:28Z ***********************
08:31:28:WU00:FS00:0x17:Project: 8900 (Run 706, Clone 2, Gen 79)
08:31:28:WU00:FS00:0x17:Unit: 0x00000080028c126651a6c3b297475409
08:31:28:WU00:FS00:0x17:CPU: 0x00000000000000000000000000000000
08:31:28:WU00:FS00:0x17:Machine: 0
08:31:28:WU00:FS00:0x17:Digital signatures verified
08:31:28:WU00:FS00:0x17:Folding@home GPU core17
08:31:28:WU00:FS00:0x17:Version 0.0.52
08:31:29:WU01:FS01:Started FahCore on PID 3412
08:31:30:WU01:FS01:Core PID:4140
08:31:30:WU01:FS01:FahCore 0xa4 started
08:31:31:WU01:FS01:0xa4:
08:31:31:WU01:FS01:0xa4:*------------------------------*
08:31:31:WU01:FS01:0xa4:Folding@Home Gromacs GB Core
08:31:31:WU01:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
08:31:31:WU01:FS01:0xa4:
08:31:31:WU01:FS01:0xa4:Preparing to commence simulation
08:31:31:WU01:FS01:0xa4:- Ensuring status. Please wait.
08:31:40:WU01:FS01:0xa4:- Looking at optimizations...
08:31:40:WU01:FS01:0xa4:- Working with standard loops on this execution.
08:31:40:WU01:FS01:0xa4:- Previous termination of core was improper.
08:31:40:WU01:FS01:0xa4:- Files status OK
08:31:40:WU01:FS01:0xa4:- Expanded 547445 -> 846876 (decompressed 154.6 percent)
08:31:40:WU01:FS01:0xa4:Called DecompressByteArray: compressed_data_size=547445 data_size=846876, decompressed_data_size=846876 diff=0
08:31:40:WU01:FS01:0xa4:- Digital signature verified
08:31:40:WU01:FS01:0xa4:
08:31:40:WU01:FS01:0xa4:Project: 7647 (Run 19, Clone 0, Gen 202)
08:31:40:WU01:FS01:0xa4:
08:31:40:WU01:FS01:0xa4:Entering M.D.
08:31:46:WU01:FS01:0xa4:Using Gromacs checkpoints
08:31:46:WU01:FS01:0xa4:Mapping NT from 7 to 7 
08:31:46:WU01:FS01:0xa4:Resuming from checkpoint
08:31:46:WU01:FS01:0xa4:Verified 01/wudata_01.log
08:31:48:WU01:FS01:0xa4:Verified 01/wudata_01.trr
08:31:48:WU01:FS01:0xa4:Verified 01/wudata_01.xtc
08:31:48:WU01:FS01:0xa4:Verified 01/wudata_01.edr
08:31:49:WU01:FS01:0xa4:Completed 1423500 out of 2500000 steps  (56%)
08:33:47:WU01:FS01:0xa4:Completed 1425000 out of 2500000 steps  (57%)
08:35:23:WU00:FS00:0x17:Completed 0 out of 2500000 steps (0%)
08:35:23:WU00:FS00:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
08:41:13:WU00:FS00:0x17:Completed 25000 out of 2500000 steps (1%)
08:47:03:WU00:FS00:0x17:Completed 50000 out of 2500000 steps (2%)
08:52:57:WU00:FS00:0x17:Completed 75000 out of 2500000 steps (3%)
08:58:29:WU00:FS00:0x17:Completed 100000 out of 2500000 steps (4%)
09:00:20:WU01:FS01:0xa4:Completed 1450000 out of 2500000 steps  (58%)
09:04:22:WU00:FS00:0x17:Completed 125000 out of 2500000 steps (5%)
09:09:53:WU00:FS00:0x17:Completed 150000 out of 2500000 steps (6%)
09:15:46:WU00:FS00:0x17:Completed 175000 out of 2500000 steps (7%)
09:21:17:WU00:FS00:0x17:Completed 200000 out of 2500000 steps (8%)
09:27:10:WU00:FS00:0x17:Completed 225000 out of 2500000 steps (9%)
09:27:11:WU01:FS01:0xa4:Completed 1475000 out of 2500000 steps  (59%)
09:32:43:WU00:FS00:0x17:Completed 250000 out of 2500000 steps (10%)
09:38:48:WU00:FS00:0x17:Completed 275000 out of 2500000 steps (11%)
09:44:22:WU00:FS00:0x17:Completed 300000 out of 2500000 steps (12%)
09:50:15:WU00:FS00:0x17:Completed 325000 out of 2500000 steps (13%)
09:54:16:WU01:FS01:0xa4:Completed 1500000 out of 2500000 steps  (60%)
09:55:47:WU00:FS00:0x17:Completed 350000 out of 2500000 steps (14%)
10:01:40:WU00:FS00:0x17:Completed 375000 out of 2500000 steps (15%)
10:07:14:WU00:FS00:0x17:Completed 400000 out of 2500000 steps (16%)
10:10:19:FS00:Shutting core down
10:10:19:FS01:Shutting core down
10:10:20:WU00:FS00:0x17:WARNING:Console control signal 1 on PID 3160
10:10:20:WU00:FS00:0x17:Exiting, please wait. . .
10:10:20:WU00:FS00:0x17:Lost lifeline PID 4072, exiting
10:10:20:WU00:FS00:0x17:ERROR:103: Lost client lifeline
10:10:20:WU00:FS00:0x17:Folding@home Core Shutdown: CLIENT_DIED
10:10:20:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
10:10:27:WU01:FS01:0xa4:Client no longer detected. Shutting down core 
10:10:27:WU01:FS01:0xa4:
10:10:27:WU01:FS01:0xa4:Folding@home Core Shutdown: CLIENT_DIED
10:10:28:WU01:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
10:11:20:Removing old file 'configs/config-20140123-214449.xml'
10:11:20:Saving configuration to config.xml
10:11:20:<config>
10:11:20:  <!-- Folding Slot Configuration -->
10:11:20:  <power v='idle'/>
10:11:20:
10:11:20:  <!-- Network -->
10:11:20:  <proxy v=':8080'/>
10:11:20:
10:11:20:  <!-- User Information -->
10:11:20:  <passkey v='********************************'/>
10:11:20:  <team v='33597'/>
10:11:20:  <user v='dMrItchy'/>
10:11:20:
10:11:20:  <!-- Folding Slots -->
10:11:20:  <slot id='0' type='GPU'/>
10:11:20:  <slot id='1' type='CPU'>
10:11:20:    <cpus v='-1'/>
10:11:20:  </slot>
10:11:20:</config>
10:24:02:WU01:FS01:Starting
10:24:02:WARNING:WU01:FS01:Changed SMP threads from 7 to 4 this can cause some work units to fail
10:24:02:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 01 -suffix 01 -version 703 -lifeline 3020 -checkpoint 15 -np 4
10:24:03:WU01:FS01:Started FahCore on PID 7052
10:24:03:WU01:FS01:Core PID:7068
10:24:03:WU01:FS01:FahCore 0xa4 started
10:24:04:WU01:FS01:0xa4:
10:24:04:WU01:FS01:0xa4:*------------------------------*
10:24:04:WU01:FS01:0xa4:Folding@Home Gromacs GB Core
10:24:04:WU01:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
10:24:04:WU01:FS01:0xa4:
10:24:04:WU01:FS01:0xa4:Preparing to commence simulation
10:24:04:WU01:FS01:0xa4:- Looking at optimizations...
10:24:04:WU01:FS01:0xa4:- Files status OK
10:24:04:WU01:FS01:0xa4:- Expanded 547445 -> 846876 (decompressed 154.6 percent)
10:24:04:WU01:FS01:0xa4:Called DecompressByteArray: compressed_data_size=547445 data_size=846876, decompressed_data_size=846876 diff=0
10:24:04:WU01:FS01:0xa4:- Digital signature verified
10:24:04:WU01:FS01:0xa4:
10:24:04:WU01:FS01:0xa4:Project: 7647 (Run 19, Clone 0, Gen 202)
10:24:04:WU01:FS01:0xa4:
10:24:04:WU01:FS01:0xa4:Assembly optimizations on if available.
10:24:04:WU01:FS01:0xa4:Entering M.D.
10:24:10:WU01:FS01:0xa4:Using Gromacs checkpoints
10:24:10:WU01:FS01:0xa4:Mapping NT from 4 to 4 
10:24:10:WU01:FS01:0xa4:Resuming from checkpoint
10:24:10:WU01:FS01:0xa4:Verified 01/wudata_01.log
10:24:10:WU01:FS01:0xa4:Verified 01/wudata_01.trr
10:24:10:WU01:FS01:0xa4:Verified 01/wudata_01.xtc
10:24:10:WU01:FS01:0xa4:Verified 01/wudata_01.edr
10:24:10:WU01:FS01:0xa4:Completed 1507140 out of 2500000 steps  (60%)
10:25:03:Removing old file 'configs/config-20140123-215808.xml'
10:25:03:Saving configuration to config.xml
10:25:03:<config>
10:25:03:  <!-- Folding Slot Configuration -->
10:25:03:  <power v='light'/>
10:25:03:
10:25:03:  <!-- Network -->
10:25:03:  <proxy v=':8080'/>
10:25:03:
10:25:03:  <!-- User Information -->
10:25:03:  <passkey v='********************************'/>
10:25:03:  <team v='33597'/>
10:25:03:  <user v='dMrItchy'/>
10:25:03:
10:25:03:  <!-- Folding Slots -->
10:25:03:  <slot id='0' type='GPU'/>
10:25:03:  <slot id='1' type='CPU'>
10:25:03:    <cpus v='-1'/>
10:25:03:  </slot>
10:25:03:</config>
10:39:48:FS01:Shutting core down
10:39:55:WU01:FS01:0xa4:Client no longer detected. Shutting down core 
10:39:55:WU01:FS01:0xa4:
10:39:55:WU01:FS01:0xa4:Folding@home Core Shutdown: CLIENT_DIED
10:39:55:WU01:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
10:40:49:Removing old file 'configs/config-20140124-065544.xml'
10:40:49:Saving configuration to config.xml
10:40:49:<config>
10:40:49:  <!-- Folding Slot Configuration -->
10:40:49:  <power v='idle'/>
10:40:49:
10:40:49:  <!-- Network -->
10:40:49:  <proxy v=':8080'/>
10:40:49:
10:40:49:  <!-- User Information -->
10:40:49:  <passkey v='********************************'/>
10:40:49:  <team v='33597'/>
10:40:49:  <user v='dMrItchy'/>
10:40:49:
10:40:49:  <!-- Folding Slots -->
10:40:49:  <slot id='0' type='GPU'/>
10:40:49:  <slot id='1' type='CPU'>
10:40:49:    <cpus v='-1'/>
10:40:49:  </slot>
10:40:49:</config>
10:42:54:WU01:FS01:Starting
10:42:54:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 01 -suffix 01 -version 703 -lifeline 3020 -checkpoint 15 -np 4
10:42:54:WU01:FS01:Started FahCore on PID 4464
10:42:54:WU01:FS01:Core PID:6412
10:42:54:WU01:FS01:FahCore 0xa4 started
10:42:54:WU01:FS01:0xa4:
10:42:54:WU01:FS01:0xa4:*------------------------------*
10:42:54:WU01:FS01:0xa4:Folding@Home Gromacs GB Core
10:42:54:WU01:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
10:42:54:WU01:FS01:0xa4:
10:42:54:WU01:FS01:0xa4:Preparing to commence simulation
10:42:54:WU01:FS01:0xa4:- Looking at optimizations...
10:42:54:WU01:FS01:0xa4:- Files status OK
10:42:54:WU01:FS01:0xa4:- Expanded 547445 -> 846876 (decompressed 154.6 percent)
10:42:54:WU01:FS01:0xa4:Called DecompressByteArray: compressed_data_size=547445 data_size=846876, decompressed_data_size=846876 diff=0
10:42:54:WU01:FS01:0xa4:- Digital signature verified
10:42:54:WU01:FS01:0xa4:
10:42:54:WU01:FS01:0xa4:Project: 7647 (Run 19, Clone 0, Gen 202)
10:42:54:WU01:FS01:0xa4:
10:42:54:WU01:FS01:0xa4:Assembly optimizations on if available.
10:42:54:WU01:FS01:0xa4:Entering M.D.
10:43:00:WU01:FS01:0xa4:Using Gromacs checkpoints
10:43:00:WU01:FS01:0xa4:Mapping NT from 4 to 4 
10:43:00:WU01:FS01:0xa4:Resuming from checkpoint
10:43:00:WU01:FS01:0xa4:Verified 01/wudata_01.log
10:43:00:WU01:FS01:0xa4:Verified 01/wudata_01.trr
10:43:00:WU01:FS01:0xa4:Verified 01/wudata_01.xtc
10:43:00:WU01:FS01:0xa4:Verified 01/wudata_01.edr
10:43:01:WU01:FS01:0xa4:Completed 1517320 out of 2500000 steps  (60%)
10:43:55:Removing old file 'configs/config-20140124-084136.xml'
10:43:55:Saving configuration to config.xml
10:43:55:<config>
10:43:55:  <!-- Folding Slot Configuration -->
10:43:55:  <power v='light'/>
10:43:55:
10:43:55:  <!-- Network -->
10:43:55:  <proxy v=':8080'/>
10:43:55:
10:43:55:  <!-- User Information -->
10:43:55:  <passkey v='********************************'/>
10:43:55:  <team v='33597'/>
10:43:55:  <user v='dMrItchy'/>
10:43:55:
10:43:55:  <!-- Folding Slots -->
10:43:55:  <slot id='0' type='GPU'/>
10:43:55:  <slot id='1' type='CPU'>
10:43:55:    <cpus v='-1'/>
10:43:55:  </slot>
10:43:55:</config>
10:44:08:FS01:Shutting core down
10:44:08:WU00:FS00:Starting
10:44:08:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_17.fah/FahCore_17.exe -dir 00 -suffix 01 -version 703 -lifeline 3020 -checkpoint 15 -gpu 0 -gpu-vendor ati
10:44:08:WU00:FS00:Started FahCore on PID 6376
10:44:09:WU00:FS00:Core PID:7120
10:44:09:WU00:FS00:FahCore 0x17 started
10:44:10:WU00:FS00:0x17:*********************** Log Started 2014-01-27T10:44:10Z ***********************
10:44:10:WU00:FS00:0x17:Project: 8900 (Run 706, Clone 2, Gen 79)
10:44:10:WU00:FS00:0x17:Unit: 0x00000080028c126651a6c3b297475409
10:44:10:WU00:FS00:0x17:CPU: 0x00000000000000000000000000000000
10:44:10:WU00:FS00:0x17:Machine: 0
10:44:10:WU00:FS00:0x17:Digital signatures verified
10:44:10:WU00:FS00:0x17:Folding@home GPU core17
10:44:10:WU00:FS00:0x17:Version 0.0.52
10:44:10:WU00:FS00:0x17:  Found a checkpoint file
10:44:14:WU01:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
10:44:15:WU01:FS01:Starting
10:44:15:WARNING:WU01:FS01:Changed SMP threads from 4 to 7 this can cause some work units to fail
10:44:15:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 01 -suffix 01 -version 703 -lifeline 3020 -checkpoint 15 -np 7
10:44:15:WU01:FS01:Started FahCore on PID 620
10:44:15:WU01:FS01:Core PID:7008
10:44:15:WU01:FS01:FahCore 0xa4 started
10:44:15:WU01:FS01:0xa4:
10:44:15:WU01:FS01:0xa4:*------------------------------*
10:44:15:WU01:FS01:0xa4:Folding@Home Gromacs GB Core
10:44:15:WU01:FS01:0xa4:Version 2.27 (Dec. 15, 2010)
10:44:15:WU01:FS01:0xa4:
10:44:15:WU01:FS01:0xa4:Preparing to commence simulation
10:44:15:WU01:FS01:0xa4:- Looking at optimizations...
10:44:15:WU01:FS01:0xa4:- Files status OK
10:44:15:WU01:FS01:0xa4:- Expanded 547445 -> 846876 (decompressed 154.6 percent)
10:44:15:WU01:FS01:0xa4:Called DecompressByteArray: compressed_data_size=547445 data_size=846876, decompressed_data_size=846876 diff=0
10:44:15:WU01:FS01:0xa4:- Digital signature verified
10:44:15:WU01:FS01:0xa4:
10:44:15:WU01:FS01:0xa4:Project: 7647 (Run 19, Clone 0, Gen 202)
10:44:15:WU01:FS01:0xa4:
10:44:15:WU01:FS01:0xa4:Assembly optimizations on if available.
10:44:15:WU01:FS01:0xa4:Entering M.D.
10:44:21:WU01:FS01:0xa4:Using Gromacs checkpoints
10:44:21:WU01:FS01:0xa4:Mapping NT from 7 to 7 
10:44:21:WU01:FS01:0xa4:Resuming from checkpoint
10:44:21:WU01:FS01:0xa4:Verified 01/wudata_01.log
10:44:21:WU01:FS01:0xa4:Verified 01/wudata_01.trr
10:44:21:WU01:FS01:0xa4:Verified 01/wudata_01.xtc
10:44:21:WU01:FS01:0xa4:Verified 01/wudata_01.edr
10:44:21:WU01:FS01:0xa4:Completed 1517320 out of 2500000 steps  (60%)
10:45:09:Removing old file 'configs/config-20140124-095452.xml'
10:45:09:Saving configuration to config.xml
10:45:09:<config>
10:45:09:  <!-- Folding Slot Configuration -->
10:45:09:  <power v='full'/>
10:45:09:
10:45:09:  <!-- Network -->
10:45:09:  <proxy v=':8080'/>
10:45:09:
10:45:09:  <!-- User Information -->
10:45:09:  <passkey v='********************************'/>
10:45:09:  <team v='33597'/>
10:45:09:  <user v='dMrItchy'/>
10:45:09:
10:45:09:  <!-- Folding Slots -->
10:45:09:  <slot id='0' type='GPU'/>
10:45:09:  <slot id='1' type='CPU'>
10:45:09:    <cpus v='-1'/>
10:45:09:  </slot>
10:45:09:</config>
10:48:04:WU00:FS00:0x17:Completed 400000 out of 2500000 steps (16%)
10:48:04:WU00:FS00:0x17:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
10:52:29:WU01:FS01:0xa4:Completed 1525000 out of 2500000 steps  (61%)
10:53:56:WU00:FS00:0x17:Completed 425000 out of 2500000 steps (17%)
10:59:37:WU00:FS00:0x17:Completed 450000 out of 2500000 steps (18%)
11:05:35:WU00:FS00:0x17:Completed 475000 out of 2500000 steps (19%)
11:11:14:WU00:FS00:0x17:Completed 500000 out of 2500000 steps (20%)
11:17:23:WU00:FS00:0x17:Completed 525000 out of 2500000 steps (21%)
11:21:27:WU01:FS01:0xa4:Completed 1550000 out of 2500000 steps  (62%)
11:22:59:WU00:FS00:0x17:Completed 550000 out of 2500000 steps (22%)
11:28:55:WU00:FS00:0x17:Completed 575000 out of 2500000 steps (23%)
11:34:28:WU00:FS00:0x17:Completed 600000 out of 2500000 steps (24%)
11:40:22:WU00:FS00:0x17:Completed 625000 out of 2500000 steps (25%)
11:45:55:WU00:FS00:0x17:Completed 650000 out of 2500000 steps (26%)
11:49:04:WU01:FS01:0xa4:Completed 1575000 out of 2500000 steps  (63%)
11:52:08:WU00:FS00:0x17:Completed 675000 out of 2500000 steps (27%)
11:57:53:WU00:FS00:0x17:Completed 700000 out of 2500000 steps (28%)
12:03:51:WU00:FS00:0x17:Completed 725000 out of 2500000 steps (29%)
12:09:24:WU00:FS00:0x17:Completed 750000 out of 2500000 steps (30%)
12:15:23:WU00:FS00:0x17:Completed 775000 out of 2500000 steps (31%)
12:18:31:WU01:FS01:0xa4:Completed 1600000 out of 2500000 steps  (64%)
12:21:04:WU00:FS00:0x17:Completed 800000 out of 2500000 steps (32%)
12:27:07:WU00:FS00:0x17:Completed 825000 out of 2500000 steps (33%)
12:32:46:WU00:FS00:0x17:Completed 850000 out of 2500000 steps (34%)
12:38:49:WU00:FS00:0x17:Completed 875000 out of 2500000 steps (35%)
12:44:28:WU00:FS00:0x17:Completed 900000 out of 2500000 steps (36%)
12:47:15:WU01:FS01:0xa4:Completed 1625000 out of 2500000 steps  (65%)
12:50:30:WU00:FS00:0x17:Completed 925000 out of 2500000 steps (37%)
12:56:11:WU00:FS00:0x17:Completed 950000 out of 2500000 steps (38%)
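
For anyone comparing logs like the one above, the time each slot needs per 1% can be pulled out of the "Completed ... steps" lines with a short script. This is only a sketch under stated assumptions: `time_per_percent` is a hypothetical helper (not part of FAHClient), and the two regular expressions are based solely on the line shapes visible in this log. In this excerpt the GPU slot (FS00) advances 1% about every six minutes, and the excerpt ends at 38%, so it does not itself capture the hang.

```python
import re
from datetime import datetime, timedelta

# Line shapes are taken from the log posted above; anything else is ignored.
START = re.compile(r"^\d{2}:\d{2}:\d{2}:WU\d+:(FS\d+):Starting$")
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:(FS\d+):0x[0-9a-fA-F]+:"
    r"Completed (\d+) out of (\d+) steps\s+\(\d+%\)"
)


def time_per_percent(log_lines):
    """Return {slot: [timedelta, ...]}: estimated time per 1% between progress lines."""
    previous = {}  # slot -> (timestamp, steps) of the last progress line seen
    result = {}
    for line in log_lines:
        line = line.rstrip()
        started = START.match(line)
        if started:
            # The core was (re)started: downtime must not count as folding time.
            previous.pop(started.group(1), None)
            continue
        m = PROGRESS.match(line)
        if not m:
            continue
        t = datetime.strptime(m.group(1), "%H:%M:%S")
        slot, steps, total = m.group(2), int(m.group(3)), int(m.group(4))
        if slot in previous:
            prev_t, prev_steps = previous[slot]
            if steps > prev_steps:
                # The log has time of day only, so wrap around midnight.
                elapsed = (t - prev_t) % timedelta(days=1)
                # Scale the elapsed time to one percent of the work unit.
                result.setdefault(slot, []).append(
                    elapsed * (total // 100) / (steps - prev_steps)
                )
        previous[slot] = (t, steps)
    return result


if __name__ == "__main__":
    sample = [
        "08:35:23:WU00:FS00:0x17:Completed 0 out of 2500000 steps (0%)",
        "08:41:13:WU00:FS00:0x17:Completed 25000 out of 2500000 steps (1%)",
        "08:47:03:WU00:FS00:0x17:Completed 50000 out of 2500000 steps (2%)",
    ]
    for slot, intervals in time_per_percent(sample).items():
        print(slot, [str(i) for i in intervals])
```

Scaling by step count instead of by percentage keeps the estimate reasonable when a slot resumes from a checkpoint partway through a percent, as the CPU slot does at the start of this log.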
Kent Irwin
Posts: 22
Joined: Sun Jan 26, 2014 1:44 pm

Re: GPU wu's hang at 99%

Post by Kent Irwin »

PantherX wrote:Welcome to the F@H Forum Kent Irwin,

Could you please post the log files from each of your systems which include the initial section of the log file as it contains the system configuration and F@H Settings (viewtopic.php?p=225958#p225958). Are your GPUs operating within normal temperature range or not? You could use GPU-Z (http://www.techpowerup.com/downloads/SysInfo/GPU-Z/) to help you find additional information about your GPUs.
I'm at work at the moment, so I can't post the log info. All GPU fans are controlled manually, with temps no higher than ~51 °C.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: GPU wu's hang at 99%

Post by bruce »

Both the Web Control program and the Advanced Control program hang at 99%, but it's terribly unlikely that your title is correct. When a GPU error causes the GPU to be reset, the WU hangs at whatever percentage it has reached at that time. Due to a limitation in how they're written (aka a bug), both Control programs continue to show progress until they reach 99%+.

The primary issue is determining what kind of error the GPU had so it can be prevented in the future. By default, the current beta does not detect GPU stalls but it's likely that another beta will soon detect stalls and dump the WU. GPU resets may happen because of overclocking or because you connect to your screen remotely with a utility that doesn't support using the GPU. Only by avoiding those GPU resets will you truly solve the problem.
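
Since the Control programs keep counting up after the core has stopped, the log is the more reliable place to see where a WU really stalled. A minimal sketch of that check, assuming progress lines shaped like the ones in the log posted earlier in this thread; `last_progress`, `stalled_slots`, and the 30-minute default are hypothetical choices for illustration, not anything FAHClient provides:

```python
import re
from datetime import datetime, timedelta

# Matches FahCore progress lines such as:
# 10:48:04:WU00:FS00:0x17:Completed 400000 out of 2500000 steps (16%)
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:(FS\d+):0x[0-9a-fA-F]+:"
    r"Completed \d+ out of \d+ steps\s+\((\d+)%\)"
)


def last_progress(log_lines):
    """Return {slot: (time_of_day, percent)} for each slot's latest progress line."""
    latest = {}
    for line in log_lines:
        m = PROGRESS.match(line)
        if m:
            t = datetime.strptime(m.group(1), "%H:%M:%S")
            latest[m.group(2)] = (t, int(m.group(3)))
    return latest


def stalled_slots(log_lines, now, max_gap=timedelta(minutes=30)):
    """List (slot, percent, gap) for slots with no logged progress within max_gap.

    `now` must be built the same way as the log timestamps, for example
    datetime.strptime("13:30:00", "%H:%M:%S"). The threshold is an assumption.
    """
    result = []
    for slot, (t, pct) in sorted(last_progress(log_lines).items()):
        # The log carries time of day only, so wrap around midnight.
        gap = (now - t) % timedelta(days=1)
        if gap > max_gap:
            result.append((slot, pct, gap))
    return result
```

The threshold has to be longer than the slot's normal time per percent, otherwise a slow but healthy slot gets flagged. The percentage this reports is the last one the core actually logged, which is the real stall point rather than the 99% shown in the Control programs.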
cw6591
Posts: 5
Joined: Mon Jan 27, 2014 7:06 am

Re: GPU wu's hang at 99%

Post by cw6591 »

Interesting points there, but I think none apply to my situation. All clock speeds are stock, I'm not using any method of remote access for this machine, and I'm not running the beta version of the software.
codysluder
Posts: 1024
Joined: Sun Dec 02, 2007 12:43 pm

Re: GPU wu's hang at 99%

Post by codysluder »

Did Windows report an "event" around the time it stopped working?
Kent Irwin
Posts: 22
Joined: Sun Jan 26, 2014 1:44 pm

Re: GPU wu's hang at 99%

Post by Kent Irwin »

Again, I have 4 boxes folding with AMD GPUs that all started experiencing the hang about a week ago. None reported any coincidental Windows events. All are adequately cooled. They simply stop processing work according to the logs. I just now had to restart a client because it had stalled according to the log. Once restarted, it resumed processing the GPU WU (8900). Four different machines: two i7s and two C2Ds. All running Win 8.1. All using AMD GPUs and CCC drivers. A/V is not scanning the FAH folders. All started to experience the problem at about the same point in time. The logs report no errors other than forced shutdowns when I have to stop/restart the client to get past the hang.

If it were one machine that would be one thing. Four machines doing the same thing after weeks of normal performance makes me wonder.
PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am
Location: Land Of The Long White Cloud

Re: GPU wu's hang at 99%

Post by PantherX »

It seems that the common factor between all of these issues is Windows 8.1, right? I assume that everyone facing this issue is using the latest WHQL Drivers from AMD's site.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: GPU wu's hang at 99%

Post by bruce »

I have little doubt that Win8/8.1 has made it worse, but I've had it happen twice on Win7 in the last several years.