Cpus running, 2 GPUs failing

gdeckn7
Posts: 4
Joined: Sun Jul 27, 2014 3:06 pm

Post by gdeckn7 »

Hi everybody,
I downloaded the most recent client and didn't make any changes to the config (-1 on all the defaults). I then tried following some of the dual-GPU threads and came up short. If I can get it running I'll try to put together a more powerful box, maybe with Titans or something, but at the moment I'm just trying to learn the basics of multi-GPU folding and its settings.

The machine is a stock Dell T7400 4-CPU machine with a GTX 570 and a Quadro FX 3700, running Windows 8.1 with all the latest updates, mostly standard. The GPUs have the latest drivers (yes, I know an earlier driver would run faster; I'm just trying to get it running for now).

Can anyone help?

thank you!!!

Some notes:
It looks like the FAH client and cores are running as 32-bit.
I was getting a lot of "bad work unit" errors, and now I'm getting "Option 'opencl-index' has no default and is not set" almost continuously, even though I have everything set to -1.
Here's my log (verbosity set to 5) for the latest try:

18:07:06:42:127.0.0.1 GET /api/updates?_=1406477445376&sid=b8926607fd8a8dcaf77d3169a13df903
18:07:07:41:127.0.0.1 GET /?nocache=0.9745721153578131
18:07:07:43:127.0.0.1 GET /css/normalize.css
18:07:07:44:127.0.0.1 GET /css/main.css
18:07:07:45:127.0.0.1 GET /images/facebook.png
18:07:07:46:127.0.0.1 GET /images/mail.png
18:07:07:47:127.0.0.1 GET /images/twitter.png
18:07:08:48:127.0.0.1 GET /images/report-bug.png
18:07:08:49:127.0.0.1 GET /js/jquery-1.10.2.min.js
18:07:08:50:127.0.0.1 GET /js/libs/jquery.selectbox-0.2.js
18:07:08:51:127.0.0.1 GET /js/intercom.min.js
18:07:08:52:127.0.0.1 GET /js/main.js
18:07:08:52:127.0.0.1:New Web connection
18:07:08:53:127.0.0.1 GET /images/template/bg.jpg
18:07:08:54:127.0.0.1 GET /images/template/button_bg.png
18:07:08:55:127.0.0.1 GET /images/template/input_radio.png
18:07:08:56:127.0.0.1 GET /images/template/logo_folding_home.png
18:07:09:57:127.0.0.1 GET /api/updates/set?_=1406484428350&sid=0395da356be6d7531ef7911b136afa46&update_id=0&update_path=/api/basic&update_rate=1
18:07:09:58:127.0.0.1 GET /api/updates/set?_=1406484428351&sid=0395da356be6d7531ef7911b136afa46&update_id=1&update_path=/api/slots&update_rate=1
18:07:09:59:127.0.0.1 GET /api/configured?_=1406484428352&sid=0395da356be6d7531ef7911b136afa46
18:07:09:60:127.0.0.1 GET /images/template/ui-bg-slider.png
18:07:09:61:127.0.0.1 GET /images/template/ui-icon-slider.png
18:07:09:62:127.0.0.1 GET /images/template/ui-progress-bg.png
18:07:09:63:127.0.0.1 GET /images/template/select-icons.png
18:07:10:64:127.0.0.1 GET /api/updates?_=1406484428353&sid=0395da356be6d7531ef7911b136afa46
18:07:10:64:127.0.0.1 GET /api/updates?_=1406484428357&sid=0395da356be6d7531ef7911b136afa46
18:11:02:Saving configuration to config.xml
18:11:02:<config>
18:11:02:  <!-- Folding Core -->
18:11:02:  <core-priority v='low'/>
18:11:02:
18:11:02:  <!-- Logging -->
18:11:02:  <verbosity v='5'/>
18:11:02:
18:11:02:  <!-- Network -->
18:11:02:  <proxy v=':8080'/>
18:11:02:
18:11:02:  <!-- Slot Control -->
18:11:02:  <power v='full'/>
18:11:02:
18:11:02:  <!-- User Information -->
18:11:02:  <user v='BR451'/>
18:11:02:
18:11:02:  <!-- Folding Slots -->
18:11:02:  <slot id='0' type='CPU'>
18:11:02:    <cpus v='2'/>
18:11:02:    <paused v='true'/>
18:11:02:  </slot>
18:11:02:  <slot id='1' type='GPU'>
18:11:02:    <paused v='true'/>
18:11:02:  </slot>
18:11:02:  <slot id='2' type='GPU'>
18:11:02:    <paused v='true'/>
18:11:02:  </slot>
18:11:02:</config>
18:11:45:Saving configuration to config.xml
18:11:45:<config>
18:11:45:  <!-- Folding Core -->
18:11:45:  <core-priority v='low'/>
18:11:45:
18:11:45:  <!-- Logging -->
18:11:45:  <verbosity v='5'/>
18:11:45:
18:11:45:  <!-- Network -->
18:11:45:  <proxy v=':8080'/>
18:11:45:
18:11:45:  <!-- Slot Control -->
18:11:45:  <power v='full'/>
18:11:45:
18:11:45:  <!-- User Information -->
18:11:45:  <user v='BR451'/>
18:11:45:
18:11:45:  <!-- Folding Slots -->
18:11:45:  <slot id='0' type='CPU'>
18:11:45:    <cpus v='2'/>
18:11:45:    <paused v='true'/>
18:11:45:  </slot>
18:11:45:  <slot id='1' type='GPU'>
18:11:45:    <paused v='true'/>
18:11:45:  </slot>
18:11:45:  <slot id='2' type='GPU'>
18:11:45:    <paused v='true'/>
18:11:45:  </slot>
18:11:45:</config>
18:12:07:Saving configuration to config.xml
18:12:07:<config>
18:12:07:  <!-- Folding Core -->
18:12:07:  <core-priority v='low'/>
18:12:07:
18:12:07:  <!-- Logging -->
18:12:07:  <verbosity v='5'/>
18:12:07:
18:12:07:  <!-- Network -->
18:12:07:  <proxy v=':8080'/>
18:12:07:
18:12:07:  <!-- Slot Control -->
18:12:07:  <pause-on-battery v='false'/>
18:12:07:  <power v='full'/>
18:12:07:
18:12:07:  <!-- User Information -->
18:12:07:  <user v='BR451'/>
18:12:07:
18:12:07:  <!-- Folding Slots -->
18:12:07:  <slot id='0' type='CPU'>
18:12:07:    <cpus v='2'/>
18:12:07:    <paused v='true'/>
18:12:07:  </slot>
18:12:07:  <slot id='1' type='GPU'>
18:12:07:    <paused v='true'/>
18:12:07:  </slot>
18:12:07:  <slot id='2' type='GPU'>
18:12:07:    <paused v='true'/>
18:12:07:  </slot>
18:12:07:</config>
18:12:09:FS00:Unpaused
18:12:09:FS01:Unpaused
18:12:09:FS02:Unpaused
18:12:10:WU00:FS00:Starting
18:12:10:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -version 704 -lifeline 2196 -checkpoint 15
18:12:10:WU00:FS00:Started FahCore on PID 4932
18:12:10:Started thread 25 on PID 2196
18:12:10:WU00:FS00:Core PID:4048
18:12:10:WU00:FS00:FahCore 0xa4 started
18:12:10:WU01:FS01:Connecting to 171.67.108.201:80
18:12:10:WU02:FS02:Connecting to 171.67.108.201:80
18:12:10:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.201:80': Empty work server assignment
18:12:10:WU01:FS01:Connecting to 171.64.65.160:80
18:12:10:WU00:FS00:0xa4:
18:12:10:WU00:FS00:0xa4:*------------------------------*
18:12:10:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
18:12:10:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
18:12:10:WU00:FS00:0xa4:
18:12:10:WU00:FS00:0xa4:Preparing to commence simulation
18:12:10:WU00:FS00:0xa4:- Looking at optimizations...
18:12:10:WU00:FS00:0xa4:- Files status OK
18:12:10:WU00:FS00:0xa4:- Expanded 202287 -> 425556 (decompressed 210.3 percent)
18:12:10:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=202287 data_size=425556, decompressed_data_size=425556 diff=0
18:12:10:WU00:FS00:0xa4:- Digital signature verified
18:12:10:WU00:FS00:0xa4:
18:12:10:WU00:FS00:0xa4:Project: 6376 (Run 37, Clone 3, Gen 11)
18:12:10:WU00:FS00:0xa4:
18:12:10:WU00:FS00:0xa4:Assembly optimizations on if available.
18:12:10:WU00:FS00:0xa4:Entering M.D.
18:12:10:WU02:FS02:Assigned to work server 171.67.108.52
18:12:10:WU02:FS02:Requesting new work unit for slot 02: READY gpu:1:GF110 [GeForce GTX 570 HD] from 171.67.108.52
18:12:10:WU02:FS02:Connecting to 171.67.108.52:8080
18:12:11:WU02:FS02:Downloading 1.52MiB
18:12:12:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.160:80': Failed to connect to 171.64.65.160:80: No connection could be made because the target machine actively refused it.
18:12:12:ERROR:WU01:FS01:Exception: Could not get an assignment
18:12:12:WU02:FS02:Download complete
18:12:12:WU02:FS02:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:9201 run:610 clone:1 gen:28 core:0x17 unit:0x000000226652edc45399ee021eea3f92
18:12:12:WU02:FS02:Starting
18:12:12:ERROR:WU02:FS02:Failed to start core: Option 'opencl-index' has no default and is not set.
18:12:13:WU02:FS02:Starting
18:12:13:ERROR:WU02:FS02:Failed to start core: Option 'opencl-index' has no default and is not set.
18:12:16:WU00:FS00:0xa4:Using Gromacs checkpoints
18:12:16:WU00:FS00:0xa4:Mapping NT from 1 to 1 
18:12:16:WU00:FS00:0xa4:Resuming from checkpoint
18:12:16:WU00:FS00:0xa4:Verified 00/wudata_01.log
18:12:16:WU00:FS00:0xa4:Verified 00/wudata_01.trr
18:12:16:WU00:FS00:0xa4:Verified 00/wudata_01.xtc
18:12:16:WU00:FS00:0xa4:Verified 00/wudata_01.edr
18:12:16:WU00:FS00:0xa4:Completed 1325580 out of 2500000 steps  (53%)
18:12:46:Saving configuration to config.xml
18:12:46:<config>
18:12:46:  <!-- Folding Core -->
18:12:46:  <core-priority v='low'/>
18:12:46:
18:12:46:  <!-- Logging -->
18:12:46:  <verbosity v='5'/>
18:12:46:
18:12:46:  <!-- Network -->
18:12:46:  <proxy v=':8080'/>
18:12:46:
18:12:46:  <!-- Slot Control -->
18:12:46:  <pause-on-battery v='false'/>
18:12:46:  <power v='full'/>
18:12:46:
18:12:46:  <!-- User Information -->
18:12:46:  <user v='BR451'/>
18:12:46:
18:12:46:  <!-- Folding Slots -->
18:12:46:  <slot id='0' type='CPU'>
18:12:46:    <cpus v='2'/>
18:12:46:  </slot>
18:12:46:  <slot id='1' type='GPU'/>
18:12:46:  <slot id='2' type='GPU'/>
18:12:46:</config>
18:13:13:WU02:FS02:Starting
18:13:13:ERROR:WU02:FS02:Failed to start core: Option 'opencl-index' has no default and is not set.
18:13:47:WU01:FS01:Connecting to 171.67.108.201:80
18:13:47:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.201:80': Empty work server assignment
18:13:47:WU01:FS01:Connecting to 171.64.65.160:80
18:13:49:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.160:80': Failed to connect to 171.64.65.160:80: No connection could be made because the target machine actively refused it.
18:13:49:ERROR:WU01:FS01:Exception: Could not get an assignment
18:14:50:WU02:FS02:Starting
18:14:50:ERROR:WU02:FS02:Failed to start core: Option 'opencl-index' has no default and is not set.
18:16:24:WU01:FS01:Connecting to 171.67.108.201:80
18:16:24:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.201:80': Empty work server assignment
18:16:24:WU01:FS01:Connecting to 171.64.65.160:80
18:16:26:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.160:80': Failed to connect to 171.64.65.160:80: No connection could be made because the target machine actively refused it.
18:16:26:ERROR:WU01:FS01:Exception: Could not get an assignment
18:17:27:WU02:FS02:Starting
18:17:27:ERROR:WU02:FS02:Failed to start core: Option 'opencl-index' has no default and is not set.
18:20:38:WU01:FS01:Connecting to 171.67.108.201:80
18:20:39:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.201:80': Empty work server assignment
18:20:39:WU01:FS01:Connecting to 171.64.65.160:80
18:20:40:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.160:80': Failed to connect to 171.64.65.160:80: No connection could be made because the target machine actively refused it.
18:20:40:ERROR:WU01:FS01:Exception: Could not get an assignment
18:21:41:WU02:FS02:Starting
18:21:41:ERROR:WU02:FS02:Failed to start core: Option 'opencl-index' has no default and is not set.
18:21:42:WU02:FS02:Sending unit results: id:02 state:SEND error:FAILED project:9201 run:610 clone:1 gen:28 core:0x17 unit:0x000000226652edc45399ee021eea3f92
18:21:42:WU02:FS02:Connecting to 171.67.108.52:8080
18:21:42:WU02:FS02:Server responded WORK_ACK (400)
18:21:42:WU02:FS02:Cleaning up
18:21:42:WU03:FS02:Connecting to 171.67.108.201:80
18:21:43:WU03:FS02:Assigned to work server 171.67.108.52
18:21:43:WU03:FS02:Requesting new work unit for slot 02: READY gpu:1:GF110 [GeForce GTX 570 HD] from 171.67.108.52
18:21:43:WU03:FS02:Connecting to 171.67.108.52:8080
18:21:43:WU03:FS02:Downloading 1.52MiB
18:21:44:WU03:FS02:Download complete
18:21:44:WU03:FS02:Received Unit: id:03 state:DOWNLOAD error:NO_ERROR project:9201 run:529 clone:0 gen:117 core:0x17 unit:0x000000856652edc45399ead198261b6e
18:21:44:WU03:FS02:Starting
18:21:44:ERROR:WU03:FS02:Failed to start core: Option 'opencl-index' has no default and is not set.
18:21:45:WU03:FS02:Starting
18:21:45:ERROR:WU03:FS02:Failed to start core: Option 'opencl-index' has no default and is not set.
18:22:45:WU03:FS02:Starting
18:22:45:ERROR:WU03:FS02:Failed to start core: Option 'opencl-index' has no default and is not set.
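
The log above repeats the same few failures many times, which makes it hard to see at a glance which slot is failing and why. A minimal sketch, not part of the FAH client, that tallies ERROR and WARNING lines per folding slot. The function name `summarize_problems` and the regular expression are assumptions based only on the line layout visible in this log (`HH:MM:SS:LEVEL:WUnn:FSnn:message`); other client versions may format lines differently.

```python
import re
from collections import Counter

# Matches client lines such as
# "18:12:13:ERROR:WU02:FS02:Failed to start core: ..."
# Lines without an ERROR/WARNING level (e.g. "18:12:13:WU02:FS02:Starting") are skipped.
LINE_RE = re.compile(r"^\d\d:\d\d:\d\d:(ERROR|WARNING):WU\d+:FS(\d+):(.*)$")


def summarize_problems(log_text):
    """Count identical (slot, level, message) triples among ERROR/WARNING lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            level, slot, message = match.groups()
            counts[(f"FS{slot}", level, message)] += 1
    return counts


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
18:12:12:ERROR:WU01:FS01:Exception: Could not get an assignment
18:12:12:ERROR:WU02:FS02:Failed to start core: Option 'opencl-index' has no default and is not set.
18:12:13:WU02:FS02:Starting
18:12:13:ERROR:WU02:FS02:Failed to start core: Option 'opencl-index' has no default and is not set.
"""

if __name__ == "__main__":
    for (slot, level, message), n in summarize_problems(SAMPLE).most_common():
        print(f"{slot} {level} x{n}: {message}")
```

Run against the full log, this kind of tally separates the two distinct problems: FS01 never gets an assignment, while FS02 downloads work but cannot start the core.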
gdeckn7
Posts: 4
Joined: Sun Jul 27, 2014 3:06 pm

Post by gdeckn7 »

I've now removed one GPU slot and assigned the remaining one to the GTX 570, and the client is looping on a GPU memtest error. A search of the boards shows this as a problem that was resolved in 2012, though?

Thanks again.

18:42:55:WU01:FS02:0x15:*------------------------------*
18:42:55:WU01:FS02:0x15:Folding@Home GPU Core
18:42:55:WU01:FS02:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
18:42:55:WU01:FS02:0x15:Build host             AmoebaRemote
18:42:55:WU01:FS02:0x15:Board Type             NVIDIA/CUDA
18:42:55:WU01:FS02:0x15:Core                   15
18:42:55:WU01:FS02:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
18:42:55:WU01:FS02:0x15:
18:42:55:WU01:FS02:0x15:Window's signal control handler registered.
18:42:55:WU01:FS02:0x15:Preparing to commence simulation
18:42:55:WU01:FS02:0x15:- Looking at optimizations...
18:42:55:WU01:FS02:0x15:DeleteFrameFiles: successfully deleted file=01/wudata_01.ckp
18:42:55:WU01:FS02:0x15:- Created dyn
18:42:55:WU01:FS02:0x15:- Files status OK
18:42:55:WU01:FS02:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
18:42:55:WU01:FS02:0x15:- Expanded 145507 -> 660986 (decompressed 454.2 percent)
18:42:55:WU01:FS02:0x15:Called DecompressByteArray: compressed_data_size=145507 data_size=660986, decompressed_data_size=660986 diff=0
18:42:55:WU01:FS02:0x15:- Digital signature verified
18:42:55:WU01:FS02:0x15:
18:42:55:WU01:FS02:0x15:Project: 8018 (Run 102, Clone 1, Gen 718)
18:42:55:WU01:FS02:0x15:
18:42:55:WU01:FS02:0x15:Assembly optimizations on if available.
18:42:55:WU01:FS02:0x15:Entering M.D.
18:42:57:WU01:FS02:0x15:Tpr hash 01/wudata_01.tpr:  1548425380 1752575979 1237549236 2769554563 2458284104
18:42:57:WU01:FS02:0x15:GPU device id=1
18:42:57:WU01:FS02:0x15:Working on GRowing Old MAkes el Chrono Sweat
18:42:57:WU01:FS02:0x15:Client config unavailable.
18:42:57:WU01:FS02:0x15:Starting GUI Server
18:44:10:WU01:FS02:0x15:Finished fah_main status=59
18:44:10:WU01:FS02:0x15:mdrun_gpu returned 59
18:44:10:WU01:FS02:0x15:GPU memtest failure
18:44:10:WU01:FS02:0x15:
18:44:10:WU01:FS02:0x15:Folding@home Core Shutdown: GPU_MEMTEST_ERROR
18:44:11:WARNING:WU01:FS02:FahCore returned: GPU_MEMTEST_ERROR (124 = 0x7c)
18:44:11:WU01:FS02:Starting
18:44:11:WU01:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 01 -suffix 01 -version 704 -lifeline 3252 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
18:44:11:WU01:FS02:Started FahCore on PID 2908
18:44:11:Started thread 14 on PID 3252
18:44:11:WU01:FS02:Core PID:892
18:44:11:WU01:FS02:FahCore 0x15 started
18:44:11:WU01:FS02:0x15:
18:44:11:WU01:FS02:0x15:*------------------------------*
18:44:11:WU01:FS02:0x15:Folding@Home GPU Core
18:44:11:WU01:FS02:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
18:44:11:WU01:FS02:0x15:Build host             AmoebaRemote
18:44:11:WU01:FS02:0x15:Board Type             NVIDIA/CUDA
18:44:11:WU01:FS02:0x15:Core                   15
18:44:11:WU01:FS02:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
18:44:11:WU01:FS02:0x15:
18:44:11:WU01:FS02:0x15:Window's signal control handler registered.
18:44:11:WU01:FS02:0x15:Preparing to commence simulation
18:44:11:WU01:FS02:0x15:- Looking at optimizations...
18:44:11:WU01:FS02:0x15:DeleteFrameFiles: successfully deleted file=01/wudata_01.ckp
18:44:13:WU01:FS02:0x15:- Created dyn
18:44:13:WU01:FS02:0x15:- Files status OK
18:44:13:WU01:FS02:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
18:44:13:WU01:FS02:0x15:- Expanded 145507 -> 660986 (decompressed 454.2 percent)
18:44:13:WU01:FS02:0x15:Called DecompressByteArray: compressed_data_size=145507 data_size=660986, decompressed_data_size=660986 diff=0
18:44:13:WU01:FS02:0x15:- Digital signature verified
18:44:13:WU01:FS02:0x15:
18:44:13:WU01:FS02:0x15:Project: 8018 (Run 102, Clone 1, Gen 718)
18:44:13:WU01:FS02:0x15:
18:44:13:WU01:FS02:0x15:Assembly optimizations on if available.
18:44:13:WU01:FS02:0x15:Entering M.D.
18:44:14:WU01:FS02:0x15:Tpr hash 01/wudata_01.tpr:  1548425380 1752575979 1237549236 2769554563 2458284104
18:44:14:WU01:FS02:0x15:GPU device id=1
18:44:15:WU01:FS02:0x15:Working on GRowing Old MAkes el Chrono Sweat
18:44:15:WU01:FS02:0x15:Client config unavailable.
18:44:15:WU01:FS02:0x15:Starting GUI Server
18:45:27:WU01:FS02:0x15:Finished fah_main status=59
18:45:27:WU01:FS02:0x15:mdrun_gpu returned 59
18:45:27:WU01:FS02:0x15:GPU memtest failure
18:45:27:WU01:FS02:0x15:
18:45:27:WU01:FS02:0x15:Folding@home Core Shutdown: GPU_MEMTEST_ERROR
18:45:28:WARNING:WU01:FS02:FahCore returned: GPU_MEMTEST_ERROR (124 = 0x7c)
18:45:28:WU01:FS02:Starting
18:45:28:WU01:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 01 -suffix 01 -version 704 -lifeline 3252 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
18:45:28:WU01:FS02:Started FahCore on PID 3896
18:45:28:Started thread 15 on PID 3252
18:45:28:WU01:FS02:Core PID:3448
18:45:28:WU01:FS02:FahCore 0x15 started
18:45:29:WU01:FS02:0x15:
18:45:29:WU01:FS02:0x15:*------------------------------*
18:45:29:WU01:FS02:0x15:Folding@Home GPU Core
18:45:29:WU01:FS02:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
18:45:29:WU01:FS02:0x15:Build host             AmoebaRemote
18:45:29:WU01:FS02:0x15:Board Type             NVIDIA/CUDA
18:45:29:WU01:FS02:0x15:Core                   15
18:45:29:WU01:FS02:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
18:45:29:WU01:FS02:0x15:
18:45:29:WU01:FS02:0x15:Window's signal control handler registered.
18:45:29:WU01:FS02:0x15:Preparing to commence simulation
18:45:29:WU01:FS02:0x15:- Looking at optimizations...
18:45:29:WU01:FS02:0x15:DeleteFrameFiles: successfully deleted file=01/wudata_01.ckp
18:45:30:WU01:FS02:0x15:- Created dyn
18:45:30:WU01:FS02:0x15:- Files status OK
18:45:30:WU01:FS02:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
18:45:30:WU01:FS02:0x15:- Expanded 145507 -> 660986 (decompressed 454.2 percent)
18:45:30:WU01:FS02:0x15:Called DecompressByteArray: compressed_data_size=145507 data_size=660986, decompressed_data_size=660986 diff=0
18:45:30:WU01:FS02:0x15:- Digital signature verified
18:45:30:WU01:FS02:0x15:
18:45:30:WU01:FS02:0x15:Project: 8018 (Run 102, Clone 1, Gen 718)
18:45:30:WU01:FS02:0x15:
18:45:30:WU01:FS02:0x15:Assembly optimizations on if available.
18:45:30:WU01:FS02:0x15:Entering M.D.
18:45:32:WU01:FS02:0x15:Tpr hash 01/wudata_01.tpr:  1548425380 1752575979 1237549236 2769554563 2458284104
18:45:32:WU01:FS02:0x15:GPU device id=1
18:45:32:WU01:FS02:0x15:Working on GRowing Old MAkes el Chrono Sweat
18:45:32:WU01:FS02:0x15:Client config unavailable.
18:45:32:WU01:FS02:0x15:Starting GUI Server
18:46:44:WU01:FS02:0x15:Finished fah_main status=59
18:46:44:WU01:FS02:0x15:mdrun_gpu returned 59
18:46:44:WU01:FS02:0x15:GPU memtest failure
18:46:44:WU01:FS02:0x15:
18:46:44:WU01:FS02:0x15:Folding@home Core Shutdown: GPU_MEMTEST_ERROR
18:46:45:WARNING:WU01:FS02:FahCore returned: GPU_MEMTEST_ERROR (124 = 0x7c)
18:46:45:WU01:FS02:Starting
18:46:45:WU01:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 01 -suffix 01 -version 704 -lifeline 3252 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
18:46:45:WU01:FS02:Started FahCore on PID 2544
18:46:45:Started thread 16 on PID 3252
18:46:45:WU01:FS02:Core PID:2452
18:46:45:WU01:FS02:FahCore 0x15 started
18:46:46:WU01:FS02:0x15:
18:46:46:WU01:FS02:0x15:*------------------------------*
18:46:46:WU01:FS02:0x15:Folding@Home GPU Core
18:46:46:WU01:FS02:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
18:46:46:WU01:FS02:0x15:Build host             AmoebaRemote
18:46:46:WU01:FS02:0x15:Board Type             NVIDIA/CUDA
18:46:46:WU01:FS02:0x15:Core                   15
18:46:46:WU01:FS02:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
18:46:46:WU01:FS02:0x15:
18:46:46:WU01:FS02:0x15:Window's signal control handler registered.
18:46:46:WU01:FS02:0x15:Preparing to commence simulation
18:46:46:WU01:FS02:0x15:- Looking at optimizations...
18:46:46:WU01:FS02:0x15:DeleteFrameFiles: successfully deleted file=01/wudata_01.ckp
18:46:47:WU01:FS02:0x15:- Created dyn
18:46:47:WU01:FS02:0x15:- Files status OK
18:46:47:WU01:FS02:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
18:46:47:WU01:FS02:0x15:- Expanded 145507 -> 660986 (decompressed 454.2 percent)
18:46:47:WU01:FS02:0x15:Called DecompressByteArray: compressed_data_size=145507 data_size=660986, decompressed_data_size=660986 diff=0
18:46:47:WU01:FS02:0x15:- Digital signature verified
18:46:47:WU01:FS02:0x15:
18:46:47:WU01:FS02:0x15:Project: 8018 (Run 102, Clone 1, Gen 718)
18:46:47:WU01:FS02:0x15:
18:46:47:WU01:FS02:0x15:Assembly optimizations on if available.
18:46:47:WU01:FS02:0x15:Entering M.D.
18:46:49:WU01:FS02:0x15:Tpr hash 01/wudata_01.tpr:  1548425380 1752575979 1237549236 2769554563 2458284104
18:46:49:WU01:FS02:0x15:GPU device id=1
18:46:49:WU01:FS02:0x15:Working on GRowing Old MAkes el Chrono Sweat
18:46:49:WU01:FS02:0x15:Client config unavailable.
18:46:49:WU01:FS02:0x15:Starting GUI Server
18:48:02:WU01:FS02:0x15:Finished fah_main status=59
18:48:02:WU01:FS02:0x15:mdrun_gpu returned 59
18:48:02:WU01:FS02:0x15:GPU memtest failure
18:48:02:WU01:FS02:0x15:
18:48:02:WU01:FS02:0x15:Folding@home Core Shutdown: GPU_MEMTEST_ERROR
18:48:03:WARNING:WU01:FS02:FahCore returned: GPU_MEMTEST_ERROR (124 = 0x7c)
18:48:03:WU01:FS02:Starting
18:48:03:WU01:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 01 -suffix 01 -version 704 -lifeline 3252 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
18:48:03:WU01:FS02:Started FahCore on PID 1208
18:48:03:Started thread 17 on PID 3252
18:48:03:WU01:FS02:Core PID:2212
18:48:03:WU01:FS02:FahCore 0x15 started
18:48:03:WU01:FS02:0x15:
18:48:03:WU01:FS02:0x15:*------------------------------*
18:48:03:WU01:FS02:0x15:Folding@Home GPU Core
18:48:03:WU01:FS02:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
18:48:03:WU01:FS02:0x15:Build host             AmoebaRemote
18:48:03:WU01:FS02:0x15:Board Type             NVIDIA/CUDA
18:48:03:WU01:FS02:0x15:Core                   15
18:48:03:WU01:FS02:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
18:48:03:WU01:FS02:0x15:
18:48:03:WU01:FS02:0x15:Window's signal control handler registered.
18:48:03:WU01:FS02:0x15:Preparing to commence simulation
18:48:03:WU01:FS02:0x15:- Looking at optimizations...
18:48:03:WU01:FS02:0x15:DeleteFrameFiles: successfully deleted file=01/wudata_01.ckp
18:48:05:WU01:FS02:0x15:- Created dyn
18:48:05:WU01:FS02:0x15:- Files status OK
18:48:05:WU01:FS02:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
18:48:05:WU01:FS02:0x15:- Expanded 145507 -> 660986 (decompressed 454.2 percent)
18:48:05:WU01:FS02:0x15:Called DecompressByteArray: compressed_data_size=145507 data_size=660986, decompressed_data_size=660986 diff=0
18:48:05:WU01:FS02:0x15:- Digital signature verified
18:48:05:WU01:FS02:0x15:
18:48:05:WU01:FS02:0x15:Project: 8018 (Run 102, Clone 1, Gen 718)
18:48:05:WU01:FS02:0x15:
18:48:05:WU01:FS02:0x15:Assembly optimizations on if available.
18:48:05:WU01:FS02:0x15:Entering M.D.
18:48:07:WU01:FS02:0x15:Tpr hash 01/wudata_01.tpr:  1548425380 1752575979 1237549236 2769554563 2458284104
18:48:07:WU01:FS02:0x15:GPU device id=1
18:48:07:WU01:FS02:0x15:Working on GRowing Old MAkes el Chrono Sweat
18:48:07:WU01:FS02:0x15:Client config unavailable.
18:48:07:WU01:FS02:0x15:Starting GUI Server
18:48:47:WU00:FS00:0xa4:Completed 1375000 out of 2500000 steps  (55%)
18:49:19:WU01:FS02:0x15:Finished fah_main status=59
18:49:19:WU01:FS02:0x15:mdrun_gpu returned 59
18:49:19:WU01:FS02:0x15:GPU memtest failure
18:49:19:WU01:FS02:0x15:
18:49:19:WU01:FS02:0x15:Folding@home Core Shutdown: GPU_MEMTEST_ERROR
18:49:20:WARNING:WU01:FS02:FahCore returned: GPU_MEMTEST_ERROR (124 = 0x7c)
18:49:20:WARNING:WU01:FS02:Too many errors, failing
18:49:20:WU01:FS02:Sending unit results: id:01 state:SEND error:FAILED project:8018 run:102 clone:1 gen:718 core:0x15 unit:0x000003276953ee2e500f1ba23dd6a11c
18:49:20:WU01:FS02:Connecting to 171.67.108.142:8080
18:49:20:WU02:FS02:Connecting to 171.67.108.201:80
18:49:20:WU01:FS02:Server responded WORK_QUIT (404)
18:49:20:WARNING:WU01:FS02:Server did not like results, dumping
18:49:20:WU01:FS02:Cleaning up
18:49:21:WU02:FS02:Assigned to work server 171.67.108.52
18:49:21:WU02:FS02:Requesting new work unit for slot 02: READY gpu:1:GF110 [GeForce GTX 570 HD] from 171.67.108.52
18:49:21:WU02:FS02:Connecting to 171.67.108.52:8080
18:49:21:WU02:FS02:Downloading 1.52MiB
18:49:22:WU02:FS02:Download complete
18:49:23:WU02:FS02:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:9201 run:89 clone:0 gen:92 core:0x17 unit:0x0000006b6652edc45399d984ebd450f9
18:49:23:WU02:FS02:Starting
18:49:23:WU02:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 02 -suffix 01 -version 704 -lifeline 3252 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
18:49:23:WU02:FS02:Started FahCore on PID 2752
18:49:23:Started thread 18 on PID 3252
18:49:23:WU02:FS02:Core PID:1000
18:49:23:WU02:FS02:FahCore 0x17 started
18:49:23:WU02:FS02:0x17:*********************** Log Started 2014-07-27T18:49:23Z ***********************
18:49:23:WU02:FS02:0x17:Project: 9201 (Run 89, Clone 0, Gen 92)
18:49:23:WU02:FS02:0x17:Unit: 0x0000006b6652edc45399d984ebd450f9
18:49:23:WU02:FS02:0x17:CPU: 0x00000000000000000000000000000000
18:49:23:WU02:FS02:0x17:Machine: 2
18:49:23:WU02:FS02:0x17:Reading tar file state.xml
18:49:23:WU02:FS02:0x17:Reading tar file system.xml
18:49:24:WU02:FS02:0x17:Reading tar file integrator.xml
18:49:24:WU02:FS02:0x17:Reading tar file core.xml
18:49:24:WU02:FS02:0x17:Digital signatures verified
18:49:24:WU02:FS02:0x17:Folding@home GPU core17
18:49:24:WU02:FS02:0x17:Version 0.0.52
18:49:57:WU02:FS02:0x17:ERROR:exception: Error invoking kernel computeNonbonded: clEnqueueNDRangeKernel (-5)
18:49:57:WU02:FS02:0x17:Saving result file logfile_01.txt
18:49:57:WU02:FS02:0x17:Saving result file log.txt
18:49:57:WU02:FS02:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
18:49:57:WARNING:WU02:FS02:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
18:49:57:WU02:FS02:Sending unit results: id:02 state:SEND error:FAULTY project:9201 run:89 clone:0 gen:92 core:0x17 unit:0x0000006b6652edc45399d984ebd450f9
18:49:58:WU02:FS02:Uploading 2.37KiB to 171.67.108.52
18:49:58:WU02:FS02:Connecting to 171.67.108.52:8080
18:49:58:WU02:FS02:Upload complete
18:49:58:WU02:FS02:Server responded WORK_ACK (400)
18:49:58:WU02:FS02:Cleaning up
18:49:58:WU01:FS02:Connecting to 171.67.108.201:80
18:49:58:WU01:FS02:Assigned to work server 171.67.108.52
18:49:58:WU01:FS02:Requesting new work unit for slot 02: READY gpu:1:GF110 [GeForce GTX 570 HD] from 171.67.108.52
18:49:58:WU01:FS02:Connecting to 171.67.108.52:8080
18:49:59:WU01:FS02:Downloading 1.52MiB
18:50:00:WU01:FS02:Download complete
18:50:00:WU01:FS02:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9201 run:187 clone:0 gen:83 core:0x17 unit:0x000000646652edc45399dd5a037c751f
18:50:00:WU01:FS02:Starting
18:50:00:WU01:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 01 -suffix 01 -version 704 -lifeline 3252 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
18:50:00:WU01:FS02:Started FahCore on PID 3276
18:50:00:Started thread 19 on PID 3252
18:50:00:WU01:FS02:Core PID:2852
18:50:00:WU01:FS02:FahCore 0x17 started
18:50:01:WU01:FS02:0x17:*********************** Log Started 2014-07-27T18:50:00Z ***********************
18:50:01:WU01:FS02:0x17:Project: 9201 (Run 187, Clone 0, Gen 83)
18:50:01:WU01:FS02:0x17:Unit: 0x000000646652edc45399dd5a037c751f
18:50:01:WU01:FS02:0x17:CPU: 0x00000000000000000000000000000000
18:50:01:WU01:FS02:0x17:Machine: 2
18:50:01:WU01:FS02:0x17:Reading tar file state.xml
18:50:01:WU01:FS02:0x17:Reading tar file system.xml
18:50:01:WU01:FS02:0x17:Reading tar file integrator.xml
18:50:01:WU01:FS02:0x17:Reading tar file core.xml
18:50:01:WU01:FS02:0x17:Digital signatures verified
18:50:01:WU01:FS02:0x17:Folding@home GPU core17
18:50:01:WU01:FS02:0x17:Version 0.0.52
Mod edit: Added Code tags to log file
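
The log above shows two different cores failing in two different ways on the same slot. A small sketch, again not part of the FAH client, that groups "Core Shutdown" lines by slot and core so the pattern is easier to see. The name `shutdowns_by_core` and the regular expression are assumptions drawn from the line layout in this log (`HH:MM:SS:WUnn:FSnn:0xNN:Folding@home Core Shutdown: STATUS`).

```python
import re
from collections import Counter

# Matches lines such as
# "18:44:10:WU01:FS02:0x15:Folding@home Core Shutdown: GPU_MEMTEST_ERROR"
SHUTDOWN_RE = re.compile(
    r"^\d\d:\d\d:\d\d:WU\d+:FS(\d+):(0x[0-9a-fA-F]+):"
    r"Folding@home Core Shutdown: (\w+)$"
)


def shutdowns_by_core(log_text):
    """Count core shutdowns keyed by (slot, core, status)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = SHUTDOWN_RE.match(line.strip())
        if match:
            slot, core, status = match.groups()
            counts[(f"FS{slot}", core, status)] += 1
    return counts


# A few lines copied from the log above, used as sample input.
SAMPLE = """\
18:44:10:WU01:FS02:0x15:GPU memtest failure
18:44:10:WU01:FS02:0x15:Folding@home Core Shutdown: GPU_MEMTEST_ERROR
18:45:27:WU01:FS02:0x15:Folding@home Core Shutdown: GPU_MEMTEST_ERROR
18:49:57:WU02:FS02:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
"""

if __name__ == "__main__":
    for (slot, core, status), n in sorted(shutdowns_by_core(SAMPLE).items()):
        print(f"{slot} core {core}: {status} x{n}")
```

In this log the grouping shows core 0x15 ending with GPU_MEMTEST_ERROR on every attempt and core 0x17 ending with BAD_WORK_UNIT, both on FS02.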
7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengeance (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760W Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicone fan gaskets and silicone mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Post by 7im »

If you're using the Windows 8 drivers, you need to install the drivers directly from NVIDIA instead.

If you change hardware, reinstall the client. It will auto-detect one or more GPUs if you have the correct drivers.

We also need to see the System section of the log, but change the verbosity back to 3 first.
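
For anyone unsure which part of the log that is: the client prints a banner line containing the word System at startup, followed by `key: value` lines up to the next banner of asterisks. A rough sketch of pulling that section out of a saved log, assuming the banner and field layout of the v7 client log; the helper name `system_section` is hypothetical and not part of any FAH tool.

```python
import re

# Banner with a one-word section name, e.g. "01:13:45:****** System ******"
HEADER_RE = re.compile(r"^\d\d:\d\d:\d\d:\*+ (\w+) \*+$")
# Closing banner made only of asterisks
RULE_RE = re.compile(r"^\d\d:\d\d:\d\d:\*+$")
# Field line, e.g. "01:13:45:        GPU 0: NVIDIA:1 G92 [Quadro FX 3700]"
FIELD_RE = re.compile(r"^\d\d:\d\d:\d\d:\s*([^:]+): ?(.*)$")


def system_section(log_text):
    """Return the fields of the System section as a dict of strings."""
    fields = {}
    in_system = False
    for line in log_text.splitlines():
        line = line.rstrip()
        header = HEADER_RE.match(line)
        if header:
            # Entering a named section; only "System" is collected.
            in_system = header.group(1) == "System"
            continue
        if RULE_RE.match(line):
            # A bare row of asterisks closes the current section.
            in_system = False
            continue
        if in_system:
            field = FIELD_RE.match(line)
            if field:
                fields[field.group(1).strip()] = field.group(2).strip()
    return fields


# Sample input in the layout the v7 client writes at startup.
SAMPLE = """\
01:13:45:******************************** Build ********************************
01:13:45:      Version: 7.4.4
01:13:45:******************************* System ********************************
01:13:45:           OS: Windows 8.1
01:13:45:         GPUs: 2
01:13:45:        GPU 0: NVIDIA:1 G92 [Quadro FX 3700]
01:13:45:        GPU 1: NVIDIA:2 GF110 [GeForce GTX 570 HD]
01:13:45:***********************************************************************
01:13:45:<config>
"""

if __name__ == "__main__":
    for key, value in system_section(SAMPLE).items():
        print(f"{key}: {value}")
```

The GPU lines in that section are the useful part here, since they show which devices the client detected and in what order.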
How to provide enough information to get helpful support
gdeckn7
Posts: 4
Joined: Sun Jul 27, 2014 3:06 pm

Post by gdeckn7 »

Hi 7im, I reloaded the drivers manually and reinstalled the client. I'm still having similar issues.

*********************** Log Started 2014-07-28T01:13:45Z ***********************
01:13:45:************************* Folding@home Client *************************
01:13:45:      Website: http://folding.stanford.edu/
01:13:45:    Copyright: (c) 2009-2014 Stanford University
01:13:45:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
01:13:45:         Args: --open-web-control
01:13:45:       Config: C:/Users/Admin/AppData/Roaming/FAHClient/config.xml
01:13:45:******************************** Build ********************************
01:13:45:      Version: 7.4.4
01:13:45:         Date: Mar 4 2014
01:13:45:         Time: 20:26:54
01:13:45:      SVN Rev: 4130
01:13:45:       Branch: fah/trunk/client
01:13:45:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
01:13:45:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
01:13:45:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
01:13:45:     Platform: win32 XP
01:13:45:         Bits: 32
01:13:45:         Mode: Release
01:13:45:******************************* System ********************************
01:13:45:          CPU: Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
01:13:45:       CPU ID: GenuineIntel Family 6 Model 23 Stepping 6
01:13:45:         CPUs: 4
01:13:45:       Memory: 8.00GiB
01:13:45:  Free Memory: 6.77GiB
01:13:45:      Threads: WINDOWS_THREADS
01:13:45:   OS Version: 6.2
01:13:45:  Has Battery: false
01:13:45:   On Battery: false
01:13:45:   UTC Offset: -7
01:13:45:          PID: 2668
01:13:45:          CWD: C:/Users/Admin/AppData/Roaming/FAHClient
01:13:45:           OS: Windows 8.1
01:13:45:      OS Arch: AMD64
01:13:45:         GPUs: 2
01:13:45:        GPU 0: NVIDIA:1 G92 [Quadro FX 3700]
01:13:45:        GPU 1: NVIDIA:2 GF110 [GeForce GTX 570 HD]
01:13:45:         CUDA: 2.0
01:13:45:  CUDA Driver: 6000
01:13:45:Win32 Service: false
01:13:45:***********************************************************************
01:13:45:<config>
01:13:45:  <!-- Folding Core -->
01:13:45:  <core-priority v='low'/>
01:13:45:
01:13:45:  <!-- Network -->
01:13:45:  <proxy v=':8080'/>
01:13:45:
01:13:45:  <!-- Slot Control -->
01:13:45:  <pause-on-battery v='false'/>
01:13:45:  <power v='full'/>
01:13:45:
01:13:45:  <!-- User Information -->
01:13:45:  <user v='BR451'/>
01:13:45:
01:13:45:  <!-- Folding Slots -->
01:13:45:  <slot id='0' type='CPU'/>
01:13:45:  <slot id='2' type='GPU'/>
01:13:45:  <slot id='1' type='GPU'/>
01:13:45:</config>
01:13:45:Trying to access database...
01:13:45:Successfully acquired database lock
01:13:45:Enabled folding slot 00: READY cpu:2
01:13:45:Enabled folding slot 02: READY gpu:0:G92 [Quadro FX 3700]
01:13:45:Enabled folding slot 01: READY gpu:1:GF110 [GeForce GTX 570 HD]
01:13:45:WU00:FS00:Starting
01:13:45:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -version 704 -lifeline 2668 -checkpoint 15
01:13:45:WU00:FS00:Started FahCore on PID 4372
01:13:45:WU00:FS00:Core PID:4744
01:13:45:WU00:FS00:FahCore 0xa4 started
01:13:45:WU00:FS00:0xa4:
01:13:45:WU00:FS00:0xa4:*------------------------------*
01:13:45:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
01:13:45:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
01:13:45:WU00:FS00:0xa4:
01:13:45:WU00:FS00:0xa4:Preparing to commence simulation
01:13:45:WU00:FS00:0xa4:- Looking at optimizations...
01:13:45:WU00:FS00:0xa4:- Files status OK
01:13:45:WU00:FS00:0xa4:- Expanded 202287 -> 425556 (decompressed 210.3 percent)
01:13:45:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=202287 data_size=425556, decompressed_data_size=425556 diff=0
01:13:45:WU00:FS00:0xa4:- Digital signature verified
01:13:45:WU00:FS00:0xa4:
01:13:45:WU00:FS00:0xa4:Project: 6376 (Run 37, Clone 3, Gen 11)
01:13:45:WU03:FS01:Starting
01:13:45:WU00:FS00:0xa4:
01:13:45:WU00:FS00:0xa4:Assembly optimizations on if available.
01:13:45:WU03:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 03 -suffix 01 -version 704 -lifeline 2668 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
01:13:45:WU00:FS00:0xa4:Entering M.D.
01:13:45:WU03:FS01:Started FahCore on PID 3316
01:13:45:WU03:FS01:Core PID:4412
01:13:45:WU03:FS01:FahCore 0x17 started
01:13:46:WU01:FS02:Connecting to 171.67.108.201:80
01:13:46:WU03:FS01:0x17:*********************** Log Started 2014-07-28T01:13:46Z ***********************
01:13:46:WU03:FS01:0x17:Project: 9201 (Run 569, Clone 1, Gen 12)
01:13:46:WU03:FS01:0x17:Unit: 0x000000156652edc45399ec6414ddb981
01:13:46:WU03:FS01:0x17:CPU: 0x00000000000000000000000000000000
01:13:46:WU03:FS01:0x17:Machine: 1
01:13:46:WU03:FS01:0x17:Reading tar file state.xml
01:13:46:WU03:FS01:0x17:Reading tar file system.xml
01:13:46:WARNING:WU01:FS02:Failed to get assignment from '171.67.108.201:80': Empty work server assignment
01:13:46:WU01:FS02:Connecting to 171.64.65.160:80
01:13:46:WU03:FS01:0x17:Reading tar file integrator.xml
01:13:46:WU03:FS01:0x17:Reading tar file core.xml
01:13:46:WU03:FS01:0x17:Digital signatures verified
01:13:46:WU03:FS01:0x17:Folding@home GPU core17
01:13:46:WU03:FS01:0x17:Version 0.0.52
01:13:47:WARNING:WU01:FS02:Failed to get assignment from '171.64.65.160:80': Failed to connect to 171.64.65.160:80: No connection could be made because the target machine actively refused it.
01:13:47:ERROR:WU01:FS02:Exception: Could not get an assignment
01:13:48:WU01:FS02:Connecting to 171.67.108.201:80
01:13:48:WARNING:WU01:FS02:Failed to get assignment from '171.67.108.201:80': Empty work server assignment
01:13:48:WU01:FS02:Connecting to 171.64.65.160:80
01:13:48:16:127.0.0.1:New Web connection
01:13:51:WU00:FS00:0xa4:Using Gromacs checkpoints
01:13:51:WU00:FS00:0xa4:Mapping NT from 1 to 1 
01:13:51:WU00:FS00:0xa4:Resuming from checkpoint
01:13:51:WU00:FS00:0xa4:Verified 00/wudata_01.log
01:13:51:WU00:FS00:0xa4:Verified 00/wudata_01.trr
01:13:51:WU00:FS00:0xa4:Verified 00/wudata_01.xtc
01:13:51:WU00:FS00:0xa4:Verified 00/wudata_01.edr
01:13:51:WU00:FS00:0xa4:Completed 2075550 out of 2500000 steps  (83%)
01:13:55:WARNING:WU01:FS02:Failed to get assignment from '171.64.65.160:80': Failed to connect to 171.64.65.160:80: No connection could be made because the target machine actively refused it.
01:13:55:ERROR:WU01:FS02:Exception: Could not get an assignment
01:14:13:WU03:FS01:0x17:ERROR:exception: Error invoking kernel computeNonbonded: clEnqueueNDRangeKernel (-5)
01:14:13:WU03:FS01:0x17:Saving result file logfile_01.txt
01:14:13:WU03:FS01:0x17:Saving result file log.txt
01:14:13:WU03:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
01:14:13:WARNING:WU03:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
01:14:13:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:9201 run:569 clone:1 gen:12 core:0x17 unit:0x000000156652edc45399ec6414ddb981
01:14:13:WU03:FS01:Uploading 2.38KiB to 171.67.108.52
01:14:13:WU03:FS01:Connecting to 171.67.108.52:8080
01:14:14:WU02:FS01:Connecting to 171.67.108.201:80
01:14:14:WU02:FS01:Assigned to work server 171.67.108.52
01:14:14:WU02:FS01:Requesting new work unit for slot 01: READY gpu:1:GF110 [GeForce GTX 570 HD] from 171.67.108.52
01:14:14:WU02:FS01:Connecting to 171.67.108.52:8080
01:14:14:WU02:FS01:Downloading 1.53MiB
01:14:16:WU02:FS01:Download complete
01:14:16:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:9201 run:17 clone:1 gen:13 core:0x17 unit:0x0000001d6652edc45399d6b6ecc0fe05
01:14:16:WU02:FS01:Starting
01:14:16:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 02 -suffix 01 -version 704 -lifeline 2668 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
01:14:16:WU02:FS01:Started FahCore on PID 5004
01:14:16:WU02:FS01:Core PID:4604
01:14:16:WU02:FS01:FahCore 0x17 started
01:14:16:WU02:FS01:0x17:*********************** Log Started 2014-07-28T01:14:16Z ***********************
01:14:16:WU02:FS01:0x17:Project: 9201 (Run 17, Clone 1, Gen 13)
01:14:16:WU02:FS01:0x17:Unit: 0x0000001d6652edc45399d6b6ecc0fe05
01:14:16:WU02:FS01:0x17:CPU: 0x00000000000000000000000000000000
01:14:16:WU02:FS01:0x17:Machine: 1
01:14:16:WU02:FS01:0x17:Reading tar file state.xml
01:14:16:WU02:FS01:0x17:Reading tar file system.xml
01:14:16:WU02:FS01:0x17:Reading tar file integrator.xml
01:14:16:WU02:FS01:0x17:Reading tar file core.xml
01:14:16:WU02:FS01:0x17:Digital signatures verified
01:14:16:WU02:FS01:0x17:Folding@home GPU core17
01:14:16:WU02:FS01:0x17:Version 0.0.52
01:14:17:WU03:FS01:Upload complete
01:14:17:WU03:FS01:Server responded WORK_ACK (400)
01:14:17:WU03:FS01:Cleaning up
01:14:44:WU02:FS01:0x17:ERROR:exception: Error invoking kernel computeNonbonded: clEnqueueNDRangeKernel (-5)
01:14:44:WU02:FS01:0x17:Saving result file logfile_01.txt
01:14:44:WU02:FS01:0x17:Saving result file log.txt
01:14:44:WU02:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
01:14:45:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
01:14:45:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:9201 run:17 clone:1 gen:13 core:0x17 unit:0x0000001d6652edc45399d6b6ecc0fe05
01:14:45:WU02:FS01:Uploading 2.38KiB to 171.67.108.52
01:14:45:WU02:FS01:Connecting to 171.67.108.52:8080
01:14:45:WU02:FS01:Upload complete
01:14:45:WU03:FS01:Connecting to 171.67.108.201:80
01:14:45:WU02:FS01:Server responded WORK_ACK (400)
01:14:45:WU02:FS01:Cleaning up
01:14:46:WU03:FS01:Assigned to work server 171.67.108.52
01:14:46:WU03:FS01:Requesting new work unit for slot 01: READY gpu:1:GF110 [GeForce GTX 570 HD] from 171.67.108.52
01:14:46:WU03:FS01:Connecting to 171.67.108.52:8080
01:14:46:WU03:FS01:Downloading 1.52MiB
01:14:47:WU03:FS01:Download complete
01:14:47:WU03:FS01:Received Unit: id:03 state:DOWNLOAD error:NO_ERROR project:9201 run:420 clone:1 gen:9 core:0x17 unit:0x0000001a6652edc45399e68e5ad5519c
01:14:47:WU03:FS01:Starting
01:14:47:WU03:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_17.fah/FahCore_17.exe -dir 03 -suffix 01 -version 704 -lifeline 2668 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
01:14:48:WU03:FS01:Started FahCore on PID 4080
01:14:48:WU03:FS01:Core PID:692
01:14:48:WU03:FS01:FahCore 0x17 started
01:14:48:WU01:FS02:Connecting to 171.67.108.201:80
01:14:48:WU03:FS01:0x17:*********************** Log Started 2014-07-28T01:14:48Z ***********************
01:14:48:WU03:FS01:0x17:Project: 9201 (Run 420, Clone 1, Gen 9)
01:14:48:WU03:FS01:0x17:Unit: 0x0000001a6652edc45399e68e5ad5519c
01:14:48:WU03:FS01:0x17:CPU: 0x00000000000000000000000000000000
01:14:48:WU03:FS01:0x17:Machine: 1
01:14:48:WU03:FS01:0x17:Reading tar file state.xml
01:14:48:WU03:FS01:0x17:Reading tar file system.xml
01:14:48:WARNING:WU01:FS02:Failed to get assignment from '171.67.108.201:80': Empty work server assignment
01:14:48:WU01:FS02:Connecting to 171.64.65.160:80
01:14:48:WU03:FS01:0x17:Reading tar file integrator.xml
01:14:48:WU03:FS01:0x17:Reading tar file core.xml
01:14:48:WU03:FS01:0x17:Digital signatures verified
01:14:48:WU03:FS01:0x17:Folding@home GPU core17
01:14:48:WU03:FS01:0x17:Version 0.0.52
01:14:50:WARNING:WU01:FS02:Failed to get assignment from '171.64.65.160:80': Failed to connect to 171.64.65.160:80: No connection could be made because the target machine actively refused it.
01:14:50:ERROR:WU01:FS02:Exception: Could not get an assignment
01:15:23:WU03:FS01:0x17:ERROR:exception: Error invoking kernel computeNonbonded: clEnqueueNDRangeKernel (-5)
01:15:23:WU03:FS01:0x17:Saving result file logfile_01.txt
01:15:23:WU03:FS01:0x17:Saving result file log.txt
01:15:23:WU03:FS01:0x17:Folding@home Core Shutdown: BAD_WORK_UNIT
01:15:23:WARNING:WU03:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
01:15:24:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:9201 run:420 clone:1 gen:9 core:0x17 unit:0x0000001a6652edc45399e68e5ad5519c
01:15:24:WU03:FS01:Uploading 2.38KiB to 171.67.108.52
01:15:24:WU03:FS01:Connecting to 171.67.108.52:8080
01:15:24:WU03:FS01:Upload complete
01:15:24:WU03:FS01:Server responded WORK_ACK (400)
01:15:24:WU03:FS01:Cleaning up
01:15:24:WU02:FS01:Connecting to 171.67.108.201:80
01:15:25:WU02:FS01:Assigned to work server 171.64.65.105
01:15:25:WU02:FS01:Requesting new work unit for slot 01: READY gpu:1:GF110 [GeForce GTX 570 HD] from 171.64.65.105
01:15:25:WU02:FS01:Connecting to 171.64.65.105:8080
01:15:25:WU02:FS01:Downloading 122.78KiB
01:15:26:WU02:FS01:Download complete
01:15:26:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:7620 run:714 clone:0 gen:293 core:0x15 unit:0x0000018a664f2dd14e42f7361e9da432
01:15:26:WU02:FS01:Starting
01:15:26:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 02 -suffix 01 -version 704 -lifeline 2668 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
01:15:26:WU02:FS01:Started FahCore on PID 1228
01:15:26:WU02:FS01:Core PID:4620
01:15:26:WU02:FS01:FahCore 0x15 started
01:15:27:WU02:FS01:0x15:
01:15:27:WU02:FS01:0x15:*------------------------------*
01:15:27:WU02:FS01:0x15:Folding@Home GPU Core
01:15:27:WU02:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
01:15:27:WU02:FS01:0x15:Build host             AmoebaRemote
01:15:27:WU02:FS01:0x15:Board Type             NVIDIA/CUDA
01:15:27:WU02:FS01:0x15:Core                   15
01:15:27:WU02:FS01:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
01:15:27:WU02:FS01:0x15:
01:15:27:WU02:FS01:0x15:Window's signal control handler registered.
01:15:27:WU02:FS01:0x15:Preparing to commence simulation
01:15:27:WU02:FS01:0x15:- Looking at optimizations...
01:15:27:WU02:FS01:0x15:DeleteFrameFiles: successfully deleted file=02/wudata_01.ckp
01:15:27:WU02:FS01:0x15:- Created dyn
01:15:27:WU02:FS01:0x15:- Files status OK
01:15:27:WU02:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
01:15:27:WU02:FS01:0x15:- Expanded 125210 -> 501826 (decompressed 400.7 percent)
01:15:27:WU02:FS01:0x15:Called DecompressByteArray: compressed_data_size=125210 data_size=501826, decompressed_data_size=501826 diff=0
01:15:27:WU02:FS01:0x15:- Digital signature verified
01:15:27:WU02:FS01:0x15:
01:15:27:WU02:FS01:0x15:Project: 7620 (Run 714, Clone 0, Gen 293)
01:15:27:WU02:FS01:0x15:
01:15:27:WU02:FS01:0x15:Assembly optimizations on if available.
01:15:27:WU02:FS01:0x15:Entering M.D.
01:15:29:WU02:FS01:0x15:Tpr hash 02/wudata_01.tpr:  2589215778 534247427 1777015339 3401230602 3183283660
01:15:29:WU02:FS01:0x15:GPU device id=1
01:15:29:WU02:FS01:0x15:Working on Protein
01:15:29:WU02:FS01:0x15:Client config unavailable.
01:15:29:WU02:FS01:0x15:Starting GUI Server
01:16:25:WU01:FS02:Connecting to 171.67.108.201:80
01:16:25:WARNING:WU01:FS02:Failed to get assignment from '171.67.108.201:80': Empty work server assignment
01:16:25:WU01:FS02:Connecting to 171.64.65.160:80
01:16:27:WARNING:WU01:FS02:Failed to get assignment from '171.64.65.160:80': Failed to connect to 171.64.65.160:80: No connection could be made because the target machine actively refused it.
01:16:27:ERROR:WU01:FS02:Exception: Could not get an assignment
01:16:40:WU02:FS01:0x15:Finished fah_main status=59
01:16:40:WU02:FS01:0x15:mdrun_gpu returned 59
01:16:40:WU02:FS01:0x15:GPU memtest failure
01:16:40:WU02:FS01:0x15:
01:16:40:WU02:FS01:0x15:Folding@home Core Shutdown: GPU_MEMTEST_ERROR
01:16:41:WARNING:WU02:FS01:FahCore returned: GPU_MEMTEST_ERROR (124 = 0x7c)
01:16:41:WU02:FS01:Starting
01:16:41:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Admin/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_15.fah/FahCore_15.exe -dir 02 -suffix 01 -version 704 -lifeline 2668 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
01:16:41:WU02:FS01:Started FahCore on PID 4352
01:16:41:WU02:FS01:Core PID:5088
01:16:41:WU02:FS01:FahCore 0x15 started
01:16:42:WU02:FS01:0x15:
01:16:42:WU02:FS01:0x15:*------------------------------*
01:16:42:WU02:FS01:0x15:Folding@Home GPU Core
01:16:42:WU02:FS01:0x15:Version                2.25 (Wed May 9 17:03:01 EDT 2012)
01:16:42:WU02:FS01:0x15:Build host             AmoebaRemote
01:16:42:WU02:FS01:0x15:Board Type             NVIDIA/CUDA
01:16:42:WU02:FS01:0x15:Core                   15
01:16:42:WU02:FS01:0x15:GPU device info vendor=0 device=0 name=NA match=0 deviceId=1
01:16:42:WU02:FS01:0x15:
01:16:42:WU02:FS01:0x15:Window's signal control handler registered.
01:16:42:WU02:FS01:0x15:Preparing to commence simulation
01:16:42:WU02:FS01:0x15:- Looking at optimizations...
01:16:42:WU02:FS01:0x15:DeleteFrameFiles: successfully deleted file=02/wudata_01.ckp
01:16:42:WU02:FS01:0x15:- Created dyn
01:16:42:WU02:FS01:0x15:- Files status OK
01:16:42:WU02:FS01:0x15:sizeof(CORE_PACKET_HDR) = 512 file=<>
01:16:42:WU02:FS01:0x15:- Expanded 125210 -> 501826 (decompressed 400.7 percent)
01:16:42:WU02:FS01:0x15:Called DecompressByteArray: compressed_data_size=125210 data_size=501826, decompressed_data_size=501826 diff=0
01:16:42:WU02:FS01:0x15:- Digital signature verified
01:16:42:WU02:FS01:0x15:
01:16:42:WU02:FS01:0x15:Project: 7620 (Run 714, Clone 0, Gen 293)
01:16:42:WU02:FS01:0x15:
01:16:42:WU02:FS01:0x15:Assembly optimizations on if available.
01:16:42:WU02:FS01:0x15:Entering M.D.
01:16:45:WU02:FS01:0x15:Tpr hash 02/wudata_01.tpr:  2589215778 534247427 1777015339 3401230602 3183283660
01:16:45:WU02:FS01:0x15:GPU device id=1
01:16:45:WU02:FS01:0x15:Working on Protein
01:16:45:WU02:FS01:0x15:Client config unavailable.
01:16:45:WU02:FS01:0x15:Starting GUI Server
Napoleon
Posts: 887
Joined: Wed May 26, 2010 2:31 pm
Hardware configuration: Atom330 (overclocked):
Windows 7 Ultimate 64bit
Intel Atom330 dualcore (4 HyperThreads)
NVidia GT430, core_15 work
2x2GB Kingston KVR1333D3N9K2/4G 1333MHz memory kit
Asus AT3IONT-I Deluxe motherboard
Location: Finland

Re: Cpus running, 2 GPUs failing

Post by Napoleon »

I'm pretty sure the autodetect is getting the gpu-index/cuda-index/opencl-index values mixed up once again. Try setting them manually for each GPU slot; see viewtopic.php?f=67&t=19989&p=199379#p199379 for instructions.
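
For anyone reading later, here is a minimal sketch of what manually set indexes might look like in the Folding Slots section of config.xml. The slot ids and GPU assignments follow the config and GPU list printed in the log above. The index values are placeholders (assumptions, not the values that were confirmed on this machine), because the right numbers depend on the order in which the driver enumerates each card for CUDA and OpenCL, which can differ from the client's own GPU order.

```xml
<config>
  <!-- Folding Slots -->
  <slot id='0' type='CPU'/>

  <!-- Slot 1: GeForce GTX 570 (listed as GPU 1 in the log).
       Index values are placeholders, not verified for this machine. -->
  <slot id='1' type='GPU'>
    <gpu-index v='1'/>
    <cuda-index v='0'/>
    <opencl-index v='0'/>
  </slot>

  <!-- Slot 2: Quadro FX 3700 (listed as GPU 0 in the log).
       Placeholder values again. -->
  <slot id='2' type='GPU'>
    <gpu-index v='0'/>
    <cuda-index v='1'/>
    <opencl-index v='1'/>
  </slot>
</config>
```

These are the same three per-slot values that default to -1 (autodetect). If a slot still fails after setting them, swapping the cuda-index/opencl-index values between the two GPU slots is a reasonable next thing to try.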
gdeckn7
Posts: 4
Joined: Sun Jul 27, 2014 3:06 pm

Re: Cpus running, 2 GPUs failing

Post by gdeckn7 »

Thanks, manually setting the gpu-index versus cuda-index/opencl-index values fixed it.