Re: Failing to get WU today work server refuses [171.67.108.

Posted: Mon Apr 30, 2018 12:39 am
by Aurum
SteveWillis wrote:systemctl stop FAHClient
sleep 2
pkill -e -9 FahCore
pkill -e -9 FAHClient
systemctl restart FAHClient
As a Linux neophyte I've been wondering how to stop running FAH. You said script: do you save this so that it executes automatically, or do you execute it line by line in Terminal?
I had 5 rigs sitting idle this morning. They're headless Linux rigs, so I just cycle the power. TeamViewer 13 for Linux works barely 10% of the time; the rest of the time it shows just a black screen, if it can detect a connection at all.

Re: Failing to get WU today work server refuses [171.67.108.

Posted: Mon Apr 30, 2018 6:02 am
by Nick200
Hi

This issue does not appear to have been fixed. Here's what I am seeing in the log for one of my two stalled rigs (this one has two GTX 1080 Ti cards):

Code:

*********************** Log Started 2018-04-30T05:16:24Z ***********************
05:16:24:************************* Folding@home Client *************************
05:16:24:        Website: http://folding.stanford.edu/
05:16:24:      Copyright: (c) 2009-2016 Stanford University
05:16:24:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
05:16:24:           Args: 
05:16:24:         Config: C:\Users\nickm\AppData\Roaming\FAHClient\config.xml
05:16:24:******************************** Build ********************************
05:16:24:        Version: 7.4.16
05:16:24:           Date: Jan 6 2017
05:16:24:           Time: 00:25:14
05:16:24:     Repository: Git
05:16:24:       Revision: a9e9e27dc2ee6ff01398c439677bc27f6cb74032
05:16:24:         Branch: master
05:16:24:       Compiler: Visual C++ 2008
05:16:24:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox -arch:SSE /MT
05:16:24:       Platform: win32 10
05:16:24:           Bits: 32
05:16:24:           Mode: Release
05:16:24:******************************* System ********************************
05:16:24:            CPU: Intel(R) Core(TM) i5-8600K CPU @ 3.60GHz
05:16:24:         CPU ID: GenuineIntel Family 6 Model 158 Stepping 10
05:16:24:           CPUs: 6
05:16:24:         Memory: 15.88GiB
05:16:24:    Free Memory: 11.42GiB
05:16:24:        Threads: WINDOWS_THREADS
05:16:24:     OS Version: 6.2
05:16:24:    Has Battery: false
05:16:24:     On Battery: false
05:16:24:     UTC Offset: 12
05:16:24:            PID: 15768
05:16:24:            CWD: C:\Users\nickm\AppData\Roaming\FAHClient
05:16:24:             OS: Windows 10 Pro
05:16:24:        OS Arch: AMD64
05:16:24:           GPUs: 2
05:16:24:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:7 GP102 [GeForce GTX 1080 Ti] 11380
05:16:24:          GPU 1: Bus:2 Slot:0 Func:0 NVIDIA:7 GP102 [GeForce GTX 1080 Ti] 11380
05:16:24:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:6.1 Driver:9.2
05:16:24:  CUDA Device 1: Platform:0 Device:1 Bus:2 Slot:0 Compute:6.1 Driver:9.2
05:16:24:OpenCL Device 0: Platform:0 Device:0 Bus:NA Slot:NA Compute:2.1 Driver:23.20
05:16:24:  Win32 Service: false
05:16:24:***********************************************************************
05:16:24:<config>
05:16:24:  <!-- Network -->
05:16:24:  <proxy v=':8080'/>
05:16:24:
05:16:24:  <!-- Slot Control -->
05:16:24:  <power v='full'/>
05:16:24:
05:16:24:  <!-- User Information -->
05:16:24:  <passkey v='********************************'/>
05:16:24:  <team v='142900'/>
05:16:24:  <user v='Montague-Cripps'/>
05:16:24:
05:16:24:  <!-- Folding Slots -->
05:16:24:  <slot id='0' type='CPU'/>
05:16:24:  <slot id='1' type='GPU'>
05:16:24:    <client-type v='advanced'/>
05:16:24:    <gpu-index v='0'/>
05:16:24:  </slot>
05:16:24:  <slot id='2' type='GPU'>
05:16:24:    <client-type v='advanced'/>
05:16:24:    <gpu-index v='1'/>
05:16:24:  </slot>
05:16:24:</config>
05:16:24:Trying to access database...
05:16:24:Successfully acquired database lock
05:16:24:Enabled folding slot 00: READY cpu:4
05:16:24:Enabled folding slot 01: READY gpu:0:GP102 [GeForce GTX 1080 Ti] 11380
05:16:24:Enabled folding slot 02: READY gpu:1:GP102 [GeForce GTX 1080 Ti] 11380
05:16:24:WU00:FS00:Starting
05:16:24:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\nickm\AppData\Roaming\FAHClient\cores/fahwebx.stanford.edu/cores/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 00 -suffix 01 -version 704 -lifeline 15768 -checkpoint 15 -np 4
05:16:24:WU00:FS00:Started FahCore on PID 16040
05:16:25:WU00:FS00:Core PID:16064
05:16:25:WU00:FS00:FahCore 0xa4 started
05:16:25:WU01:FS01:Connecting to 171.67.108.45:80
05:16:25:WU02:FS02:Connecting to 171.67.108.45:80
05:16:25:WU00:FS00:0xa4:
05:16:25:WU00:FS00:0xa4:*------------------------------*
05:16:25:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
05:16:25:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
05:16:25:WU00:FS00:0xa4:
05:16:25:WU00:FS00:0xa4:Preparing to commence simulation
05:16:25:WU00:FS00:0xa4:- Ensuring status. Please wait.
05:16:34:WU00:FS00:0xa4:- Looking at optimizations...
05:16:34:WU00:FS00:0xa4:- Working with standard loops on this execution.
05:16:34:WU00:FS00:0xa4:- Previous termination of core was improper.
05:16:34:WU00:FS00:0xa4:- Files status OK
05:16:34:WU00:FS00:0xa4:- Expanded 343462 -> 549472 (decompressed 159.9 percent)
05:16:34:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=343462 data_size=549472, decompressed_data_size=549472 diff=0
05:16:34:WU00:FS00:0xa4:- Digital signature verified
05:16:34:WU00:FS00:0xa4:
05:16:34:WU00:FS00:0xa4:Project: 8658 (Run 78, Clone 0, Gen 86)
05:16:34:WU00:FS00:0xa4:
05:16:34:WU00:FS00:0xa4:Entering M.D.
05:16:40:WU00:FS00:0xa4:Using Gromacs checkpoints
05:16:40:WU00:FS00:0xa4:Mapping NT from 4 to 4 
05:16:40:WU00:FS00:0xa4:Resuming from checkpoint
05:16:40:WU00:FS00:0xa4:Verified 00/wudata_01.log
05:16:40:WU00:FS00:0xa4:Verified 00/wudata_01.trr
05:16:40:WU00:FS00:0xa4:Verified 00/wudata_01.xtc
05:16:40:WU00:FS00:0xa4:Verified 00/wudata_01.edr
05:16:40:WU00:FS00:0xa4:Completed 2045410 out of 2500000 steps  (81%)
05:17:05:WU00:FS00:0xa4:Completed 2050000 out of 2500000 steps  (82%)
05:17:26:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:17:26:WU01:FS01:Connecting to 171.64.65.35:80
05:17:26:WARNING:WU02:FS02:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:17:26:WU02:FS02:Connecting to 171.64.65.35:80
05:18:27:WARNING:WU02:FS02:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:18:27:ERROR:WU02:FS02:Exception: Could not get an assignment
05:18:27:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:18:27:ERROR:WU01:FS01:Exception: Could not get an assignment
05:18:27:WU02:FS02:Connecting to 171.67.108.45:80
05:18:27:WU01:FS01:Connecting to 171.67.108.45:80
05:19:16:WU00:FS00:0xa4:Completed 2075000 out of 2500000 steps  (83%)
05:19:28:WARNING:WU02:FS02:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:19:28:WU02:FS02:Connecting to 171.64.65.35:80
05:19:28:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:19:28:WU01:FS01:Connecting to 171.64.65.35:80
05:20:29:WARNING:WU02:FS02:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:20:29:ERROR:WU02:FS02:Exception: Could not get an assignment
05:20:29:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:20:29:ERROR:WU01:FS01:Exception: Could not get an assignment
05:20:29:WU02:FS02:Connecting to 171.67.108.45:80
05:20:29:WU01:FS01:Connecting to 171.67.108.45:80
05:21:30:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:21:30:WU01:FS01:Connecting to 171.64.65.35:80
05:21:30:WARNING:WU02:FS02:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:21:30:WU02:FS02:Connecting to 171.64.65.35:80
05:21:31:WU00:FS00:0xa4:Completed 2100000 out of 2500000 steps  (84%)
05:22:31:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:22:31:ERROR:WU01:FS01:Exception: Could not get an assignment
05:22:31:WARNING:WU02:FS02:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:22:31:ERROR:WU02:FS02:Exception: Could not get an assignment
05:22:31:WU01:FS01:Connecting to 171.67.108.45:80
05:22:31:WU02:FS02:Connecting to 171.67.108.45:80
05:23:32:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:23:32:WU01:FS01:Connecting to 171.64.65.35:80
05:23:32:WARNING:WU02:FS02:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:23:32:WU02:FS02:Connecting to 171.64.65.35:80
05:23:32:WU00:FS00:0xa4:Completed 2125000 out of 2500000 steps  (85%)
05:24:34:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:24:34:ERROR:WU01:FS01:Exception: Could not get an assignment
05:24:35:WARNING:WU02:FS02:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:24:35:ERROR:WU02:FS02:Exception: Could not get an assignment
05:25:08:WU01:FS01:Connecting to 171.67.108.45:80
05:25:09:WU02:FS02:Connecting to 171.67.108.45:80
05:25:39:WU00:FS00:0xa4:Completed 2150000 out of 2500000 steps  (86%)
05:26:10:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:26:10:WU01:FS01:Connecting to 171.64.65.35:80
05:26:10:WARNING:WU02:FS02:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:26:10:WU02:FS02:Connecting to 171.64.65.35:80
05:27:12:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:27:12:ERROR:WU01:FS01:Exception: Could not get an assignment
05:27:12:WARNING:WU02:FS02:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:27:12:ERROR:WU02:FS02:Exception: Could not get an assignment
05:27:50:WU00:FS00:0xa4:Completed 2175000 out of 2500000 steps  (87%)
05:29:23:WU01:FS01:Connecting to 171.67.108.45:80
05:29:23:WU02:FS02:Connecting to 171.67.108.45:80
05:30:02:WU00:FS00:0xa4:Completed 2200000 out of 2500000 steps  (88%)
05:30:24:WARNING:WU02:FS02:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:30:24:WU02:FS02:Connecting to 171.64.65.35:80
05:30:24:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:30:24:WU01:FS01:Connecting to 171.64.65.35:80
05:31:25:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:31:25:ERROR:WU01:FS01:Exception: Could not get an assignment
05:31:25:WARNING:WU02:FS02:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:31:25:ERROR:WU02:FS02:Exception: Could not get an assignment
05:32:29:WU00:FS00:0xa4:Completed 2225000 out of 2500000 steps  (89%)
05:34:53:WU00:FS00:0xa4:Completed 2250000 out of 2500000 steps  (90%)
05:36:14:WU01:FS01:Connecting to 171.67.108.45:80
05:36:14:WU02:FS02:Connecting to 171.67.108.45:80
05:37:07:WU00:FS00:0xa4:Completed 2275000 out of 2500000 steps  (91%)
05:37:17:WARNING:WU02:FS02:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:37:17:WU02:FS02:Connecting to 171.64.65.35:80
05:37:17:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:37:17:WU01:FS01:Connecting to 171.64.65.35:80
05:38:19:WARNING:WU02:FS02:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:38:19:ERROR:WU02:FS02:Exception: Could not get an assignment
05:38:19:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:38:19:ERROR:WU01:FS01:Exception: Could not get an assignment
05:39:08:WU00:FS00:0xa4:Completed 2300000 out of 2500000 steps  (92%)
05:41:08:WU00:FS00:0xa4:Completed 2325000 out of 2500000 steps  (93%)
05:43:10:WU00:FS00:0xa4:Completed 2350000 out of 2500000 steps  (94%)
05:45:08:WU00:FS00:0xa4:Completed 2375000 out of 2500000 steps  (95%)
05:47:07:WU00:FS00:0xa4:Completed 2400000 out of 2500000 steps  (96%)
05:47:20:WU01:FS01:Connecting to 171.67.108.45:80
05:47:20:WU02:FS02:Connecting to 171.67.108.45:80
05:48:21:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:48:21:WU01:FS01:Connecting to 171.64.65.35:80
05:48:21:WARNING:WU02:FS02:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
05:48:21:WU02:FS02:Connecting to 171.64.65.35:80
05:49:06:WU00:FS00:0xa4:Completed 2425000 out of 2500000 steps  (97%)
05:49:23:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:49:23:ERROR:WU01:FS01:Exception: Could not get an assignment
05:49:23:WARNING:WU02:FS02:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
05:49:23:ERROR:WU02:FS02:Exception: Could not get an assignment
05:51:06:WU00:FS00:0xa4:Completed 2450000 out of 2500000 steps  (98%)
05:53:06:WU00:FS00:0xa4:Completed 2475000 out of 2500000 steps  (99%)
05:53:07:WU03:FS00:Connecting to 171.67.108.45:8080
05:53:08:WU03:FS00:Assigned to work server 171.67.108.158
05:53:08:WU03:FS00:Requesting new work unit for slot 00: RUNNING cpu:4 from 171.67.108.158
05:53:08:WU03:FS00:Connecting to 171.67.108.158:8080
05:53:09:WU03:FS00:Downloading 807.75KiB
05:53:12:WU03:FS00:Download complete
05:53:12:WU03:FS00:Received Unit: id:03 state:DOWNLOAD error:NO_ERROR project:9040 run:665 clone:1 gen:923 core:0xa4 unit:0x000003efab436c9e56e9d861d8584307
05:55:05:WU00:FS00:0xa4:Completed 2500000 out of 2500000 steps  (100%)
05:55:05:WU00:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
05:55:15:WU00:FS00:0xa4:
05:55:15:WU00:FS00:0xa4:Finished Work Unit:
05:55:15:WU00:FS00:0xa4:- Reading up to 1953936 from "00/wudata_01.trr": Read 1953936
05:55:15:WU00:FS00:0xa4:trr file hash check passed.
05:55:15:WU00:FS00:0xa4:- Reading up to 2463392 from "00/wudata_01.xtc": Read 2463392
05:55:15:WU00:FS00:0xa4:xtc file hash check passed.
05:55:15:WU00:FS00:0xa4:edr file hash check passed.
05:55:15:WU00:FS00:0xa4:logfile size: 56958
05:55:15:WU00:FS00:0xa4:Leaving Run
05:55:20:WU00:FS00:0xa4:- Writing 4504846 bytes of core data to disk...
05:55:20:WU00:FS00:0xa4:Done: 4504334 -> 4361681 (compressed to 96.8 percent)
05:55:20:WU00:FS00:0xa4:  ... Done.
05:55:20:WU00:FS00:0xa4:- Shutting down core
05:55:20:WU00:FS00:0xa4:
05:55:20:WU00:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
05:55:20:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
05:55:20:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:8658 run:78 clone:0 gen:86 core:0xa4 unit:0x0000005b0002894c57fe64b4af16eff2
05:55:20:WU00:FS00:Uploading 4.16MiB to 155.247.166.220
05:55:20:WU00:FS00:Connecting to 155.247.166.220:8080
05:55:21:WU03:FS00:Starting
05:55:21:WU03:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\nickm\AppData\Roaming\FAHClient\cores/fahwebx.stanford.edu/cores/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 03 -suffix 01 -version 704 -lifeline 15768 -checkpoint 15 -np 4
05:55:21:WU03:FS00:Started FahCore on PID 15536
05:55:21:WU03:FS00:Core PID:356
05:55:21:WU03:FS00:FahCore 0xa4 started
05:55:21:WU03:FS00:0xa4:
05:55:21:WU03:FS00:0xa4:*------------------------------*
05:55:21:WU03:FS00:0xa4:Folding@Home Gromacs GB Core
05:55:21:WU03:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
05:55:21:WU03:FS00:0xa4:
05:55:21:WU03:FS00:0xa4:Preparing to commence simulation
05:55:21:WU03:FS00:0xa4:- Looking at optimizations...
05:55:21:WU03:FS00:0xa4:- Created dyn
05:55:21:WU03:FS00:0xa4:- Files status OK
05:55:21:WU03:FS00:0xa4:- Expanded 826620 -> 1402440 (decompressed 169.6 percent)
05:55:21:WU03:FS00:0xa4:Called DecompressByteArray: compressed_data_size=826620 data_size=1402440, decompressed_data_size=1402440 diff=0
05:55:21:WU03:FS00:0xa4:- Digital signature verified
05:55:21:WU03:FS00:0xa4:
05:55:21:WU03:FS00:0xa4:Project: 9040 (Run 665, Clone 1, Gen 923)
05:55:21:WU03:FS00:0xa4:
05:55:21:WU03:FS00:0xa4:Assembly optimizations on if available.
05:55:21:WU03:FS00:0xa4:Entering M.D.
05:55:24:WU00:FS00:Upload complete
05:55:24:WU00:FS00:Server responded WORK_ACK (400)
05:55:24:WU00:FS00:Final credit estimate, 1386.00 points
05:55:24:WU00:FS00:Cleaning up
05:55:27:WU03:FS00:0xa4:Mapping NT from 4 to 4 
05:55:27:WU03:FS00:0xa4:Completed 0 out of 250000 steps  (0%)
05:56:40:WU03:FS00:0xa4:Completed 2500 out of 250000 steps  (1%)
05:57:52:WU03:FS00:0xa4:Completed 5000 out of 250000 steps  (2%)
Oddly enough, this problem only affects two of my four folding rigs - and has been the same since the assignment server problems late last week.

If it helps, here's the log from the other affected machine:

Code:

*********************** Log Started 2018-04-30T06:14:17Z ***********************
06:14:17:************************* Folding@home Client *************************
06:14:17:        Website: http://folding.stanford.edu/
06:14:17:      Copyright: (c) 2009-2016 Stanford University
06:14:17:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
06:14:17:           Args: --open-web-control
06:14:17:         Config: C:\Users\Nick Montague\AppData\Roaming\FAHClient\config.xml
06:14:17:******************************** Build ********************************
06:14:17:        Version: 7.4.16
06:14:17:           Date: Jan 6 2017
06:14:17:           Time: 00:25:14
06:14:17:     Repository: Git
06:14:17:       Revision: a9e9e27dc2ee6ff01398c439677bc27f6cb74032
06:14:17:         Branch: master
06:14:17:       Compiler: Visual C++ 2008
06:14:17:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox -arch:SSE /MT
06:14:17:       Platform: win32 10
06:14:17:           Bits: 32
06:14:17:           Mode: Release
06:14:17:******************************* System ********************************
06:14:17:            CPU: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
06:14:17:         CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
06:14:17:           CPUs: 8
06:14:17:         Memory: 15.89GiB
06:14:17:    Free Memory: 9.68GiB
06:14:17:        Threads: WINDOWS_THREADS
06:14:17:     OS Version: 6.2
06:14:17:    Has Battery: false
06:14:17:     On Battery: false
06:14:17:     UTC Offset: 12
06:14:17:            PID: 125444
06:14:17:            CWD: C:\Users\Nick Montague\AppData\Roaming\FAHClient
06:14:17:             OS: Windows 10 Pro
06:14:17:        OS Arch: AMD64
06:14:17:           GPUs: 2
06:14:17:          GPU 0: Bus:6 Slot:0 Func:0 NVIDIA:7 GP104 [GeForce GTX 1080] 8873
06:14:17:          GPU 1: Bus:1 Slot:0 Func:0 NVIDIA:7 GP104 [GeForce GTX 1080] 8873
06:14:17:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:6.1 Driver:9.2
06:14:17:  CUDA Device 1: Platform:0 Device:1 Bus:6 Slot:0 Compute:6.1 Driver:9.2
06:14:17:OpenCL Device 0: Platform:0 Device:0 Bus:NA Slot:NA Compute:1.2 Driver:20.19
06:14:17:  Win32 Service: false
06:14:17:***********************************************************************
06:14:17:<config>
06:14:17:  <!-- Network -->
06:14:17:  <proxy v=':8080'/>
06:14:17:
06:14:17:  <!-- Slot Control -->
06:14:17:  <power v='full'/>
06:14:17:
06:14:17:  <!-- User Information -->
06:14:17:  <passkey v='********************************'/>
06:14:17:  <team v='142900'/>
06:14:17:  <user v='Montague-Cripps'/>
06:14:17:
06:14:17:  <!-- Folding Slots -->
06:14:17:  <slot id='0' type='CPU'>
06:14:17:    <cpus v='6'/>
06:14:17:    <paused v='true'/>
06:14:17:  </slot>
06:14:17:  <slot id='1' type='GPU'>
06:14:17:    <client-type v='advanced'/>
06:14:17:    <paused v='true'/>
06:14:17:  </slot>
06:14:17:  <slot id='2' type='GPU'>
06:14:17:    <client-type v='advanced'/>
06:14:17:    <paused v='true'/>
06:14:17:  </slot>
06:14:17:</config>
06:14:17:Trying to access database...
06:14:17:Successfully acquired database lock
06:14:17:Enabled folding slot 00: PAUSED cpu:6 (by user)
06:14:17:Enabled folding slot 01: PAUSED gpu:0:GP104 [GeForce GTX 1080] 8873 (by user)
06:14:17:Enabled folding slot 02: PAUSED gpu:1:GP104 [GeForce GTX 1080] 8873 (by user)
06:14:20:9:127.0.0.1:New Web connection
06:14:23:25:127.0.0.1:New Web connection
06:14:30:FS00:Unpaused
06:14:30:FS01:Unpaused
06:14:30:FS02:Unpaused
06:14:30:WU00:FS00:Starting
06:14:30:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "C:\Users\Nick Montague\AppData\Roaming\FAHClient\cores/fahwebx.stanford.edu/cores/Win32/AMD64/AVX/Core_a7.fah/FahCore_a7.exe" -dir 00 -suffix 01 -version 704 -lifeline 125444 -checkpoint 15 -np 6
06:14:30:WU00:FS00:Started FahCore on PID 138228
06:14:30:WU00:FS00:Core PID:137240
06:14:30:WU00:FS00:FahCore 0xa7 started
06:14:30:WU01:FS01:Connecting to 171.67.108.45:80
06:14:30:WU02:FS02:Connecting to 171.67.108.45:80
06:14:30:WU00:FS00:0xa7:*********************** Log Started 2018-04-30T06:14:30Z ***********************
06:14:30:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
06:14:30:WU00:FS00:0xa7:       Type: 0xa7
06:14:30:WU00:FS00:0xa7:       Core: Gromacs
06:14:30:WU00:FS00:0xa7:    Website: http://folding.stanford.edu/
06:14:30:WU00:FS00:0xa7:  Copyright: (c) 2009-2016 Stanford University
06:14:30:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
06:14:30:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 704 -lifeline 138228 -checkpoint 15 -np
06:14:30:WU00:FS00:0xa7:             6
06:14:30:WU00:FS00:0xa7:     Config: <none>
06:14:30:WU00:FS00:0xa7:************************************ Build *************************************
06:14:30:WU00:FS00:0xa7:    Version: 0.0.16
06:14:30:WU00:FS00:0xa7:       Date: Oct 31 2017
06:14:30:WU00:FS00:0xa7:       Time: 14:04:33
06:14:30:WU00:FS00:0xa7: Repository: Git
06:14:30:WU00:FS00:0xa7:   Revision: 2f0a8a3d0b0698be48154fe99a0216f289060932
06:14:30:WU00:FS00:0xa7:     Branch: master
06:14:30:WU00:FS00:0xa7:   Compiler: Visual C++ 2008
06:14:30:WU00:FS00:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
06:14:30:WU00:FS00:0xa7:   Platform: win32 10
06:14:30:WU00:FS00:0xa7:       Bits: 64
06:14:30:WU00:FS00:0xa7:       Mode: Release
06:14:30:WU00:FS00:0xa7:       SIMD: avx_256
06:14:30:WU00:FS00:0xa7:************************************ System ************************************
06:14:30:WU00:FS00:0xa7:        CPU: Unknown
06:14:30:WU00:FS00:0xa7:     CPU ID: 
06:14:30:WU00:FS00:0xa7:       CPUs: 8
06:14:30:WU00:FS00:0xa7:     Memory: 15.89GiB
06:14:30:WU00:FS00:0xa7:Free Memory: 9.35GiB
06:14:30:WU00:FS00:0xa7:    Threads: WINDOWS_THREADS
06:14:30:WU00:FS00:0xa7: OS Version: 6.2
06:14:30:WU00:FS00:0xa7:Has Battery: false
06:14:30:WU00:FS00:0xa7: On Battery: false
06:14:30:WU00:FS00:0xa7: UTC Offset: 12
06:14:30:WU00:FS00:0xa7:        PID: 137240
06:14:30:WU00:FS00:0xa7:        CWD: C:\Users\Nick Montague\AppData\Roaming\FAHClient\work
06:14:30:WU00:FS00:0xa7:         OS: Windows 10 Pro
06:14:30:WU00:FS00:0xa7:    OS Arch: AMD64
06:14:30:WU00:FS00:0xa7:********************************************************************************
06:14:30:WU00:FS00:0xa7:Project: 13812 (Run 0, Clone 154, Gen 77)
06:14:30:WU00:FS00:0xa7:Unit: 0x0000005180fccb025a98193398e28eef
06:14:30:WU00:FS00:0xa7:Digital signatures verified
06:14:30:WU00:FS00:0xa7:Calling: mdrun -s frame77.tpr -o frame77.trr -x frame77.xtc -cpi state.cpt -cpt 15 -nt 6
06:14:30:WU00:FS00:0xa7:Steps: first=19250000 total=250000
06:14:31:WU01:FS01:Assigned to work server 140.163.4.231
06:14:31:WU01:FS01:Requesting new work unit for slot 01: READY gpu:0:GP104 [GeForce GTX 1080] 8873 from 140.163.4.231
06:14:31:WU01:FS01:Connecting to 140.163.4.231:8080
06:14:32:WU00:FS00:0xa7:Completed 107794 out of 250000 steps (43%)
06:14:33:WU02:FS02:Assigned to work server 155.247.166.219
06:14:33:WU02:FS02:Requesting new work unit for slot 02: READY gpu:1:GP104 [GeForce GTX 1080] 8873 from 155.247.166.219
06:14:33:WU02:FS02:Connecting to 155.247.166.219:8080
06:14:33:WU01:FS01:Downloading 16.52MiB
06:14:35:WU02:FS02:Downloading 903.67KiB
06:14:38:WU02:FS02:Download complete
06:14:38:WU02:FS02:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:13783 run:608 clone:3 gen:164 core:0x21 unit:0x000000ba0002894b5a833fc3ea868f08
06:14:38:WU02:FS02:Starting
06:14:38:ERROR:WU02:FS02:Failed to start core: OpenCL device matching slot 2 not found, try setting 'opencl-index' manually
06:14:38:WU02:FS02:Starting
06:14:38:ERROR:WU02:FS02:Failed to start core: OpenCL device matching slot 2 not found, try setting 'opencl-index' manually
06:14:39:WU01:FS01:Download 10.97%
06:14:45:WU01:FS01:Download 29.51%
06:14:51:WU01:FS01:Download 69.24%
06:14:55:WU01:FS01:Download complete
06:14:55:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11713 run:5 clone:335 gen:18 core:0x21 unit:0x000000178ca304e75adf743009dc6bfa
06:14:55:WU01:FS01:Starting
06:14:55:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
06:14:56:WU01:FS01:Starting
06:14:56:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
06:15:18:Removing old file 'configs/config-20180329-062952.xml'
06:15:18:Saving configuration to config.xml
06:15:18:<config>
06:15:18:  <!-- Network -->
06:15:18:  <proxy v=':8080'/>
06:15:18:
06:15:18:  <!-- Slot Control -->
06:15:18:  <power v='full'/>
06:15:18:
06:15:18:  <!-- User Information -->
06:15:18:  <passkey v='********************************'/>
06:15:18:  <team v='142900'/>
06:15:18:  <user v='Montague-Cripps'/>
06:15:18:
06:15:18:  <!-- Folding Slots -->
06:15:18:  <slot id='0' type='CPU'>
06:15:18:    <cpus v='6'/>
06:15:18:  </slot>
06:15:18:  <slot id='1' type='GPU'>
06:15:18:    <client-type v='advanced'/>
06:15:18:  </slot>
06:15:18:  <slot id='2' type='GPU'>
06:15:18:    <client-type v='advanced'/>
06:15:18:  </slot>
06:15:18:</config>
06:15:38:WU02:FS02:Starting
06:15:38:ERROR:WU02:FS02:Failed to start core: OpenCL device matching slot 2 not found, try setting 'opencl-index' manually
06:15:56:WU01:FS01:Starting
06:15:56:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
06:17:15:WU02:FS02:Starting
06:17:15:ERROR:WU02:FS02:Failed to start core: OpenCL device matching slot 2 not found, try setting 'opencl-index' manually
06:17:33:WU01:FS01:Starting
06:17:33:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found, try setting 'opencl-index' manually
06:17:52:WU00:FS00:0xa7:Completed 110000 out of 250000 steps (44%)
I had to switch off all the protection layers in Malwarebytes, as it looks as though that software was blocking connections, even after I added exclusions for FAHClient and FAHControl.

Only my CPU slots on the affected machines are folding at the moment.

Hope it gets fixed soon so I can return to full folding.

Cheers

Nick

Re: Failing to get WU today work server refuses [171.67.108.

Posted: Mon Apr 30, 2018 6:33 am
by Nick200
Or maybe this is a driver-related problem, as I have not migrated the unaffected machines to driver 397.31. I will roll back the driver on the affected machines to see whether that is the cause.

Re: Failing to get WU today work server refuses [171.67.108.

Posted: Mon Apr 30, 2018 7:17 am
by Nick200
Nope. Did a clean install of version 391.35, with no effect at all. Both GPU slots still fail to download a work unit, so I suspect that this is a mix of server-side and local configuration issues. But blowed if I can figure out what settings to change...

Re: Failing to get WU today work server refuses [171.67.108.

Posted: Mon Apr 30, 2018 9:51 am
by toTOW
Nick200 wrote:05:30:24:WARNING:WU02:FS02:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
This message usually indicates that something (usually security software or a corporate proxy) is interfering with the network traffic between the client and the servers ...

In the other log extract, something is wrong with your OpenCL installation. This line comes from the Intel drivers: 06:14:17:OpenCL Device 0: Platform:0 Device:0 Bus:NA Slot:NA Compute:1.2 Driver:20.19, but with the NV drivers you should see something like this instead: 21:12:33:OpenCL Device 0: Platform:0 Device:0 Bus:2 Slot:0 Compute:1.2 Driver:397.31. The many OpenCL errors complaining about missing devices probably mean that the client is trying to use the Intel OpenCL platform instead of the NV one (which has the devices it is looking for). Try removing the Intel drivers (or at least their OpenCL component), and maybe reinstall the NV ones after doing so.
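
As a rough illustration of the comparison above, the sketch below pulls the Bus: field out of an OpenCL Device log line and flags lines where it is NA. The function names are hypothetical, and treating Bus:NA as a sign of a non-NVIDIA platform is an assumption based only on the two example lines in this post; it is not something the client documents.

```shell
#!/bin/sh
# Hypothetical helpers, written for illustration only.

# Print the value that follows "Bus:" in a single log line.
opencl_bus() {
    printf '%s\n' "$1" | sed -n 's/.*Bus:\([^ ]*\).*/\1/p'
}

# Succeed when the line reports a real bus number, fail on "NA" or no match.
# ASSUMPTION: "Bus:NA" marks a platform without a discrete GPU, as in the
# Intel example line quoted in this post.
opencl_line_looks_discrete() {
    bus=$(opencl_bus "$1")
    case "$bus" in
        ''|NA) return 1 ;;
        *)     return 0 ;;
    esac
}
```

Feeding it the "OpenCL Device 0" line from the top of your own log gives a quick check of which kind of line the client recorded.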

Re: Failing to get WU today work server refuses [171.67.108.

Posted: Mon Apr 30, 2018 9:55 am
by SteveWillis
Aurum wrote:
SteveWillis wrote:systemctl stop FAHClient
sleep 2
pkill -e -9 FahCore
pkill -e -9 FAHClient
systemctl restart FAHClient
As a Linux neophyte I've been wondering how to stop running FAH. You said script: do you save this so that it executes automatically, or do you execute it line by line in Terminal?
I had 5 rigs sitting idle this morning. They're headless Linux rigs, so I just cycle the power. TeamViewer 13 for Linux works barely 10% of the time; the rest of the time it shows just a black screen, if it can detect a connection at all.
Hi Aurum.
The Linux commands I gave can either be executed line by line or, as suggested, put in a script. If they are in a script, it must be executed with sudo (as root). I've added some code to remind you to use sudo if you forget, and to retry if the restart fails, which it sometimes does:

Code:

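#!/bin/ksh
# Restart the Folding@home client, retrying up to five times.
# Must be run as root because it uses systemctl and kills client processes.
if [ "$(id -u)" -ne 0 ]; then
    echo "This script must be run as root (sudo path/restartclient.sh)" >&2
    exit 1
fi

set -x    # echo each command as it runs

# An explicit list is used because some ksh variants do not expand {1..5}.
for i in 1 2 3 4 5
do
    systemctl stop FAHClient
    sleep 5
    # Force-kill any core or client process that survived the stop.
    pkill -e -9 FahCore
    pkill -e -9 FAHClient
    #sleep 5
    # Leave the loop as soon as the restart succeeds.
    systemctl restart FAHClient && break
    sleep 3
done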
You can download the script, restartclient.sh, from https://drive.google.com/drive/folders/ ... sp=sharing
However, you might be interested in my other script, reboot.sh, which you can download from the same location. It continuously monitors the log.txt file (once per minute) for hung folding slots and for slots with no WUs. It automatically executes client restarts and reboots as necessary, and also FS pause/unpause cycles to force the client to try to get a WU more quickly. It works pretty well now. Including comments and a little white space, it is over 600 lines now. It's a real project. If you decide to use it, read all the comments at the top; some configuration is necessary. I don't run it headless, but a friend on Discord is testing it headless for me and has been pretty positive about it.
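
For anyone curious what this kind of log monitoring involves, here is a minimal sketch. It is not taken from reboot.sh; the function names, the 200-line window, and the threshold are all assumptions made for illustration. It counts the "Could not get an assignment" errors near the end of a client log, the same error text shown in the logs earlier in this thread.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the reboot.sh script mentioned above.

# Print how many assignment errors appear in the last N lines of a log.
# $1 = path to log file, $2 = number of trailing lines to scan (default 200).
count_assignment_errors() {
    logfile=$1
    lines=${2:-200}
    tail -n "$lines" "$logfile" | grep -c 'Exception: Could not get an assignment'
}

# Succeed when the error count reaches a threshold.
# $1 = path to log file, $2 = threshold, $3 = lines to scan (default 200).
slots_look_starved() {
    count=$(count_assignment_errors "$1" "${3:-200}")
    [ "$count" -ge "$2" ]
}
```

For example, `slots_look_starved /path/to/log.txt 4` would succeed once at least four such errors appear in the last 200 lines; the log path depends on your installation.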

Also, instead of TeamViewer I use NoMachine (nomachine.com). I have a permanent IP for my main PC set up through noip.com, so I can remote in with NoMachine when out of town and then remote from there to my other two rigs, also with NoMachine. I also have a VPN (mullvad.net at 5 euro/month) set up on the laptop I use remotely, because I don't trust the wi-fi at hotels and rented condos. It all works pretty well.

If you have any questions feel free to PM me or to email me at the address included in the reboot.sh script (mention reboot.sh in the subject). I don't monitor this topic very closely as it's not a problem I'm having so if you just post here I may not see it. I'd love to have another person using the script and giving feedback.

Re: Failing to get WU today work server refuses [171.67.108.

Posted: Mon Apr 30, 2018 10:22 pm
by Nick200
toTOW wrote:
Nick200 wrote:05:30:24:WARNING:WU02:FS02:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
This message usually indicates that something (usually security software or a corporate proxy) is interfering with the network traffic between the client and the servers ...

In the other log extract, something is wrong with your OpenCL installation. This line comes from the Intel drivers: 06:14:17:OpenCL Device 0: Platform:0 Device:0 Bus:NA Slot:NA Compute:1.2 Driver:20.19, but with the NV drivers you should see something like this instead: 21:12:33:OpenCL Device 0: Platform:0 Device:0 Bus:2 Slot:0 Compute:1.2 Driver:397.31. The many OpenCL errors complaining about missing devices probably mean that the client is trying to use the Intel OpenCL platform instead of the NV one (which has the devices it is looking for). Try removing the Intel drivers (or at least their OpenCL component), and maybe reinstall the NV ones after doing so.
Yes, and that is why I have disabled both Malwarebytes and Windows Defender, but to no effect. I have no corporate proxies; I fold from home.

I raised this issue on GitHub (issue #1226), and Joseph Coffland said in reply:
I'm not sure how to solve this. It appears that your GPUs are supported and detected by the client but the 0x21 core fails to detect them. Since this is a client configuration issue you should take your problem to the foldingforum.org. They will likely be able to better help you there.
I will try again to alter the OpenCL settings on the three affected machines, but that'll have to wait till I get home. I don't think I have any OpenCL installation other than what comes with the driver. I will do a full clean re-install tonight after applying DDU.

I have also just installed the Spring Creators Update to see if that helped. It looks good and is very stable, but has not resolved the issue.

Cheers

Re: Failing to get WU today work server refuses [171.67.108.

Posted: Thu May 03, 2018 1:57 pm
by toTOW
The "Bad platformId size" error again means something is wrong with your OpenCL installation ... and the start of the log still shows the Intel OpenCL drivers as OpenCL platform 0, which is probably messing everything up ...

You should disable your IGP, these things are completely useless anyway, and uninstall any Intel GPU drivers ...

Re: Failing to get WU today work server refuses [171.67.108.

Posted: Thu May 10, 2018 1:44 pm
by Aurum
Half my rigs cannot DL WUs and I do not have time to switch them to a working DC project.

Re: Failing to get WU today work server refuses [171.67.108.

Posted: Fri May 11, 2018 10:07 am
by toTOW
Aurum wrote:Half my rigs cannot DL WUs and I do not have time to switch them to a working DC project.
And we don't have time to help you with so few details about your issue ...

Re: Failing to get WU today work server refuses [171.67.108.

Posted: Fri Jun 29, 2018 7:06 pm
by foldy
This problem can be solved by disabling Malwarebytes protection.