140.163.4.244:8080 not sending WU

Moderators: Site Moderators, FAHC Science Team

silverpulser
Posts: 106
Joined: Sat Nov 10, 2012 9:06 am

140.163.4.244:8080 not sending WU

Post by silverpulser »

I have an Nvidia GTX 750 Ti GPU and the client is set up to request WUs for slot 1 only (no CPU WUs).

The log shows:

Requesting new work unit for slot 01: READY gpu:0:GM107 [GeForce GTX 750 Ti] from 140.163.4.244
09:49:29:WU00:FS01:Connecting to 140.163.4.244:8080
09:54:49:20:127.0.0.1:New Web connection

Then nothing. It did this all day yesterday until I restarted my machine in the evening, whereupon it loaded a WU as normal. I set the WU to Finish and set the machine to turn off using the Task Scheduler, as I have done successfully many times before. When I turned it on again this morning, I got the same response, with no WU downloaded.

Shaun

UPDATE

After lunch I once again performed a restart of my machine and once again the WU has now downloaded and started.
I hope this isn't going to be a permanent feature!
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 140.163.4.244:8080 not sending WU

Post by bruce »

To be helpful, we need to see the log. (See the signature of this post.)
silverpulser
Posts: 106
Joined: Sat Nov 10, 2012 9:06 am

Re: 140.163.4.244:8080 not sending WU

Post by silverpulser »

OK, but the log file from the successful WU is different from the log file of the stalled WU. I will include both for completeness.

First the successful WU

Code: Select all

*********************** Log Started 2017-01-08T14:06:18Z ***********************
14:06:18:************************* Folding@home Client *************************
14:06:18:      Website: http://folding.stanford.edu/
14:06:18:    Copyright: (c) 2009-2014 Stanford University
14:06:18:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
14:06:18:         Args: 
14:06:18:       Config: C:/Users/User/AppData/Roaming/FAHClient/config.xml
14:06:18:******************************** Build ********************************
14:06:18:      Version: 7.4.4
14:06:18:         Date: Mar 4 2014
14:06:18:         Time: 20:26:54
14:06:18:      SVN Rev: 4130
14:06:18:       Branch: fah/trunk/client
14:06:18:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
14:06:18:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
14:06:18:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
14:06:18:     Platform: win32 XP
14:06:18:         Bits: 32
14:06:18:         Mode: Release
14:06:18:******************************* System ********************************
14:06:18:          CPU: Intel(R) Pentium(R) CPU G620 @ 2.60GHz
14:06:18:       CPU ID: GenuineIntel Family 6 Model 42 Stepping 7
14:06:18:         CPUs: 2
14:06:18:       Memory: 3.98GiB
14:06:18:  Free Memory: 2.64GiB
14:06:18:      Threads: WINDOWS_THREADS
14:06:18:   OS Version: 6.2
14:06:18:  Has Battery: false
14:06:18:   On Battery: false
14:06:18:   UTC Offset: 0
14:06:18:          PID: 8060
14:06:18:          CWD: C:/Users/User/AppData/Roaming/FAHClient
14:06:18:           OS: Windows 10 Pro
14:06:18:      OS Arch: AMD64
14:06:18:         GPUs: 1
14:06:18:        GPU 0: NVIDIA:4 GM107 [GeForce GTX 750 Ti]
14:06:18:         CUDA: 5.0
14:06:18:  CUDA Driver: 8000
14:06:18:Win32 Service: false
14:06:18:***********************************************************************
14:06:18:<config>
14:06:18:  <!-- Network -->
14:06:18:  <proxy v=':8080'/>
14:06:18:
14:06:18:  <!-- Slot Control -->
14:06:18:  <power v='full'/>
14:06:18:
14:06:18:  <!-- User Information -->
14:06:18:  <passkey v='********************************'/>
14:06:18:  <team v='142900'/>
14:06:18:  <user v='Silverpulser'/>
14:06:18:
14:06:18:  <!-- Folding Slots -->
14:06:18:  <slot id='1' type='GPU'/>
14:06:18:</config>
14:06:18:Trying to access database...
14:06:18:Successfully acquired database lock
14:06:18:Enabled folding slot 01: READY gpu:0:GM107 [GeForce GTX 750 Ti]
14:06:18:WU00:FS01:Connecting to 171.67.108.45:80
14:06:20:WU00:FS01:Assigned to work server 140.163.4.231
14:06:20:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GM107 [GeForce GTX 750 Ti] from 140.163.4.231
14:06:20:WU00:FS01:Connecting to 140.163.4.231:8080
14:06:25:WU00:FS01:Downloading 16.92MiB
14:06:31:WU00:FS01:Download 8.87%
14:06:37:WU00:FS01:Download 94.20%
14:06:37:WU00:FS01:Download complete
14:06:37:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:11712 run:8 clone:9 gen:60 core:0x21 unit:0x000000538ca304e7581e62291aa2523c
14:06:37:WU00:FS01:Starting
14:06:37:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/User/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 00 -suffix 01 -version 704 -lifeline 8060 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
14:06:37:WU00:FS01:Started FahCore on PID 2592
14:06:37:WU00:FS01:Core PID:2756
14:06:37:WU00:FS01:FahCore 0x21 started
14:06:39:WU00:FS01:0x21:*********************** Log Started 2017-01-08T14:06:39Z ***********************
14:06:39:WU00:FS01:0x21:Project: 11712 (Run 8, Clone 9, Gen 60)
14:06:39:WU00:FS01:0x21:Unit: 0x000000538ca304e7581e62291aa2523c
14:06:39:WU00:FS01:0x21:CPU: 0x00000000000000000000000000000000
14:06:39:WU00:FS01:0x21:Machine: 1
14:06:39:WU00:FS01:0x21:Reading tar file core.xml
14:06:39:WU00:FS01:0x21:Reading tar file integrator.xml
14:06:39:WU00:FS01:0x21:Reading tar file state.xml
14:06:39:WU00:FS01:0x21:Reading tar file system.xml
14:06:39:WU00:FS01:0x21:Digital signatures verified
14:06:39:WU00:FS01:0x21:Folding@home GPU Core21 Folding@home Core
14:06:39:WU00:FS01:0x21:Version 0.0.17
14:06:49:WU00:FS01:0x21:Completed 0 out of 7500000 steps (0%)
14:06:49:WU00:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
14:11:56:20:127.0.0.1:New Web connection
14:16:57:WU00:FS01:0x21:Completed 75000 out of 7500000 steps (1%)
14:26:42:WU00:FS01:0x21:Completed 150000 out of 7500000 steps (2%)
14:36:58:WU00:FS01:0x21:Completed 225000 out of 7500000 steps (3%)
14:47:07:WU00:FS01:0x21:Completed 300000 out of 7500000 steps (4%)
14:56:57:WU00:FS01:0x21:Completed 375000 out of 7500000 steps (5%)
15:06:41:WU00:FS01:0x21:Completed 450000 out of 7500000 steps (6%)
15:16:27:WU00:FS01:0x21:Completed 525000 out of 7500000 steps (7%)
15:26:10:WU00:FS01:0x21:Completed 600000 out of 7500000 steps (8%)
15:35:54:WU00:FS01:0x21:Completed 675000 out of 7500000 steps (9%)
15:45:37:WU00:FS01:0x21:Completed 750000 out of 7500000 steps (10%)
15:55:24:WU00:FS01:0x21:Completed 825000 out of 7500000 steps (11%)
16:05:08:WU00:FS01:0x21:Completed 900000 out of 7500000 steps (12%)
16:14:51:WU00:FS01:0x21:Completed 975000 out of 7500000 steps (13%)
16:24:36:WU00:FS01:0x21:Completed 1050000 out of 7500000 steps (14%)
16:34:19:WU00:FS01:0x21:Completed 1125000 out of 7500000 steps (15%)
16:44:03:WU00:FS01:0x21:Completed 1200000 out of 7500000 steps (16%)
16:53:48:WU00:FS01:0x21:Completed 1275000 out of 7500000 steps (17%)
17:03:32:WU00:FS01:0x21:Completed 1350000 out of 7500000 steps (18%)
17:13:16:WU00:FS01:0x21:Completed 1425000 out of 7500000 steps (19%)
17:22:59:WU00:FS01:0x21:Completed 1500000 out of 7500000 steps (20%)
17:32:45:WU00:FS01:0x21:Completed 1575000 out of 7500000 steps (21%)
17:42:29:WU00:FS01:0x21:Completed 1650000 out of 7500000 steps (22%)
17:52:12:WU00:FS01:0x21:Completed 1725000 out of 7500000 steps (23%)
18:01:57:WU00:FS01:0x21:Completed 1800000 out of 7500000 steps (24%)
18:11:41:WU00:FS01:0x21:Completed 1875000 out of 7500000 steps (25%)
18:21:24:WU00:FS01:0x21:Completed 1950000 out of 7500000 steps (26%)
18:31:09:WU00:FS01:0x21:Completed 2025000 out of 7500000 steps (27%)
18:40:53:WU00:FS01:0x21:Completed 2100000 out of 7500000 steps (28%)
18:50:38:WU00:FS01:0x21:Completed 2175000 out of 7500000 steps (29%)
18:55:51:48:127.0.0.1:New Web connection
19:00:52:WU00:FS01:0x21:Completed 2250000 out of 7500000 steps (30%)
19:11:40:WU00:FS01:0x21:Completed 2325000 out of 7500000 steps (31%)
19:22:28:WU00:FS01:0x21:Completed 2400000 out of 7500000 steps (32%)
19:32:59:WU00:FS01:0x21:Completed 2475000 out of 7500000 steps (33%)
And now the stalled WU

Code: Select all

*********************** Log Started 2017-01-08T09:49:27Z ***********************
09:49:27:************************* Folding@home Client *************************
09:49:27:      Website: http://folding.stanford.edu/
09:49:27:    Copyright: (c) 2009-2014 Stanford University
09:49:27:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
09:49:27:         Args: 
09:49:27:       Config: C:/Users/User/AppData/Roaming/FAHClient/config.xml
09:49:27:******************************** Build ********************************
09:49:27:      Version: 7.4.4
09:49:27:         Date: Mar 4 2014
09:49:27:         Time: 20:26:54
09:49:27:      SVN Rev: 4130
09:49:27:       Branch: fah/trunk/client
09:49:27:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
09:49:27:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
09:49:27:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
09:49:27:     Platform: win32 XP
09:49:27:         Bits: 32
09:49:27:         Mode: Release
09:49:27:******************************* System ********************************
09:49:27:          CPU: Intel(R) Pentium(R) CPU G620 @ 2.60GHz
09:49:27:       CPU ID: GenuineIntel Family 6 Model 42 Stepping 7
09:49:27:         CPUs: 2
09:49:27:       Memory: 3.98GiB
09:49:27:  Free Memory: 2.59GiB
09:49:27:      Threads: WINDOWS_THREADS
09:49:27:   OS Version: 6.2
09:49:27:  Has Battery: false
09:49:27:   On Battery: false
09:49:27:   UTC Offset: 0
09:49:27:          PID: 7988
09:49:27:          CWD: C:/Users/User/AppData/Roaming/FAHClient
09:49:27:           OS: Windows 10 Pro
09:49:27:      OS Arch: AMD64
09:49:27:         GPUs: 1
09:49:27:        GPU 0: NVIDIA:4 GM107 [GeForce GTX 750 Ti]
09:49:27:         CUDA: 5.0
09:49:27:  CUDA Driver: 8000
09:49:27:Win32 Service: false
09:49:27:***********************************************************************
09:49:27:<config>
09:49:27:  <!-- Network -->
09:49:27:  <proxy v=':8080'/>
09:49:27:
09:49:27:  <!-- Slot Control -->
09:49:27:  <power v='full'/>
09:49:27:
09:49:27:  <!-- User Information -->
09:49:27:  <passkey v='********************************'/>
09:49:27:  <team v='142900'/>
09:49:27:  <user v='Silverpulser'/>
09:49:27:
09:49:27:  <!-- Folding Slots -->
09:49:27:  <slot id='1' type='GPU'/>
09:49:27:</config>
09:49:27:Trying to access database...
09:49:27:Successfully acquired database lock
09:49:27:Enabled folding slot 01: READY gpu:0:GM107 [GeForce GTX 750 Ti]
09:49:27:WU00:FS01:Connecting to 171.67.108.45:80
09:49:29:WU00:FS01:Assigned to work server 140.163.4.244
09:49:29:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GM107 [GeForce GTX 750 Ti] from 140.163.4.244
09:49:29:WU00:FS01:Connecting to 140.163.4.244:8080
09:54:49:20:127.0.0.1:New Web connection
09:55:33:FS01:Paused
09:55:38:FS01:Unpaused
10:48:15:67:127.0.0.1:New Web connection
14:01:12:100:127.0.0.1:New Web connection
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 140.163.4.244:8080 not sending WU

Post by bruce »

Do you see any differences between these two (other than the numbers)?

Code: Select all

14:06:18:WU00:FS01:Connecting to 171.67.108.45:80
14:06:20:WU00:FS01:Assigned to work server 140.163.4.231
14:06:20:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GM107 [GeForce GTX 750 Ti] from 140.163.4.231
14:06:20:WU00:FS01:Connecting to 140.163.4.231:8080
14:06:25:WU00:FS01:Downloading 16.92MiB
14:06:31:WU00:FS01:Download 8.87%
. . .

Code: Select all

09:49:27:WU00:FS01:Connecting to 171.67.108.45:80
09:49:29:WU00:FS01:Assigned to work server 140.163.4.244
09:49:29:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GM107 [GeForce GTX 750 Ti] from 140.163.4.244
09:49:29:WU00:FS01:Connecting to 140.163.4.244:8080
. . .
In the former, a connection to the work server 140.163.4.231 is completed and data starts downloading.
In the latter, a connection to the work server 140.163.4.244 is not completed ... nor is an error reported.

It's possible that there's a strange problem with 140.163.4.244 and this topic suggests that this is a likely option.

Another likely option is that you've run into a known bug in the 7.4.4 version of FAHClient. If an internet connection fails for any reason, the client will continue to process some connections but will hang when establishing some new connections. I think that hang happened when the client attempted to connect to 140.163.4.244.

You have two options.
1) When it does hang, restart FAHClient. (Probably that means reboot.)
2) Upgrade to the V7.4.16 beta FAHClient.

If you choose to remain on 7.4.4 until there's a new released version, do everything you can to avoid things that will produce a (first) failure of one of FAHClient's connections.
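
The difference between the two log excerpts above (a "Connecting to" line that is never followed by any further line for that work unit) can also be spotted mechanically. The following is only a rough sketch in Python, assuming the log lines have the same layout as the excerpts in this topic; the function name `find_stalled_connections` is made up for illustration and is not part of FAHClient.

```python
import re

# Matches lines such as "09:49:29:WU00:FS01:Connecting to 140.163.4.244:8080"
CONNECT = re.compile(r"^(\d\d:\d\d:\d\d):(WU\d+):(FS\d+):Connecting to (\S+)$")
# Any other line tagged with a work unit and slot counts as progress
WU_LINE = re.compile(r"^\d\d:\d\d:\d\d:(WU\d+):(FS\d+):")


def find_stalled_connections(lines):
    """Return (time, address) for each 'Connecting to' line that is never
    followed by another line from the same work unit and slot."""
    pending = {}  # (wu, slot) -> (time, address) of the last open attempt
    for line in lines:
        line = line.strip()
        m = CONNECT.match(line)
        if m:
            time, wu, slot, addr = m.groups()
            pending[(wu, slot)] = (time, addr)
            continue
        m = WU_LINE.match(line)
        if m:
            # Something else happened for this WU/slot, so the attempt moved on
            pending.pop((m.group(1), m.group(2)), None)
    return sorted(pending.values())


if __name__ == "__main__":
    # Lines copied from the stalled log in this topic
    sample = [
        "09:49:27:WU00:FS01:Connecting to 171.67.108.45:80",
        "09:49:29:WU00:FS01:Assigned to work server 140.163.4.244",
        "09:49:29:WU00:FS01:Connecting to 140.163.4.244:8080",
        "09:54:49:20:127.0.0.1:New Web connection",
    ]
    print(find_stalled_connections(sample))
```

Note that a connection which is simply still in progress when the log is read would be flagged too, so the timestamp needs to be compared with the current time before concluding that the client has hung.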
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 140.163.4.244:8080 not sending WU

Post by bruce »

If anybody is running V7.4.15 or .16 and your system attempts to connect to 140.163.4.244, I'd be interested in seeing the log.
ComputerGenie
Posts: 236
Joined: Mon Dec 12, 2016 4:06 am

Re: 140.163.4.244:8080 not sending WU

Post by ComputerGenie »

bruce wrote:If anybody is running V7.4.15 or .16 and your system attempts to connect to 140.163.4.244, I'd be interested in seeing the log.
Any way to tell without keeping a constant eye on it or poring over the logs?
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 140.163.4.244:8080 not sending WU

Post by bruce »

Pore over the logs.

If you use FAHClient to view the log, you can filter out information from other slots. Unfortunately, filtering for errors doesn't help since you're not seeing an error message.

Personally, I prefer to display the log in an editor so I can simply search for "Project:" or things like that.
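
For anyone who would rather not search by hand, the same kind of search can be scripted. This is a minimal sketch in Python, assuming the "Assigned to work server" lines look like the ones in the logs posted earlier in this topic; the function name `assignments` is hypothetical and the sample lines are copied from those logs.

```python
import re

# Matches lines such as "09:49:29:WU00:FS01:Assigned to work server 140.163.4.244"
ASSIGNED = re.compile(
    r"^(\d\d:\d\d:\d\d):WU\d+:FS\d+:Assigned to work server (\S+)$"
)


def assignments(lines, server=None):
    """Return (time, server) for each 'Assigned to work server' line,
    optionally restricted to a single server address."""
    found = []
    for line in lines:
        m = ASSIGNED.match(line.strip())
        if m and (server is None or m.group(2) == server):
            found.append((m.group(1), m.group(2)))
    return found


if __name__ == "__main__":
    sample = [
        "14:06:20:WU00:FS01:Assigned to work server 140.163.4.231",
        "09:49:29:WU00:FS01:Assigned to work server 140.163.4.244",
    ]
    # Show only the assignments to the server discussed in this topic
    print(assignments(sample, server="140.163.4.244"))
```

To run it against a real log, read the lines from a saved copy of the log file and pass them in place of `sample`.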
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 140.163.4.244:8080 not sending WU

Post by bruce »

I think this is the same problem noted here: viewtopic.php?f=18&t=29546
140.163.4.244 and .231 are on the same subnet.
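
The subnet remark can be checked with Python's standard `ipaddress` module. This is only a sketch: the /24 prefix length is an assumption made for illustration (the real netmask of these servers is not stated in this topic), and `same_subnet` is a made-up helper name.

```python
import ipaddress


def same_subnet(a, b, prefix=24):
    """Return True if addresses a and b fall in the same network,
    assuming the given prefix length (24 is an assumed value)."""
    # strict=False lets us build the network from a host address
    net = ipaddress.ip_network(f"{a}/{prefix}", strict=False)
    return ipaddress.ip_address(b) in net


if __name__ == "__main__":
    # The two work servers seen in the logs
    print(same_subnet("140.163.4.244", "140.163.4.231"))
    # A work server compared with the assignment server from the logs
    print(same_subnet("140.163.4.244", "171.67.108.45"))
```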
JohnChodera
Pande Group Member
Posts: 470
Joined: Fri Feb 22, 2013 9:59 pm

Re: 140.163.4.244:8080 not sending WU

Post by JohnChodera »

This is really odd. I'm not seeing any signs of DDOS on these machines, but we've asked the Networking team to look into what might be going on.

I've restarted the affected servers, just in case.
silverpulser
Posts: 106
Joined: Sat Nov 10, 2012 9:06 am

Re: 140.163.4.244:8080 not sending WU

Post by silverpulser »

bruce wrote:
You have two options.
1) When it does hang, restart FAHClient. (Probably that means reboot.)
2) Upgrade to the V7.4.16 beta FAHClient.

If you choose to remain until there's a new released version, do everything you can to avoid things that will produce a (first) failure of one of FAHClient's connections.
OK, I have now installed FAH V7.4.16. The WU has downloaded and is now running as normal. Tomorrow will be the test under identical conditions, where I have set the WU to Finish and set the Task Scheduler to turn off the machine after I have gone to bed!

Fingers crossed!

UPDATE

WU downloaded and working normally.

Connected as follows:-

08:12:04:WU00:FS01:Connecting to 171.67.108.105:8080