Work servers 140.163.4.231 & 128.252.203.4 downloads broken

Moderators: Site Moderators, FAHC Science Team

TristanChen
Posts: 21
Joined: Tue May 30, 2017 4:55 am

Work servers 140.163.4.231 & 128.252.203.4 downloads broken

Post by TristanChen »

Came home from work today to find 8 of my 9 machines stuck downloading work. For some reason, the work servers' data transfer rate has slowed to a crawl. Downloading a 16 MB work unit was taking 3 hours or more, or simply timing out. Is anyone else having this problem?

Just to test my network, I reinstalled BOINC-GPUGrid on these computers, and they were able to download 130 MB work units in roughly 90 seconds. Now running GPUGrid on all machines until this issue is resolved.
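
For scale, here is a rough back-of-the-envelope comparison of the two transfer rates described above, as a small Python sketch. The sizes and times are the approximate figures from this post (16 MB in about 3 hours versus 130 MB in about 90 seconds); the function name is made up for illustration, and treating 1 MB as 1000 kB is an assumption.

```python
def throughput_kb_per_s(size_mb: float, seconds: float) -> float:
    """Average transfer rate in kB/s, taking 1 MB = 1000 kB (an assumption)."""
    return size_mb * 1000.0 / seconds

# Approximate figures quoted in the post, not measured values.
fah_rate = throughput_kb_per_s(16, 3 * 3600)   # 16 MB work unit in ~3 hours
gpugrid_rate = throughput_kb_per_s(130, 90)    # 130 MB work unit in ~90 seconds

print(f"F@H work server: {fah_rate:.2f} kB/s")
print(f"GPUGrid:         {gpugrid_rate:.0f} kB/s")
print(f"ratio:           about {gpugrid_rate / fah_rate:.0f}x")
```

On those figures the same machines and network moved data roughly a thousand times faster from GPUGrid, which is why the local network looks like an unlikely bottleneck.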
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Work servers downloads broken

Post by bruce »

That's not enough information. Which servers were exhibiting this slow download?
This topic is "issues with a specific server" for a reason ... you need to specify which server(s) are potentially having problems.

I downloaded 16 MiB from server 140.163.4.231 in about 2 seconds a couple of hours ago.

Code: Select all

00:41:30:WU02:FS02:Connecting to 65.254.110.245:8080
00:41:31:WU02:FS02:Assigned to work server 140.163.4.231
00:41:31:WU02:FS02:Requesting new work unit for slot 02: RUNNING gpu:1:GM206 [GeForce GTX 960] 2308 from 140.163.4.231
00:41:31:WU02:FS02:Connecting to 140.163.4.231:8080
00:41:31:WU02:FS02:Downloading 16.51MiB
00:41:33:WU02:FS02:Download complete
00:41:33:WU02:FS02:Received Unit: id:02 ... 
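
The "about 2 seconds" figure can be read straight off the timestamps. As a hedged sketch, the Python below pairs each "Downloading" line with its "Download complete" line and reports the elapsed seconds. The regular expression and the function name are assumptions based only on the log lines shown in this thread, not on any documented log format, and the sketch ignores logs that cross midnight or reuse a WU/slot pair for a later download.

```python
import re
from datetime import datetime

# HH:MM:SS:WUxx:FSxx:message, as seen in the client log lines in this thread.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):(.*)$")

def download_durations(lines):
    """Return {(wu, slot): seconds} for each download that logged completion."""
    started = {}
    durations = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        stamp, wu, slot, msg = m.groups()
        t = datetime.strptime(stamp, "%H:%M:%S")
        key = (wu, slot)
        if msg.startswith("Downloading "):
            started[key] = t
        elif msg == "Download complete" and key in started:
            durations[key] = (t - started.pop(key)).total_seconds()
    return durations

sample = """\
00:41:31:WU02:FS02:Connecting to 140.163.4.231:8080
00:41:31:WU02:FS02:Downloading 16.51MiB
00:41:33:WU02:FS02:Download complete
""".splitlines()

print(download_durations(sample))
```

Run against a full log.txt, the same function would give one duration per completed download, which makes the slow ones easy to spot.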
toTOW
Site Moderator
Posts: 6307
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Work servers downloads broken

Post by toTOW »

It might be the timeout bug ... Restart the client (the easiest way is to reboot, since the client doesn't always exit when hit by the timeout bug).

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
TAMUmpower
Posts: 12
Joined: Mon Jan 08, 2018 9:02 pm

Re: Work servers downloads broken

Post by TAMUmpower »

Dude, mine's been doing the same thing for a few days. I have 4 rigs with 2 cards each. Usually only one of the 2 cards gets stuck downloading, and when I look at the log it looks like it took forever to download about 25% of the WU, and then it freezes and never finishes the download. I don't see why there isn't a timeout setting to reject the download and try again if it takes longer than a minute.
Joe_H
Site Admin
Posts: 7867
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Work servers downloads broken

Post by Joe_H »

There is a check, and it works most of the time. The bug is that sometimes the client does not detect that a network upload or download has stalled.

As for just a minute, that would be much too short an interval for many folders' systems. With the current client it will recheck after about 15 minutes.

Altogether though, just as with the first poster, you have provided next to no usable information. At a minimum, we would need to know when the downloads failed and which server the client was connecting to for a new WU.
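
To illustrate the kind of stall check described above, here is a minimal Python sketch. It is not the client's actual code: the class name, the method names, and the injectable clock are all hypothetical, and the 15-minute default simply mirrors the interval mentioned in this post.

```python
import time

class StallDetector:
    """Flags a transfer as stalled when no new bytes arrive within `timeout` seconds."""

    def __init__(self, timeout=15 * 60, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_bytes = 0
        self.last_progress = clock()

    def update(self, total_bytes):
        """Record the running byte count; only an increase counts as progress."""
        if total_bytes > self.last_bytes:
            self.last_bytes = total_bytes
            self.last_progress = self.clock()

    def stalled(self):
        """True once `timeout` seconds have passed without any new bytes."""
        return self.clock() - self.last_progress >= self.timeout

# Demonstration with a fake clock so the example runs instantly.
fake_now = [0.0]
detector = StallDetector(timeout=900, clock=lambda: fake_now[0])
detector.update(1024)            # first bytes arrive at t = 0 s
fake_now[0] = 600.0
early = detector.stalled()       # 600 s without new bytes: still under the limit
fake_now[0] = 1000.0
late = detector.stalled()        # 1000 s without new bytes: over the limit
print(early, late)
```

A check like this only fires when no bytes at all arrive for the whole interval. A download that keeps trickling in a little data every few minutes resets the timer each time, so catching that case would need a minimum-rate test instead.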

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
TAMUmpower
Posts: 12
Joined: Mon Jan 08, 2018 9:02 pm

Re: Work servers downloads broken

Post by TAMUmpower »

Mod edits: removed auto-download link, copied & posted log here per directions in Welcome Topic, removed excess status lines - j

There's a log file from one of my rigs. One card kept folding; the other froze on the download.

Code: Select all

*********************** Log Started 2018-08-12T15:54:40Z ***********************
15:54:40:************************* Folding@home Client *************************
15:54:40:    Website: http://folding.stanford.edu/
15:54:40:  Copyright: (c) 2009-2014 Stanford University
15:54:40:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
15:54:40:       Args: --child --lifeline 1737 /etc/fahclient/config.xml --run-as
15:54:40:             fahclient --pid-file=/var/run/fahclient.pid --daemon
15:54:40:     Config: /etc/fahclient/config.xml
15:54:40:******************************** Build ********************************
15:54:40:    Version: 7.4.4
15:54:40:       Date: Mar 4 2014
15:54:40:       Time: 12:02:38
15:54:40:    SVN Rev: 4130
15:54:40:     Branch: fah/trunk/client
15:54:40:   Compiler: GNU 4.4.7
15:54:40:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
15:54:40:             -fno-unsafe-math-optimizations -msse2
15:54:40:   Platform: linux2 3.2.0-1-amd64
15:54:40:       Bits: 64
15:54:40:       Mode: Release
15:54:40:******************************* System ********************************
15:54:40:        CPU: Intel(R) Pentium(R) CPU G4560 @ 3.50GHz
15:54:40:     CPU ID: GenuineIntel Family 6 Model 158 Stepping 9
15:54:40:       CPUs: 4
15:54:40:     Memory: 7.68GiB
15:54:40:Free Memory: 6.93GiB
15:54:40:    Threads: POSIX_THREADS
15:54:40: OS Version: 4.15
15:54:40:Has Battery: false
15:54:40: On Battery: false
15:54:40: UTC Offset: -4
15:54:40:        PID: 1739
15:54:40:        CWD: /var/lib/fahclient
15:54:40:         OS: Linux 4.15.0-30-generic x86_64
15:54:40:    OS Arch: AMD64
15:54:40:       GPUs: 2
15:54:40:      GPU 0: NVIDIA:7 GP102 [GeForce GTX 1080 Ti] 11380
15:54:40:      GPU 1: NVIDIA:7 GP102 [GeForce GTX 1080 Ti] 11380
15:54:40:       CUDA: 6.1
15:54:40:CUDA Driver: 9000
15:54:40:***********************************************************************
15:54:40:<config>
15:54:40:  <!-- Client Control -->
15:54:40:  <fold-anon v='true'/>
15:54:40:
15:54:40:  <!-- Folding Slot Configuration -->
15:54:40:  <client-type v='advanced'/>
15:54:40:
15:54:40:  <!-- HTTP Server -->
15:54:40:  <allow v='127.0.0.1 192.168.1.12'/>
15:54:40:
15:54:40:  <!-- Network -->
15:54:40:  <proxy v=':8080'/>
15:54:40:
15:54:40:  <!-- Remote Command Server -->
15:54:40:  <password v='**********'/>
15:54:40:
15:54:40:  <!-- Slot Control -->
15:54:40:  <power v='full'/>
15:54:40:
15:54:40:  <!-- User Information -->
15:54:40:  <passkey v='********************************'/>
15:54:40:  <team v='224497'/>
15:54:40:  <user v='Blackzilla_ALL_1A91a7YpY6gkZZ8rEvQtLazRwUbTJ7ZcT2'/>
15:54:40:
15:54:40:  <!-- Folding Slots -->
15:54:40:  <slot id='0' type='GPU'/>
15:54:40:  <slot id='1' type='GPU'/>
15:54:40:</config>
15:54:40:Switching to user fahclient
15:54:40:Trying to access database...
15:54:40:Successfully acquired database lock
15:54:40:Enabled folding slot 00: READY gpu:0:GP102 [GeForce GTX 1080 Ti] 11380
15:54:40:Enabled folding slot 01: READY gpu:1:GP102 [GeForce GTX 1080 Ti] 11380
15:54:40:WU00:FS00:Starting
15:54:40:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1739 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
15:54:40:WU00:FS00:Started FahCore on PID 1749
15:54:40:WU00:FS00:Core PID:1756
15:54:40:WU00:FS00:FahCore 0x21 started
15:54:40:WU01:FS01:Connecting to 65.254.110.245:80
15:54:41:WU00:FS00:0x21:*********************** Log Started 2018-08-12T15:54:40Z ***********************
15:54:41:WU00:FS00:0x21:Project: 11713 (Run 3, Clone 149, Gen 337)
15:54:41:WU00:FS00:0x21:Unit: 0x0000019a8ca304e75a5cfe1cfc73b6fc
15:54:41:WU00:FS00:0x21:CPU: 0x00000000000000000000000000000000
15:54:41:WU00:FS00:0x21:Machine: 0
15:54:41:WU00:FS00:0x21:Digital signatures verified
15:54:41:WU00:FS00:0x21:Folding@home GPU Core21 Folding@home Core
15:54:41:WU00:FS00:0x21:Version 0.0.18
15:54:41:WU00:FS00:0x21:  Found a checkpoint file
15:54:42:WU01:FS01:Assigned to work server 140.163.4.231
15:54:42:WU01:FS01:Requesting new work unit for slot 01: READY gpu:1:GP102 [GeForce GTX 1080 Ti] 11380 from 140.163.4.231
15:54:42:WU01:FS01:Connecting to 140.163.4.231:8080
15:54:43:WU01:FS01:Downloading 16.52MiB
15:54:46:WU00:FS00:0x21:Completed 6500000 out of 7500000 steps (86%)
15:54:46:WU00:FS00:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
15:54:47:WU01:FS01:Download complete
15:54:47:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11713 run:24 clone:380 gen:128 core:0x21 unit:0x000000b58ca304e75adf7de9bfe3390a
15:54:47:WU01:FS01:Starting
15:54:47:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 01 -suffix 01 -version 704 -lifeline 1739 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
15:54:47:WU01:FS01:Started FahCore on PID 1914
15:54:47:WU01:FS01:Core PID:1918
15:54:47:WU01:FS01:FahCore 0x21 started
15:54:47:WU01:FS01:0x21:*********************** Log Started 2018-08-12T15:54:47Z ***********************
15:54:47:WU01:FS01:0x21:Project: 11713 (Run 24, Clone 380, Gen 128)
15:54:47:WU01:FS01:0x21:Unit: 0x000000b58ca304e75adf7de9bfe3390a
15:54:47:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
15:54:47:WU01:FS01:0x21:Machine: 1
15:54:47:WU01:FS01:0x21:Reading tar file core.xml
15:54:47:WU01:FS01:0x21:Reading tar file integrator.xml
15:54:47:WU01:FS01:0x21:Reading tar file state.xml
15:54:47:WU01:FS01:0x21:Reading tar file system.xml
15:54:47:WU01:FS01:0x21:Digital signatures verified
15:54:47:WU01:FS01:0x21:Folding@home GPU Core21 Folding@home Core
15:54:47:WU01:FS01:0x21:Version 0.0.18
15:54:51:WU01:FS01:0x21:Completed 0 out of 7500000 steps (0%)
15:54:51:WU01:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
15:55:18:WU00:FS00:0x21:Completed 6525000 out of 7500000 steps (87%)
15:56:22:WU01:FS01:0x21:Completed 75000 out of 7500000 steps (1%)
15:56:54:WU00:FS00:0x21:Completed 6600000 out of 7500000 steps (88%)
15:57:54:WU01:FS01:0x21:Completed 150000 out of 7500000 steps (2%)
15:58:31:WU00:FS00:0x21:Completed 6675000 out of 7500000 steps (89%)
15:59:26:WU01:FS01:0x21:Completed 225000 out of 7500000 steps (3%)
16:00:08:WU00:FS00:0x21:Completed 6750000 out of 7500000 steps (90%)

16:11:47:WU01:FS01:0x21:Completed 825000 out of 7500000 steps (11%)
16:13:05:WU00:FS00:0x21:Completed 7350000 out of 7500000 steps (98%)
16:13:19:WU01:FS01:0x21:Completed 900000 out of 7500000 steps (12%)
16:14:42:WU00:FS00:0x21:Completed 7425000 out of 7500000 steps (99%)
16:14:42:WU02:FS00:Connecting to 65.254.110.245:80
16:14:42:WU02:FS00:Assigned to work server 140.163.4.231
16:14:42:WU02:FS00:Requesting new work unit for slot 00: RUNNING gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 140.163.4.231
16:14:42:WU02:FS00:Connecting to 140.163.4.231:8080
16:14:43:WU02:FS00:Downloading 16.52MiB
16:14:47:WU02:FS00:Download complete
16:14:47:WU02:FS00:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:11713 run:26 clone:244 gen:184 core:0x21 unit:0x000000fb8ca304e75adf7ebf3caffda3
16:14:51:WU01:FS01:0x21:Completed 975000 out of 7500000 steps (13%)
16:16:18:WU00:FS00:0x21:Completed 7500000 out of 7500000 steps (100%)
16:16:19:WU00:FS00:0x21:Saving result file logfile_01.txt
16:16:19:WU00:FS00:0x21:Saving result file checkpointState.xml
16:16:19:WU00:FS00:0x21:Saving result file checkpt.crc
16:16:19:WU00:FS00:0x21:Saving result file log.txt
16:16:19:WU00:FS00:0x21:Saving result file positions.xtc
16:16:19:WU00:FS00:0x21:Folding@home Core Shutdown: FINISHED_UNIT
16:16:20:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
16:16:20:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:11713 run:3 clone:149 gen:337 core:0x21 unit:0x0000019a8ca304e75a5cfe1cfc73b6fc
16:16:20:WU00:FS00:Uploading 11.78MiB to 140.163.4.231
16:16:20:WU00:FS00:Connecting to 140.163.4.231:8080
16:16:20:WU02:FS00:Starting
16:16:20:WU02:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 02 -suffix 01 -version 704 -lifeline 1739 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
16:16:20:WU02:FS00:Started FahCore on PID 3345
16:16:20:WU02:FS00:Core PID:3349
16:16:20:WU02:FS00:FahCore 0x21 started
16:16:20:WU02:FS00:0x21:*********************** Log Started 2018-08-12T16:16:20Z ***********************
16:16:20:WU02:FS00:0x21:Project: 11713 (Run 26, Clone 244, Gen 184)
16:16:20:WU02:FS00:0x21:Unit: 0x000000fb8ca304e75adf7ebf3caffda3
16:16:20:WU02:FS00:0x21:CPU: 0x00000000000000000000000000000000
16:16:20:WU02:FS00:0x21:Machine: 0
16:16:20:WU02:FS00:0x21:Reading tar file core.xml
16:16:20:WU02:FS00:0x21:Reading tar file integrator.xml
16:16:20:WU02:FS00:0x21:Reading tar file state.xml
16:16:20:WU02:FS00:0x21:Reading tar file system.xml
16:16:20:WU02:FS00:0x21:Digital signatures verified
16:16:20:WU02:FS00:0x21:Folding@home GPU Core21 Folding@home Core
16:16:20:WU02:FS00:0x21:Version 0.0.18
16:16:21:WU00:FS00:Upload complete
16:16:21:WU00:FS00:Server responded WORK_ACK (400)
16:16:21:WU00:FS00:Final credit estimate, 128605.00 points
16:16:21:WU00:FS00:Cleaning up
16:16:24:WU02:FS00:0x21:Completed 0 out of 7500000 steps (0%)
16:16:24:WU02:FS00:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
16:16:24:WU01:FS01:0x21:Completed 1050000 out of 7500000 steps (14%)
16:17:57:WU01:FS01:0x21:Completed 1125000 out of 7500000 steps (15%)
16:18:01:WU02:FS00:0x21:Completed 75000 out of 7500000 steps (1%)

23:08:41:WU00:FS00:0x21:Completed 4850000 out of 5000000 steps (97%)
23:09:35:WU00:FS00:0x21:Completed 4900000 out of 5000000 steps (98%)
23:09:45:WU02:FS01:0x21:Completed 6150000 out of 7500000 steps (82%)
23:10:30:WU00:FS00:0x21:Completed 4950000 out of 5000000 steps (99%)
23:10:31:WU01:FS00:Connecting to 65.254.110.245:80
23:10:31:WU01:FS00:Assigned to work server 140.163.4.231
23:10:31:WU01:FS00:Requesting new work unit for slot 00: RUNNING gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 140.163.4.231
23:10:31:WU01:FS00:Connecting to 140.163.4.231:8080
23:11:17:WU02:FS01:0x21:Completed 6225000 out of 7500000 steps (83%)
23:11:25:WU00:FS00:0x21:Completed 5000000 out of 5000000 steps (100%)
23:11:26:WU00:FS00:0x21:Saving result file logfile_01.txt
23:11:26:WU00:FS00:0x21:Saving result file checkpointState.xml
23:11:26:WU00:FS00:0x21:Saving result file checkpt.crc
23:11:26:WU00:FS00:0x21:Saving result file log.txt
23:11:26:WU00:FS00:0x21:Saving result file positions.xtc
23:11:26:WU00:FS00:0x21:Folding@home Core Shutdown: FINISHED_UNIT
23:11:26:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
23:11:26:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:13816 run:0 clone:1672 gen:99 core:0x21 unit:0x0000007880fccb045b3a656c8b4bb79b
23:11:26:WU00:FS00:Uploading 6.49MiB to 128.252.203.4
23:11:26:WU00:FS00:Connecting to 128.252.203.4:8080
23:11:28:WU00:FS00:Upload complete
23:11:28:WU00:FS00:Server responded WORK_ACK (400)
23:11:28:WU00:FS00:Final credit estimate, 74280.00 points
23:11:28:WU00:FS00:Cleaning up
23:11:36:WU01:FS00:Downloading 16.51MiB
23:11:45:WU01:FS00:Download 0.76%
23:12:50:WU02:FS01:0x21:Completed 6300000 out of 7500000 steps (84%)
23:14:22:WU02:FS01:0x21:Completed 6375000 out of 7500000 steps (85%)
23:14:24:WU01:FS00:Download 1.14%
23:14:43:WU01:FS00:Download 1.89%
23:14:52:WU01:FS00:Download 2.27%
23:15:00:WU01:FS00:Download 2.65%
23:15:14:WU01:FS00:Download 3.03%
23:15:32:WU01:FS00:Download 3.41%
23:15:54:WU02:FS01:0x21:Completed 6450000 out of 7500000 steps (86%)
23:17:26:WU02:FS01:0x21:Completed 6525000 out of 7500000 steps (87%)
23:17:46:WU01:FS00:Download 4.16%
23:17:55:WU01:FS00:Download 4.92%
23:18:02:WU01:FS00:Download 5.30%
23:18:58:WU02:FS01:0x21:Completed 6600000 out of 7500000 steps (88%)
23:20:13:WU01:FS00:Download 6.06%
23:20:22:WU01:FS00:Download 6.81%
23:20:28:WU01:FS00:Download 7.19%
23:20:30:WU02:FS01:0x21:Completed 6675000 out of 7500000 steps (89%)
23:21:01:WU01:FS00:Download 7.57%
23:21:07:WU01:FS00:Download 7.95%
23:22:02:WU02:FS01:0x21:Completed 6750000 out of 7500000 steps (90%)
23:22:16:WU01:FS00:Download 8.71%
23:22:55:WU01:FS00:Download 9.46%
23:23:07:WU01:FS00:Download 10.60%
23:23:36:WU02:FS01:0x21:Completed 6825000 out of 7500000 steps (91%)
23:23:39:WU01:FS00:Download 10.98%
23:23:49:WU01:FS00:Download 11.36%
23:23:57:WU01:FS00:Download 11.73%
23:25:08:WU02:FS01:0x21:Completed 6900000 out of 7500000 steps (92%)
23:26:08:WU01:FS00:Download 12.49%
23:26:17:WU01:FS00:Download 12.87%
23:26:40:WU02:FS01:0x21:Completed 6975000 out of 7500000 steps (93%)
23:27:27:WU01:FS00:Download 14.01%
23:28:12:WU02:FS01:0x21:Completed 7050000 out of 7500000 steps (94%)
23:29:44:WU02:FS01:0x21:Completed 7125000 out of 7500000 steps (95%)
23:31:16:WU02:FS01:0x21:Completed 7200000 out of 7500000 steps (96%)
23:32:49:WU02:FS01:0x21:Completed 7275000 out of 7500000 steps (97%)
23:34:21:WU02:FS01:0x21:Completed 7350000 out of 7500000 steps (98%)
23:34:40:WU01:FS00:Download 14.38%
23:35:53:WU02:FS01:0x21:Completed 7425000 out of 7500000 steps (99%)
23:35:54:WU00:FS01:Connecting to 65.254.110.245:80
23:35:54:WU00:FS01:Assigned to work server 140.163.4.231
23:35:54:WU00:FS01:Requesting new work unit for slot 01: RUNNING gpu:1:GP102 [GeForce GTX 1080 Ti] 11380 from 140.163.4.231
23:35:54:WU00:FS01:Connecting to 140.163.4.231:8080
23:35:55:WU00:FS01:Downloading 16.52MiB
23:36:01:WU00:FS01:Download 81.36%
23:36:02:WU00:FS01:Download complete
23:36:02:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:11713 run:5 clone:283 gen:169 core:0x21 unit:0x000000d58ca304e75adf741d567e8cd7
23:37:25:WU02:FS01:0x21:Completed 7500000 out of 7500000 steps (100%)
23:37:26:WU02:FS01:0x21:Saving result file logfile_01.txt
23:37:26:WU02:FS01:0x21:Saving result file checkpointState.xml
23:37:26:WU02:FS01:0x21:Saving result file checkpt.crc
23:37:26:WU02:FS01:0x21:Saving result file log.txt
23:37:26:WU02:FS01:0x21:Saving result file positions.xtc
23:37:26:WU02:FS01:0x21:Folding@home Core Shutdown: FINISHED_UNIT
23:37:27:WU02:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
23:37:27:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:11713 run:19 clone:246 gen:219 core:0x21 unit:0x0000011f8ca304e75adf7b205c128479
23:37:27:WU02:FS01:Uploading 11.81MiB to 140.163.4.231
23:37:27:WU02:FS01:Connecting to 140.163.4.231:8080
23:37:27:WU00:FS01:Starting
23:37:27:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/cores.foldingathome.org/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1739 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
23:37:27:WU00:FS01:Started FahCore on PID 6613
23:37:27:WU00:FS01:Core PID:6617
23:37:27:WU00:FS01:FahCore 0x21 started
23:37:27:WU00:FS01:0x21:*********************** Log Started 2018-08-12T23:37:27Z ***********************
23:37:27:WU00:FS01:0x21:Project: 11713 (Run 5, Clone 283, Gen 169)
23:37:27:WU00:FS01:0x21:Unit: 0x000000d58ca304e75adf741d567e8cd7
23:37:27:WU00:FS01:0x21:CPU: 0x00000000000000000000000000000000
23:37:27:WU00:FS01:0x21:Machine: 1
23:37:27:WU00:FS01:0x21:Reading tar file core.xml
23:37:27:WU00:FS01:0x21:Reading tar file integrator.xml
23:37:27:WU00:FS01:0x21:Reading tar file state.xml
23:37:27:WU00:FS01:0x21:Reading tar file system.xml
23:37:27:WU00:FS01:0x21:Digital signatures verified
23:37:27:WU00:FS01:0x21:Folding@home GPU Core21 Folding@home Core
23:37:27:WU00:FS01:0x21:Version 0.0.18
23:37:28:WU02:FS01:Upload complete
23:37:29:WU02:FS01:Server responded WORK_ACK (400)
23:37:29:WU02:FS01:Final credit estimate, 132936.00 points
23:37:29:WU02:FS01:Cleaning up
23:37:30:WU00:FS01:0x21:Completed 0 out of 7500000 steps (0%)
23:37:30:WU00:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
23:39:02:WU00:FS01:0x21:Completed 75000 out of 7500000 steps (1%)
23:40:34:WU00:FS01:0x21:Completed 150000 out of 7500000 steps (2%)
23:42:06:WU00:FS01:0x21:Completed 225000 out of 7500000 steps (3%)
23:43:38:WU00:FS01:0x21:Completed 300000 out of 7500000 steps (4%)
23:45:10:WU00:FS01:0x21:Completed 375000 out of 7500000 steps (5%)
23:46:42:WU00:FS01:0x21:Completed 450000 out of 7500000 steps (6%)
23:48:15:WU00:FS01:0x21:Completed 525000 out of 7500000 steps (7%)
23:49:47:WU00:FS01:0x21:Completed 600000 out of 7500000 steps (8%)
23:51:19:WU00:FS01:0x21:Completed 675000 out of 7500000 steps (9%)
23:52:51:WU00:FS01:0x21:Completed 750000 out of 7500000 steps (10%)
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Work servers downloads broken

Post by bruce »

There is some kind of undiagnosed problem that seems to be affecting the server 140.163.4.231.

Starting at
23:10:31:WU01:FS00:Assigned to work server 140.163.4.231
23:10:31:WU01:FS00:Requesting new work unit for slot 00: RUNNING gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 140.163.4.231
23:10:31:WU01:FS00:Connecting to 140.163.4.231:8080

a WU begins downloading extremely slowly and eventually hangs at
23:34:40:WU01:FS00:Download 14.38%

In the same log, you have many other WUs downloaded from several servers, including that particular one, and in all of the other cases the download (or upload) completed within 5 to 10 seconds. I wish I knew a way to determine why that one download was different from all the others.
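
One way to pull the problem downloads out of a long log is sketched below in Python. It tracks every "Downloading" line, follows the "Download NN%" progress lines, and reports any download that never logged "Download complete", along with its average rate. The function name and the regular expression are assumptions based only on the log lines in this thread, and the sample is a shortened excerpt of the log above.

```python
import re
from datetime import datetime

PROGRESS_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):"
    r"(?:Downloading (?P<size>[\d.]+)MiB"
    r"|Download (?P<pct>[\d.]+)%"
    r"|(?P<done>Download complete))$"
)

def unfinished_downloads(lines):
    """Return {(wu, slot): info} for downloads that started but never completed."""
    active = {}
    for line in lines:
        m = PROGRESS_RE.match(line.strip())
        if not m:
            continue
        stamp, wu, slot = m.group(1, 2, 3)
        t = datetime.strptime(stamp, "%H:%M:%S")
        key = (wu, slot)
        if m.group("size"):
            active[key] = {"size_mib": float(m.group("size")),
                           "start": t, "last": t, "pct": 0.0}
        elif m.group("pct") and key in active:
            active[key]["last"] = t
            active[key]["pct"] = float(m.group("pct"))
        elif m.group("done"):
            active.pop(key, None)
    report = {}
    for key, d in active.items():
        elapsed = (d["last"] - d["start"]).total_seconds()
        received_kib = d["size_mib"] * 1024 * d["pct"] / 100
        report[key] = {
            "last_pct": d["pct"],
            "elapsed_s": elapsed,
            "kib_per_s": received_kib / elapsed if elapsed else 0.0,
        }
    return report

excerpt = """\
23:11:36:WU01:FS00:Downloading 16.51MiB
23:11:45:WU01:FS00:Download 0.76%
23:27:27:WU01:FS00:Download 14.01%
23:34:40:WU01:FS00:Download 14.38%
23:35:55:WU00:FS01:Downloading 16.52MiB
23:36:01:WU00:FS01:Download 81.36%
23:36:02:WU00:FS01:Download complete
""".splitlines()

stuck = unfinished_downloads(excerpt)
print(stuck)
```

On this excerpt only the WU01/FS00 download is reported: its last progress line comes 1384 seconds after the start, at an average of under 2 KiB/s, while the WU00/FS01 download from the same server finished in 7 seconds.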
Joe_H
Site Admin
Posts: 7867
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Work servers downloads broken

Post by Joe_H »

Besides Bruce's comments, I would suggest upgrading to the current version of the client. A number of networking improvements have been made since 7.4.4, which was released more than 4 years ago (your log shows a build date of Mar 4 2014).
toTOW
Site Moderator
Posts: 6307
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Work servers downloads broken

Post by toTOW »

So this is the famous timeout bug, existing since the first versions of v7 client, and still present in 7.5.1 ... :D
TAMUmpower
Posts: 12
Joined: Mon Jan 08, 2018 9:02 pm

Re: Work servers downloads broken

Post by TAMUmpower »

Weird that it wouldn’t be the current version. They are Linux machines that were set up only months ago straight from the FAH website. Not sure why this issue just started now; they have been running fine for a while.

I’ll have to check if there is a newer Linux client when I get home.
Joe_H
Site Admin
Posts: 7867
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Work servers downloads broken

Post by Joe_H »

toTOW wrote: So this is the famous timeout bug, existing since the first versions of v7 client, and still present in 7.5.1 ... :D
The bug may still be present, but in my experience the 7.5.1 client recovers from a stalled download or upload much more often. It sometimes takes about 15 minutes before it kicks in and restarts the connection. It worked quite well last week when my DSL connection kept dropping; I finally found the cause, a phone wire connection corroded by a week of very humid, rainy weather.
TAMUmpower
Posts: 12
Joined: Mon Jan 08, 2018 9:02 pm

Re: Work servers downloads broken

Post by TAMUmpower »

So I checked the version on all my rigs and it says 7.4.4, but when I try to download the current version, the exact way I installed it a few months ago, it just says the newest version is already installed. Must be something with the Linux 7.5.1 that doesn't update the About section of the client?

And yep, got home from a 2-day trip and half my cards were cold, not doing crap. This needs to get fixed ASAP.

Checked the logs; editing to add the server on which the download froze after about 20%, just like before:
140.163.4.231
Last edited by TAMUmpower on Wed Aug 15, 2018 4:24 am, edited 1 time in total.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Work servers downloads broken

Post by bruce »

I assume the log you posted earlier has been updated. Please show us the current version.


What does "top" or the equivalent tell you? Is a process owned by the user "fahclient" running? Does it have rwx permissions to the files in /var/lib/fahclient?

Try "sudo /etc/init.d/FAHClient start"
Presumably it will tell you "fahclient seems to be already running with PID *"
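
For the permissions question, a quick Python sketch is shown below. The helper name `rwx_report` is made up for illustration. `os.access` tests the user running the script, so to learn anything about the client the script would have to be run as the "fahclient" user (for example via `sudo -u fahclient`). The demonstration uses a throwaway temporary directory so that it runs anywhere.

```python
import os
import tempfile

def rwx_report(path):
    """Report whether the *current* user can read, write, and traverse `path`."""
    return {
        "exists": os.path.exists(path),
        "read": os.access(path, os.R_OK),
        "write": os.access(path, os.W_OK),
        "execute": os.access(path, os.X_OK),
    }

# Demonstration on a throwaway directory created by the current user.
with tempfile.TemporaryDirectory() as demo_dir:
    demo = rwx_report(demo_dir)
    print(demo)

# On a real rig, call rwx_report("/var/lib/fahclient") while running as fahclient.
```

Any False in the result for /var/lib/fahclient would point at the ownership or permissions problem the questions above are probing for.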
TAMUmpower
Posts: 12
Joined: Mon Jan 08, 2018 9:02 pm

Re: Work servers downloads broken

Post by TAMUmpower »

Well, I have everything on a Linksys switch now instead of 2 routers chained together, and they are freezing much less often. Maybe 1 every 12 hours instead of 3-4. Not sure if they are fixing anything or if the hardware switch did anything. Waiting to see if it gets better on its own. It ran fine for months, so it shouldn't really have been anything on my end.
TAMUmpower
Posts: 12
Joined: Mon Jan 08, 2018 9:02 pm

Re: Work servers downloads broken

Post by TAMUmpower »

Yeah, well, today I've had to restart like 5 cards. They just keep freezing on the download, just like in the log posted above. This is ridiculous. I'm about to just sell it all and be done with it.