Send Errors - 155.247.164.213 & .214

Moderators: Site Moderators, FAHC Science Team

masterofthepenkins
Posts: 3
Joined: Mon Mar 16, 2020 5:33 pm

Re: Send Errors - 155.247.164.213 & .214

Post by masterofthepenkins »

I'm seeing a similar error, judging by the logs and the number of retries (the upload hasn't succeeded in 30+ hours).

However, my work unit is for project 11753, and the work server and collection server are .213 and .214 respectively.
qoo
Posts: 1
Joined: Tue Mar 17, 2020 7:06 pm

Re: Send Errors - 155.247.164.213 & .214

Post by qoo »

Same here... Send errors for 3 days now.

19:06:25:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11758 run:0 clone:2440 gen:0 core:0x22 unit:0x000000019bf7a4d55e6d77167d9507bd
19:06:25:WU01:FS01:Uploading 55.24MiB to 155.247.164.213
19:06:25:WU01:FS01:Connecting to 155.247.164.213:8080
19:06:26:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
19:06:26:WU01:FS01:Trying to send results to collection server
19:06:26:WU01:FS01:Uploading 55.24MiB to 155.247.164.214
19:06:26:WU01:FS01:Connecting to 155.247.164.214:8080
19:06:26:ERROR:WU01:FS01:Exception: Transfer failed
RedDeckWins
Posts: 1
Joined: Tue Mar 17, 2020 7:11 pm

Re: Send Errors - 155.247.164.213 & .214

Post by RedDeckWins »

I'm hitting the same issue for the same servers - 213/214. The client was able to upload results to other servers.
DolphinsCry
Posts: 1
Joined: Tue Mar 17, 2020 8:14 pm

Re: Send Errors - 155.247.164.213 & .214

Post by DolphinsCry »

First, many thanks for the massive effort that has been put into providing more WUs and server capacity.

But same here: I finished a WU for project 11758. I got this WU at 2020-03-17T14:57:31Z, and there have been many send retries like this:

19:43:18:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11758 run:0 clone:1831 gen:0 core:0x22 unit:0x000000039bf7a4d55e6d77149ebd3ca9
19:43:18:WU00:FS01:Uploading 55.24MiB to 155.247.164.213
19:43:18:WU00:FS01:Connecting to 155.247.164.213:8080
19:43:18:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
19:43:18:WU00:FS01:Trying to send results to collection server
19:43:18:WU00:FS01:Uploading 55.24MiB to 155.247.164.214
19:43:18:WU00:FS01:Connecting to 155.247.164.214:8080
19:43:18:ERROR:WU00:FS01:Exception: Transfer failed
Then, seconds later, a different WU finishes and is sent to the same server (.214) without a problem:

19:44:54:WU02:FS00:0xa7:Completed 250000 out of 250000 steps (100%)
19:44:55:WU02:FS00:0xa7:Saving result file ..\logfile_01.txt
19:44:55:WU02:FS00:0xa7:Saving result file frame5.trr
19:44:55:WU02:FS00:0xa7:Saving result file md.log
19:44:55:WU02:FS00:0xa7:Saving result file science.log
19:44:55:WU02:FS00:0xa7:Saving result file traj_comp.xtc
19:44:55:WU02:FS00:0xa7:Folding@home Core Shutdown: FINISHED_UNIT
19:44:56:WU02:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
19:44:56:WU02:FS00:Sending unit results: id:02 state:SEND error:NO_ERROR project:14328 run:3 clone:3980 gen:5 core:0xa7 unit:0x000000069bf7a4d65e6d1121784e7dfc
19:44:56:WU02:FS00:Uploading 4.96MiB to 155.247.164.214
19:44:56:WU02:FS00:Connecting to 155.247.164.214:8080
19:44:58:WU02:FS00:Upload complete
19:44:58:WU02:FS00:Server responded WORK_ACK (400)
19:44:58:WU02:FS00:Final credit estimate, 3238.00 points
19:44:58:WU02:FS00:Cleaning up
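
To compare cases like the two above without reading the log by eye, here is a minimal Python sketch that tallies upload outcomes per project and server. It assumes the log lines have exactly the shape quoted in this thread; the function name `summarize_uploads` and the `SAMPLE` constant are made up for illustration and are not part of the client.

```python
import re
from collections import defaultdict

# Shape of the client log lines quoted in this thread (an assumption):
#   HH:MM:SS:[WARNING:|ERROR:]WUnn:FSnn:message
LINE = re.compile(r"^\d{2}:\d{2}:\d{2}:(?:(?:WARNING|ERROR):)?(WU\d+):FS\d+:(.*)$")
PROJECT = re.compile(r"\bproject:(\d+)")
UPLOAD = re.compile(r"^Uploading \S+ to (\S+)$")


def summarize_uploads(log_text):
    """Count upload outcomes per (project, server) as {key: [ok, failed]}."""
    project = {}  # WU slot -> project from its last "Sending unit results" line
    target = {}   # WU slot -> server from its last "Uploading ... to" line
    counts = defaultdict(lambda: [0, 0])
    for raw in log_text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue
        wu, msg = m.groups()
        upload = UPLOAD.match(msg)
        if msg.startswith("Sending unit results:"):
            found = PROJECT.search(msg)
            if found:
                project[wu] = found.group(1)
        elif upload:
            target[wu] = upload.group(1)
        elif wu in target and wu in project:
            key = (project[wu], target[wu])
            if msg == "Upload complete":
                counts[key][0] += 1
            elif msg.endswith("Transfer failed"):
                counts[key][1] += 1
    return dict(counts)


# Lines copied from the two log excerpts above.
SAMPLE = """\
19:43:18:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11758 run:0 clone:1831 gen:0 core:0x22 unit:0x000000039bf7a4d55e6d77149ebd3ca9
19:43:18:WU00:FS01:Uploading 55.24MiB to 155.247.164.213
19:43:18:WU00:FS01:Connecting to 155.247.164.213:8080
19:43:18:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
19:43:18:WU00:FS01:Trying to send results to collection server
19:43:18:WU00:FS01:Uploading 55.24MiB to 155.247.164.214
19:43:18:WU00:FS01:Connecting to 155.247.164.214:8080
19:43:18:ERROR:WU00:FS01:Exception: Transfer failed
19:44:56:WU02:FS00:Sending unit results: id:02 state:SEND error:NO_ERROR project:14328 run:3 clone:3980 gen:5 core:0xa7 unit:0x000000069bf7a4d65e6d1121784e7dfc
19:44:56:WU02:FS00:Uploading 4.96MiB to 155.247.164.214
19:44:56:WU02:FS00:Connecting to 155.247.164.214:8080
19:44:58:WU02:FS00:Upload complete
"""

for (proj, server), (ok, failed) in summarize_uploads(SAMPLE).items():
    print(f"project {proj} -> {server}: {ok} ok, {failed} failed")
```

On the sample, project 11758 shows one failure each for .213 and .214, while project 14328 shows one success on .214. That matches what the excerpts suggest: the failures follow the project rather than the server address alone.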
timkroeger
Posts: 1
Joined: Tue Mar 17, 2020 8:40 pm
Hardware configuration: Intel(R) Core(TM) i7-5820K CPU @ 3.3GHz OC @ 3.4GHz, 12 CPUs
Asus ROG STRIX NVIDIA GeForce GTX 1050 Ti GP107
Windows 10 Enterprise
Location: Germany

Re: Send Errors - 155.247.164.213 & .214

Post by timkroeger »

Just chipping in that I have the same problems with 11758; everything else works fine, and I've successfully uploaded and downloaded other GPU WUs.
Assigned: 2020-03-16T11:34:00Z
Timeout: 2020-03-17T11:34:00Z

20:22:57:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11758 run:0 clone:1923 gen:0 core:0x22 unit:0x000000029bf7a4d55e6d771456cb16f4
20:22:57:WU00:FS01:Uploading 55.24MiB to 155.247.164.213
20:22:57:WU00:FS01:Connecting to 155.247.164.213:8080
20:22:58:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
20:22:58:WU00:FS01:Trying to send results to collection server
20:22:58:WU00:FS01:Uploading 55.24MiB to 155.247.164.214
20:22:58:WU00:FS01:Connecting to 155.247.164.214:8080
20:22:58:ERROR:WU00:FS01:Exception: Transfer failed
Fold on!
Tim
Joe_H
Site Admin
Posts: 7857
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Send Errors - 155.247.164.213 & .214

Post by Joe_H »

The person managing this project, and some others on this server, has been notified and is looking into getting this fixed.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
tparikka
Posts: 1
Joined: Wed Mar 18, 2020 3:23 am

Re: Send Errors - 155.247.164.213 & .214

Post by tparikka »

I am also having issues submitting WU11758 to both .213 and .214. Same logs as others here.

System:
CPU: Intel Core i5-7600K CPU @ 3.80GHz
CPU ID: GenuineIntel Family 6 Model 158 Stepping 9
CPUs: 4
Memory: 15.93 GiB
Free Memory: 6.68 GiB
Threads: WINDOWS_THREADS
OS Version: 6.2
Has Battery: false
On Battery: false
UTC Offset: -5
PID: 6708
CWD: C:\Users\%USER%\AppData\Roaming\FAHClient
OS: Windows 10 Enterprise
OS Arch: AMD64
GPUs: 1
GPU0: Bus: 1 Slot: 0 NVIDIA: 8 TU104 [GeForce RTX 2070 Super] 8218
CUDA Device 0: Platform: 0 Device: 0 Bus: 1 Slot: 0 Compute: 7.5 Driver: 10.2
OpenCL Device 0: Platform: 0 Device: 0 Bus: 1 Slot: 0 Compute: 1.2 Driver: 442.59
Win32 Service: false
octatone
Posts: 1
Joined: Wed Mar 18, 2020 6:37 am

Re: Send Errors - 155.247.164.213 & .214

Post by octatone »

Same here for PCRG 11758
06:31:50:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:11758 run:0 clone:777 gen:0 core:0x22 unit:0x000000039bf7a4d55e6d7710389d680a
06:31:50:WU01:FS01:Uploading 55.24MiB to 155.247.164.213
06:31:50:WU01:FS01:Connecting to 155.247.164.213:8080
06:31:53:WARNING:WU01:FS01:Exception: Failed to send results to work server: Transfer failed
06:31:53:WU01:FS01:Trying to send results to collection server
06:31:53:WU01:FS01:Uploading 55.24MiB to 155.247.164.214
06:31:53:WU01:FS01:Connecting to 155.247.164.214:8080
06:31:54:ERROR:WU01:FS01:Exception: Transfer failed
Craig2.0
Posts: 1
Joined: Wed Mar 18, 2020 9:44 am

Re: Send Errors - 155.247.164.213 & .214

Post by Craig2.0 »

I've got the same issue with the same servers, except I've been trying to upload the same work unit for 4 days now. At one point the upload started and got to 0.27% before the transfer failed again. I would very much appreciate a client config command to switch to a different server.
aka_daryl
Posts: 19
Joined: Sun Mar 15, 2020 12:55 am

Re: Send Errors - 155.247.164.213 & .214

Post by aka_daryl »

Just a quick heads-up that I'm experiencing the same issue with WU11758.
JoranZeno
Posts: 1
Joined: Wed Mar 18, 2020 12:33 pm

Re: Send Errors - 155.247.164.213 & .214

Post by JoranZeno »

Same issue here: unable to upload results for Project 11758.
Qwarkman
Posts: 3
Joined: Wed Mar 18, 2020 12:54 pm
Hardware configuration: System 1: AMD Ryzen 5 3600, 32GB DDR4 3600, 2x nVidia GTX 980Ti 6GB
System 2: Intel i7 2600k, 16GB DDR3, nVidia GTX 970

Re: Send Errors - 155.247.164.213 & .214

Post by Qwarkman »

I have had 2 WUs trying to send for a while now: one for more than two days and another for almost 12 hours. Both are for project 11758.
Klutz
Posts: 7
Joined: Tue Mar 17, 2020 10:35 am

Re: Send Errors - 155.247.164.213 & .214

Post by Klutz »

According to the server stats page (https://apps.foldingathome.org/serverstats), 155.247.164.214 is up and running, but when you hover over the "Has CS" column, it shows that it can't connect to several work servers, 155.247.164.213 among them.

Is there any remedy for this, except resetting on the server side? Could I do this locally by editing my Hosts file to point to a different set of servers?
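
One thing that can be checked locally is whether the port from the "Connecting to ..." log lines accepts a TCP connection at all. Below is a minimal Python sketch; `can_connect` is a hypothetical helper name, and a successful connection only shows the port is reachable, not that the server will accept a given upload.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For the servers in this thread the call would be `can_connect("155.247.164.213", 8080)` and `can_connect("155.247.164.214", 8080)`. If both return True while the client still logs "Transfer failed", the problem is more likely on the server side than in local networking.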
sswilson
Posts: 90
Joined: Mon Dec 17, 2007 12:34 am
Hardware configuration: ASUS Crosshair IV Formula / AMD 1090T / 4X2 Gig GSkill Pi PC3-12800 / Corsair TX750W PSU / Sparkle GTX275 Plus / CoolerMaster Cosmos S / MCP655 WC Pump / MCR320 Rad / 6X Yate Loons / PA120.1 / 2X Scythe Ultra Kaze / Enzotech Luna WB / Dell Ultrasharp 2209WA

Gigabyte P35-DQ6 / Q6600 / 2X 1G 1066 Firestix / "Baked" XFX GTX 280 (RIP again :( ) / MSI GTS 450 Cyclone OC /PC P&C 750W Silencer / MCR220-QP-Res / DD DDCPX-Pro / Apogee GT / Highspeed PC Tech Station / Samsung 931BF / BenQ Q9T4
Location: Moncton, New Brunswick, Canada

Re: Send Errors - 155.247.164.213 & .214

Post by sswilson »

Is there a point where it would make sense to just delete the work file and carry on?

My "stuck" WU is 11758 (0, 2135, 0) and it has 155.247.164.214 as the collection server.

I'll leave it for now, but since some folks have reported that .214 is collecting certain WUs, I'm wondering whether there isn't a small group of WUs that have been orphaned and will more than likely never be accepted.
davidcoton
Posts: 1102
Joined: Wed Nov 05, 2008 3:19 pm
Location: Cambridge, UK

Re: Send Errors - 155.247.164.213 & .214

Post by davidcoton »

AFAICT it is just because of the vastly increased workload on the servers. The servers' owner is aware and investigating, alongside trying to bring more servers online.
Servers are bound to certain projects, largely because a project's results need to get back to the right geographical location. This makes it undesirable, and currently not possible, for users to select alternatives -- the other servers would not know how to handle your work unit.
Since progressing a project requires units to be returned so that the next one can be generated, it is not in anyone's interests for either the server or the client to "lose" a work unit.