
aws1 and aws2.foldingathome.org both not responding

Posted: Fri May 01, 2020 12:40 am
by tbonse
Both of these servers (3.133.76.19 and 3.21.157.11) show an uptime of 1 hour, but are not accepting connections to upload completed jobs.

Code:

******************************* Date: 2020-05-01 *******************************
00:32:33:WU01:FS02:0x22:Completed 7840000 out of 8000000 steps (98%)
00:33:35:WU01:FS02:0x22:Completed 7920000 out of 8000000 steps (99%)
00:34:36:WU01:FS02:0x22:Completed 8000000 out of 8000000 steps (100%)
00:34:37:WU01:FS02:0x22:Saving result file ..\logfile_01.txt
00:34:37:WU01:FS02:0x22:Saving result file checkpointState.xml
00:34:37:WU01:FS02:0x22:Saving result file checkpt.crc
00:34:37:WU01:FS02:0x22:Saving result file positions.xtc
00:34:38:WU01:FS02:0x22:Saving result file science.log
00:34:38:WU01:FS02:0x22:Folding@home Core Shutdown: FINISHED_UNIT
00:34:39:WU01:FS02:FahCore returned: FINISHED_UNIT (100 = 0x64)
00:34:39:WU01:FS02:Sending unit results: id:01 state:SEND error:NO_ERROR project:16433 run:1408 clone:3 gen:3 core:0x22 unit:0x0000000803854c135e9a4efe84a09c2f
00:34:39:WU01:FS02:Uploading 59.00MiB to 3.133.76.19
00:34:39:WU01:FS02:Connecting to 3.133.76.19:8080
00:35:00:WARNING:WU01:FS02:WorkServer connection failed on port 8080 trying 80
00:35:00:WU01:FS02:Connecting to 3.133.76.19:80
00:35:22:WARNING:WU01:FS02:Exception: Failed to send results to work server: Failed to connect to 3.133.76.19:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
00:35:22:WU01:FS02:Trying to send results to collection server
00:35:22:WU01:FS02:Uploading 59.00MiB to 3.21.157.11
00:35:22:WU01:FS02:Connecting to 3.21.157.11:8080
00:35:43:WARNING:WU01:FS02:WorkServer connection failed on port 8080 trying 80
00:35:43:WU01:FS02:Connecting to 3.21.157.11:80
00:36:05:ERROR:WU01:FS02:Exception: Failed to connect to 3.21.157.11:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
00:36:05:WU01:FS02:Sending unit results: id:01 state:SEND error:NO_ERROR project:16433 run:1408 clone:3 gen:3 core:0x22 unit:0x0000000803854c135e9a4efe84a09c2f
00:36:05:WU01:FS02:Uploading 59.00MiB to 3.133.76.19
00:36:05:WU01:FS02:Connecting to 3.133.76.19:8080
00:36:26:WARNING:WU01:FS02:WorkServer connection failed on port 8080 trying 80
00:36:26:WU01:FS02:Connecting to 3.133.76.19:80
00:36:48:WARNING:WU01:FS02:Exception: Failed to send results to work server: Failed to connect to 3.133.76.19:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
00:36:48:WU01:FS02:Trying to send results to collection server
00:36:48:WU01:FS02:Uploading 59.00MiB to 3.21.157.11
00:36:48:WU01:FS02:Connecting to 3.21.157.11:8080
00:37:09:WARNING:WU01:FS02:WorkServer connection failed on port 8080 trying 80
00:37:09:WU01:FS02:Connecting to 3.21.157.11:80
00:37:31:ERROR:WU01:FS02:Exception: Failed to connect to 3.21.157.11:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
00:37:31:WU01:FS02:Sending unit results: id:01 state:SEND error:NO_ERROR project:16433 run:1408 clone:3 gen:3 core:0x22 unit:0x0000000803854c135e9a4efe84a09c2f
00:37:31:WU01:FS02:Uploading 59.00MiB to 3.133.76.19
00:37:31:WU01:FS02:Connecting to 3.133.76.19:8080
00:37:52:WARNING:WU01:FS02:WorkServer connection failed on port 8080 trying 80
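
A log like the one above repeats the same failures many times, so it can help to summarise it. The following is a minimal sketch, assuming the line layout seen in this thread (time, an optional WARNING or ERROR level, a WUnn and FSnn tag, then the message). The names `parse_line` and `failed_endpoints` are made up for illustration and are not part of the Folding@home client.

```python
import re
from collections import Counter

# Layout assumed from the log lines in this thread:
# HH:MM:SS:[WARNING:|ERROR:]WUnn:FSnn:message
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):"
    r"(?:(?P<level>WARNING|ERROR):)?"
    r"(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"(?P<message>.*)$"
)

# Matches the "Failed to connect to <ip>:<port>" part of a message.
FAILED_RE = re.compile(r"Failed to connect to (\d+\.\d+\.\d+\.\d+:\d+)")


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def failed_endpoints(lines):
    """Count WARNING/ERROR lines that report a failed connection, per ip:port."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None or record["level"] is None:
            continue
        hit = FAILED_RE.search(record["message"])
        if hit:
            counts[hit.group(1)] += 1
    return counts
```

Feeding the lines of a saved log file to `failed_endpoints` gives a per-server count of failed connection attempts, which makes it easier to see at a glance which server is the one not answering.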

Re: aws1 and aws2.foldingathome.org both not responding

Posted: Fri May 01, 2020 12:43 am
by tbonse
Update:

3.133.76.19:80 did finally start responding.

I believe the assessment from a prior thread about these cloud servers being poorly suited to the task of supporting the FAH work was very apt. It seems that just about every time there is a server malfunctioning, it is either an Azure or AWS server.

Re: aws1 and aws2.foldingathome.org both not responding

Posted: Fri May 01, 2020 2:55 am
by ChristianVirtual
aws2 is still not cooperating

Code:

02:51:52:ERROR:WU00:FS01:Exception: Failed to connect to 3.21.157.11:80: Connection timed out
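
To check from outside the client whether a server accepts connections at all, a plain TCP connect is enough. This is a rough sketch, not what the client does internally. It only tries port 8080 and then port 80, the same order the logs in this thread show, and the function name `first_reachable_port` and the 5 second timeout are assumptions chosen for the example.

```python
import socket


def first_reachable_port(host, ports=(8080, 80), timeout=5.0):
    """Return the first port in `ports` that accepts a TCP connection, else None."""
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or no route.
            with socket.create_connection((host, port), timeout=timeout):
                return port
        except OSError:
            continue
    return None
```

Calling `first_reachable_port("3.21.157.11")` returns 8080 or 80 if the server answers on that port, and `None` if both attempts fail. A successful connect only shows that the port is open, not that the server will actually accept an upload.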

Re: aws1 and aws2.foldingathome.org both not responding

Posted: Tue May 05, 2020 5:58 am
by Crunchtimer
Hi!
I believe I've had all of the aforementioned problems with an AWS GPU server, and for me it all seems to be down to GPU driver issues.
Getting CPU-only AWS servers crunching was never a problem: I just followed a guide on Medium on EC2 and Folding@home by Julien Simon, installing FAHClient and FAHControl plus a config.xml file, using the download links on the Folding@home support page for Linux manual installation.

However, getting the GPU going on a g4dn.xlarge was something else. I finally followed the exact steps mentioned in one of the responses to the guide mentioned above, and got it working with Amazon Linux 2. Running on the GPU makes a huge difference.

Now my only problem is that AWS is not responding to my 'Limit Increase: EC2 Instances' request for an additional G4 vCPU limit.
The first time it only took a couple of hours to get the increase of 4 vCPUs, but now I've waited more than 24 hours with nothing but the initial automatic reply.

Good luck everyone!

Re: aws1 and aws2.foldingathome.org both not responding

Posted: Tue May 05, 2020 9:53 am
by anandhanju
Hi Crunchtimer, welcome to Folding and to the forum.

The issues being discussed here in this topic relate to the two Folding@home work servers hosted on AWS, which were having connection issues when assigning work or accepting results. This doesn't involve running the Folding client (CPU or GPU) on AWS virtual machines, which I believe is what you're doing.

Re: aws1 and aws2.foldingathome.org both not responding

Posted: Tue May 05, 2020 5:20 pm
by Crunchtimer
Hi, yes, you're right, I didn't check closely enough.
However, with the wrong GPU drivers you will get the same error message even when the assignment servers are up.

I guess I jinxed everything posting here as my GPU has been unable to get assignments all day :(

Code:

17:07:11:WU00:FS01:Connecting to 128.252.203.10:8080
17:07:26:WU00:FS01:Downloading 50.73MiB
17:08:00:WU00:FS01:Download 0.74%
17:08:47:WU00:FS01:Connecting to 65.254.110.245:80
17:08:48:WARNING:WU00:FS01:Failed to get assignment from '65.254.110.245:80': Empty work server assignment
17:08:48:WU00:FS01:Connecting to 18.218.241.186:80
17:08:48:WARNING:WU00:FS01:Failed to get assignment from '18.218.241.186:80': Empty work server assignment
17:08:48:ERROR:WU00:FS01:Exception: Could not get an assignment
One of the servers is not even in the serverstats list.
What to do? Wait?

So setting up a GPU server for the first time today can't be easy...

Re: aws1 and aws2.foldingathome.org both not responding

Posted: Tue May 05, 2020 7:16 pm
by Neil-B
Crunchtimer wrote: One of the servers is not even in the serverstats list.
It is - but just under a different guise :) ... It is an Assignment Server see … viewtopic.php?f=18&t=34034&p=323083&hil ... ip#p323085

Re: aws1 and aws2.foldingathome.org both not responding

Posted: Tue May 05, 2020 7:38 pm
by Crunchtimer
Neil-B wrote:
Crunchtimer wrote: One of the servers is not even in the serverstats list.
It is - but just under a different guise :) ... It is an Assignment Server see … viewtopic.php?f=18&t=34034&p=323083&hil ... ip#p323085
Ah, I see, thanks! Well, now I've upgraded from 7.4.4 to 7.6.13, and it uses the FQDN assign1-4.foldingathome.org:80 instead of an IP.
It still failed to get an assignment until I killed the FahCore_a7 process, as it wouldn't let me control it anymore.
One reboot later and the GPU is working again, magic!

Can't send result - 3.133.76.19

Posted: Wed May 20, 2020 3:58 pm
by ppbering
Hi,
New French guy here, and I have an issue trying to send results to the 3.133.76.19 server.
I tried opening the address in Firefox and got no result either.
Is there a problem with this server?

Thanks

Here are the logs:

Code:

15:48:26:WU00:FS01:Upload 1.44%
15:48:26:WARNING:WU00:FS01:Exception: Failed to send results to work server: Transfer failed
15:54:04:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:14439 run:0 clone:1858 gen:24 core:0x22 unit:0x0000002803854c135ea0a3014b7d5a75
15:54:04:WU00:FS01:Uploading 78.07MiB to 3.133.76.19
15:54:04:WU00:FS01:Connecting to 3.133.76.19:8080
15:54:25:WARNING:WU00:FS01:WorkServer connection failed on port 8080 trying 80
15:54:25:WU00:FS01:Connecting to 3.133.76.19:80
15:54:46:WARNING:WU00:FS01:Exception: Failed to send results to work server: Failed to connect to 3.133.76.19:80: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
Regards

Re: Can't send result - 3.133.76.19

Posted: Wed May 20, 2020 5:30 pm
by ppbering
You can close this topic because for my client it's OK, the upload is complete.
Thanks guys.

Re: Can't send result - 3.133.76.19

Posted: Wed May 20, 2020 7:25 pm
by PantherX
Welcome to the F@H Forum ppbering,

I am glad that the issue is resolved for you. For future reference, when an F@H server is under load, it might not accept new connections, but your client will try to send the WU again later, so you can leave your client running. If the issue isn't resolved after several hours, you can search the forum for recent posts about the server in question.
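
The "try again later" behaviour described above can be pictured as a retry loop with a growing wait between attempts. The following is only an illustrative sketch: the client's real retry schedule is not stated in this thread, and the function name `send_with_retry`, the 60 second starting delay, the doubling, and the one hour cap are all assumptions.

```python
import time


def send_with_retry(send, max_attempts=5, base_delay=60.0, max_delay=3600.0,
                    sleep=time.sleep):
    """Call `send()` until it succeeds, waiting longer after each failure.

    `send` is any callable that raises OSError (for example a connection
    timeout) when the server does not accept the upload. The delay values
    are hypothetical and not taken from the Folding@home client.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except OSError:
            if attempt == max_attempts:
                raise  # give up and let the caller see the last error
            sleep(delay)
            delay = min(delay * 2, max_delay)  # back off, up to the cap
```

Backing off like this is why leaving the client running is usually enough: a busy server gets fewer repeated connection attempts, and the upload goes through once it has capacity again.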