[Bug] Client got stuck after initiating connection to WS

Postby Frogging101 » Sat Apr 18, 2020 5:30 pm

My FAHClient has been stuck since 21:32:00 UTC, yesterday (April 17, 2020).

The last lines in the log were as follows:
Code:
21:32:00:WU00:FS00:Connecting to
21:32:00:WU00:FS00:Assigned to work server
21:32:00:WU00:FS00:Requesting new work unit for slot 00: READY cpu:24 from
21:32:00:WU00:FS00:Connecting to

Upon inspection, it appears that this connection is still in the ESTABLISHED state, some 15 hours later:

Code:
Netid  State      Recv-Q Send-Q Local Address:Port                 Peer Address:Port               
tcp    ESTAB      0      0                    users:(("FAHClient",pid=30164,fd=12))

I would venture a theory that this connection is not actually still active; the connection simply died (without an RST) at a specific time when the client was waiting for a response, which it will never receive.
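
To illustrate the theory: a blocking read on a connection whose peer has gone silent will wait indefinitely unless the caller sets a read timeout or the kernel is asked to probe the link with TCP keepalive. The following is a minimal Python sketch of both safeguards, not FAHClient's actual code; the function names `enable_keepalive` and `read_with_timeout` and the local listener that never replies are hypothetical stand-ins.

```python
import socket


def enable_keepalive(sock):
    """Ask the kernel to probe an idle connection so a dead peer is eventually noticed."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


def read_with_timeout(host, port, timeout_s):
    """Connect, then wait at most timeout_s seconds for a response.

    Returns the bytes received, or None if the peer stayed silent.
    """
    with socket.create_connection((host, port), timeout=timeout_s) as sock:
        enable_keepalive(sock)
        sock.settimeout(timeout_s)  # bound the wait for the reply
        try:
            return sock.recv(4096)
        except socket.timeout:
            return None  # give up instead of blocking forever


if __name__ == "__main__":
    # Hypothetical silent peer: a local listener that never sends anything.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]
    print(read_with_timeout("127.0.0.1", port, 0.5))
    srv.close()
```

With a timeout in place the caller gets control back and can retry or pick another server; without one, the socket can sit in ESTABLISHED for as long as the process lives.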

I could not clear this condition by pausing and unpausing, or by using the request-id or request-ws commands. request-ws did connect to an AS and get an "Assigned to work server" message, but nothing else happened after that. The original socket to the work server remained open in the ESTABLISHED state throughout these attempts to jog the client.

The FAHControl UI continued to display as the work server, with no next attempt. Here's the full queue-info output:
Code:
  {"id": "00", "state": "DOWNLOAD", "error": "NO_ERROR", "project": 0, "run": 0, "clone": 0, "gen": 0, "core": "unknown", "unit": "0x00000000000000000000000000000000", "percentdone": "0.00%", "eta": "0.00 secs", "ppd": "0", "creditestimate": "0", "waitingon": "", "nextattempt": "0.00 secs", "timeremaining": "unknown time", "totalframes": 0, "framesdone": 0, "assigned": "<invalid>", "timeout": "<invalid>", "deadline": "<invalid>", "ws": "", "cs": "", "attempts": 0, "slot": "00", "tpf": "0.00 secs", "basecredit": "0"}                             
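
For anyone wanting to spot this state from a script, the sketch below checks a queue-info entry for the symptoms shown above: state DOWNLOAD, zero attempts, and no next attempt scheduled. The function name `looks_stuck` and the rule itself are assumptions drawn from this one report, not anything documented by the client. A slot that has only just begun downloading would match too, so the check only means something if the state persists.

```python
import json


def looks_stuck(entry):
    """Heuristic (assumed): downloading, with no attempts made and no retry scheduled."""
    return (
        entry.get("state") == "DOWNLOAD"
        and entry.get("attempts") == 0
        and entry.get("nextattempt") == "0.00 secs"
    )


# Abbreviated copy of the queue-info entry from the post above.
sample = (
    '[{"id": "00", "state": "DOWNLOAD", "error": "NO_ERROR", '
    '"attempts": 0, "nextattempt": "0.00 secs", "slot": "00"}]'
)

stuck_slots = [e["slot"] for e in json.loads(sample) if looks_stuck(e)]
print(stuck_slots)  # ['00']
```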

I had to restart to get it folding again. Sending SIGINT once did not close the client; the socket remained open. I had to send it again to force exit.

Re: [Bug] Client got stuck after initiating connection to WS

Postby Jan » Sat Apr 18, 2020 8:54 pm

I know it sounds like a bit of a weird solution, but here is a report that might help. If you are already on the newest client, 7.6.9, maybe try a reboot. If that doesn't work, here is another approach.

Re: [Bug] Client got stuck after initiating connection to WS

Postby PantherX » Sat Apr 18, 2020 10:46 pm

It seems that you have encountered this known bug: https://github.com/FoldingAtHome/fah-issues/issues/983

I will add this link too.
