The annoying "restart" incident.


Postby Ibringapples » Wed May 27, 2020 4:26 am

Hello all,

Due to the shortage of GPU WUs, I have to restart the Linux FAH client.

When I do, it is quite frustrating because the service just doesn't obey me.

I have to kill the process to try to restart it. Or, when I finally manage to stop it, the service comes back up by itself. So I don't feel I have control over this service.

Does anyone have a good way to do it?

Thanks a lot.:)
Ibringapples
 
Posts: 42
Joined: Fri Apr 10, 2020 4:53 am

Re: The annoying "restart" incident.

Postby JimboPalmer » Wed May 27, 2020 5:24 am

I have not noticed a shortage of GPU WUs since about May 12. I run Windows boxes, no disease specified, 3 Nvidia GPUs: two Pascals and a Turing. No Beta, no Advanced.

Is there a chance you are restricting your WUs in some way?

Here is how to post your log:

viewtopic.php?f=24&t=26036
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
JimboPalmer
 
Posts: 1954
Joined: Mon Feb 16, 2009 5:12 am
Location: Greenwood MS USA

Re: The annoying "restart" incident.

Postby NRT_AntiKytherA » Wed May 27, 2020 10:47 am

I have to kill the process to try to restart it. Or, when I finally manage to stop it, the service comes back up by itself. So I don't feel I have control over this service.

Does anyone have a good way to do it?


Simplest would be to restart your machine, which will signal the client to terminate gracefully, preserving any running CPU work unit up to the last save point.

More complex: restart the service and FAHClient using systemd. In any case, pay attention to bruce's thoughts on this subject:

bruce wrote:
pcwolf wrote:When I become impatient waiting for WU downloads (i.e. considerable minutes/hours passing not Folding) I have found if I go to Manjaro System Settings and go to the SystemD tab, I can restart the "foldingathome.service" and when both the service and F@H Client return ... *BOOM* I immediately receive a new WU. :D This behavior is consistent and repeatable. I have two GPUs Folding and the previously engaged slot goes immediately back to a checkpoint and resumes flawlessly.


You may (or may not) be guilty of biased perception. Restarting the service does initiate a fresh attempt to get work rather than waiting up to an hour for the next automatic attempt, but I know of no reason why the restart would be any more likely to succeed than if the next attempt was initiated by the timer. It would seem most likely that the client simply says to the server "I'm asking for a new work unit for my hardware ( ... description)" rather than the request being equivalent to "I'm asking again for a new work unit for my hardware ( ... description)". Why would the "again" message (if it's there) actually reduce your chances of getting a new assignment?
NRT_AntiKytherA
 
Posts: 111
Joined: Mon May 11, 2020 12:50 am

Re: The annoying "restart" incident.

Postby Ibringapples » Wed May 27, 2020 2:05 pm

Hello,

You're right...

The WUs are simply not running properly.

Here are the logs:

https://pastebin.com/cq0qB5Fd


Can you help me?

Thanks a lot. :)

---update--- 01

Now the only one that is not running is the CPU WU :!: :?:
But it's unstable. Suddenly 2 days :?:

---update--- 02

Now all are running, but I've lost 2 of the 4 CPUs I have.

Code:
~$ nproc --all
4
:?:
Last edited by Ibringapples on Wed May 27, 2020 3:01 pm, edited 2 times in total.
Ibringapples
 
Posts: 42
Joined: Fri Apr 10, 2020 4:53 am

Re: The annoying "restart" incident.

Postby Ibringapples » Wed May 27, 2020 2:22 pm

NRT_AntiKytherA wrote:
I have to kill the process to try to restart it. Or, when I finally manage to stop it, the service comes back up by itself. So I don't feel I have control over this service.

Does anyone have a good way to do it?


Simplest would be to restart your machine, which will signal the client to terminate gracefully, preserving any running CPU work unit up to the last save point.


But rebooting the machine could be a problem because I have other services running on it.

Then, maybe with systemd? I have OpenRC and systemd but I prefer OpenRC...

Thanks. :)
Ibringapples
 
Posts: 42
Joined: Fri Apr 10, 2020 4:53 am

Re: The annoying "restart" incident.

Postby bruce » Sun May 31, 2020 3:15 am

Ibringapples wrote:Now the only one that is not running is the CPU WU :!: :?:
But it's unstable. Suddenly 2 days :?:

Now all are running, but I've lost 2 of the 4 CPUs I have.

Each GPU requires one CPU thread to send and receive data between main RAM and the GPU. With 2 GPUs and 4 CPUs you can fold with the remaining two.
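
The arithmetic above can be sketched in shell. This is only an illustration: `free_cpu_threads` is a hypothetical helper name, not part of FAHClient, and the numbers are the ones from this thread (4 threads from `nproc --all`, 2 GPU slots).

```shell
#!/bin/sh
# Hypothetical helper (not part of FAHClient): CPU threads left for
# CPU folding after reserving one thread per GPU slot.
free_cpu_threads() {
    total=$1   # total CPU threads, e.g. the output of `nproc --all`
    gpus=$2    # number of GPU slots being fed
    left=$((total - gpus))
    # Clamp at zero so more GPUs than threads never yields a negative count.
    if [ "$left" -lt 0 ]; then
        left=0
    fi
    echo "$left"
}

free_cpu_threads 4 2   # the poster's machine: 4 threads, 2 GPUs
```

On the machine in this thread that leaves 2 threads for the CPU slot, which matches the "lost 2 CPUs" observation.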
bruce
 
Posts: 19649
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.

Re: The annoying "restart" incident.

Postby MeeLee » Sun May 31, 2020 6:40 am

You can safely run a script to use systemd to restart the service.
You can also use ssh to start it remotely.
Supposedly fahcontrol has a way to connect to a remote client.
MeeLee
 
Posts: 923
Joined: Tue Feb 19, 2019 11:16 pm

Re: The annoying "restart" incident.

Postby Ibringapples » Mon Jun 01, 2020 4:40 pm

MeeLee wrote:You can safely run a script to use systemd to restart the service.
You can also use ssh to start it remotely.
Supposedly fahcontrol has a way to connect to a remote client.


Actually no...


Code:
~$ sudo /etc/init.d/FAHClient restart
Stopping fahclient ... OK
Starting fahclient ... FAIL


That's the awful issue here ...

:?:
Ibringapples
 
Posts: 42
Joined: Fri Apr 10, 2020 4:53 am

Re: The annoying "restart" incident.

Postby bruce » Mon Jun 01, 2020 11:15 pm

MeeLee wrote:You can safely run a script to use systemd to restart the service.

If you're running one or more CPU-based slots (FAHCore_a7), that's not true.

Unfortunately, there's a bug in FAHCore_a7 which fails to sync its open files before shutting down. You have to pause all CPU slots and give them time to close their files.
bruce
 
Posts: 19649
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.

Re: The annoying "restart" incident.

Postby MeeLee » Thu Jun 04, 2020 1:32 am

I was under the assumption that you needed to use systemd for restarts.

Or, perhaps try Fahclient stop, and on another line fahclient start.
MeeLee
 
Posts: 923
Joined: Tue Feb 19, 2019 11:16 pm

Re: The annoying "restart" incident.

Postby bruce » Fri Jun 05, 2020 5:09 am

When I PAUSE a FAHCore_a7 WU, I can watch it process for a bit before it reports that it has completed the stopping process. I have not evaluated whether that time varies with the project, but I'd guess that it might. You need to allow at least that long before restarting, whether or not you use systemd. I have not heard if the bug will be fixed in the next version of the FAHCore, but I sure hope so.
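
A rough sketch of that pause, wait, then restart sequence follows. Several things here are assumptions, not confirmed in this thread: that your v7 client accepts `FAHClient --send-pause`, that the init script is `/etc/init.d/FAHClient` (the path shown earlier in the thread), and that 30 seconds is long enough for the core to close its files. The `run` and `safe_restart` names are made up for this example. It defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Sketch only: pause folding, give the cores time to close their files,
# then restart the service.
# ASSUMPTIONS: `FAHClient --send-pause` is available on your client,
# the init script path is /etc/init.d/FAHClient, and WAIT_SECS is enough.
DRY_RUN=${DRY_RUN:-1}       # 1 = only print the commands, 0 = really run them
WAIT_SECS=${WAIT_SECS:-30}  # guessed delay; watch your log to find a real value

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

safe_restart() {
    run FAHClient --send-pause              # ask all slots to pause
    run sleep "$WAIT_SECS"                  # let the cores finish stopping
    run sudo /etc/init.d/FAHClient restart  # then restart the service
}

safe_restart
```

Run it as is to see what it would do, and set `DRY_RUN=0` only after checking each command against your own installation. Whether the slots resume on their own after the restart is not verified here.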
bruce
 
Posts: 19649
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.

Re: The annoying "restart" incident.

Postby MeeLee » Fri Jun 05, 2020 7:38 pm

I never had any issues on my system using the 'restart' function in the terminal.
However, you could use the 'sleep' command to pause the script for a set number of seconds before moving on to the next command.
E.g.:
Code:
sudo /etc/init.d/fahclient stop
sleep 5
sudo /etc/init.d/fahclient start
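
As a variation on the fixed `sleep 5`, the script could poll until the client process has actually gone before starting it again. This is a minimal sketch under assumptions: `wait_until_gone` is a hypothetical helper, it relies on `pgrep` being installed, and the process name to wait for (`FAHClient` in the commented example) should be checked with `ps` on your own system.

```shell
#!/bin/sh
# wait_until_gone NAME TIMEOUT
# Polls once per second until pgrep finds no process named exactly NAME.
# Returns 0 when none is found, 1 if TIMEOUT seconds pass first.
# Note: if pgrep itself is missing, the loop ends at once and returns 0.
wait_until_gone() {
    name=$1
    timeout=$2
    waited=0
    while pgrep -x "$name" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example use (commented out so that pasting this changes nothing):
# sudo /etc/init.d/fahclient stop
# wait_until_gone FAHClient 60 && sudo /etc/init.d/fahclient start
```

The timeout keeps the script from hanging forever if the client never exits, in which case the start command is skipped.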
MeeLee
 
Posts: 923
Joined: Tue Feb 19, 2019 11:16 pm

