GPU error.

Moderators: Site Moderators, FAHC Science Team

MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

GPU error.

Post by MeeLee »

Is there a way we can implement in the client some sort of reset button in FAHControl?
When FAH is configured for a given number of graphics cards and one of them fails to load, or is removed for whatever reason, FAHControl gives an error, and it's impossible to access the slots list to remove the corresponding GPU slot.

I spent about 20 minutes trying to find that one post on how to reset/reconfigure the client, without success.
I even reinstalled the client and control.
I believe in Windows that should do it, but not in Linux.

In the end, I had to plug in a spare graphics card I had lying around, just to access the slots list and remove a slot.

It would be better if the slots list could be accessed even without internet access, or when a card has to be sent in for RMA. Not everyone has an extra card lying around.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: GPU error.

Post by bruce »

Please post your log showing the error. I've removed a slot and don't remember an error.

The only way to change the hardware or to add a GPU is to reinstall FAHClient. That's the only time it detects which GPU(s) you have and creates the slots to match their characteristics.
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: GPU error.

Post by MeeLee »

The issue is not removing a slot in software.
The issue arises when a graphics card is removed from a PCIe slot without adjusting the software first, and the software is then started again.
If a hardware failure forces a card to be removed, Linux users who forgot (or were unable) to remove the GPU slot BEFORE rebooting will have a tough time getting FAH to work, short of finding a way to reset it.
Sometimes a hardware fault also causes an automatic reboot or a crash, which forces the user to restart FAH once the faulty hardware has been dealt with.

When the system reboots, FAH gets stuck at startup, showing a message along the lines of:
"...client "local" 127.0.0.1:36330: Option 'gpu-index' has no default and is not set.." and something more...
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: GPU error.

Post by bruce »

I don't understand removing the GPU. If rebooting resets it, you have no problem.

If rebooting cannot get the GPU running again, you're suggesting it's trash or RMA time. In either case, FAH cannot proceed so you might as well disable it or uninstall it until you have a working GPU again. Recovering the in-process WU is not possible unless you replace the GPU with an identical GPU that works.

There must be something I'm missing.
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: GPU error.

Post by MeeLee »

Well,
Nvidia sees 4 cards.
System is shut down.
1 card is removed.
Now nvidia is seeing 3 cards.
OS and everything else have no problem with there only being 3 cards.
Except for FAH, which seems like it's looking for the 4th card.
The settings can't be accessed.
Neither rebooting nor reinstalling works.
Someone on this forum recently posted how to reset the client in Linux by typing something in the terminal. After that I could enter my username, passkey and team again, and FAHControl would start with the stock configuration (with 1 CPU).
Then it was just a matter of adding GPU slots again.
I don't remember what that command was. I should have written it down when I saw it.
des1957
Posts: 37
Joined: Fri Jan 04, 2013 3:20 pm

Re: GPU error.

Post by des1957 »

I have experienced this error in both Windows and Linux. The folding software itself is folding fine; the error is in the control panel software. The only fix I have found is to manually remove the slot from the config file. If you have 4 GPUs and physically remove one, the config file will still list 4 GPU slots. You need to edit the file, remove the last GPU slot, save the file, and restart FAH or reboot your system. I run 3 rigs with 4 GPUs each and have run into this issue many times. Good luck
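For anyone who would rather script that edit than do it by hand, here is a minimal Python sketch. It assumes config.xml stores each slot as a `<slot id='...' type='GPU'/>` element directly under the root element. That layout, the `SAMPLE_CONFIG` contents and the `remove_last_gpu_slot` helper are all illustrative assumptions, not something confirmed in this thread, so check your own file first and keep a backup.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real config.xml may contain other elements and attributes.
SAMPLE_CONFIG = """<config>
  <user v='Anonymous'/>
  <slot id='0' type='CPU'/>
  <slot id='1' type='GPU'/>
  <slot id='2' type='GPU'/>
</config>"""


def remove_last_gpu_slot(xml_text: str) -> str:
    """Return xml_text with the last top-level <slot type='GPU'> element removed.

    If there is no GPU slot, the input is returned unchanged.
    """
    root = ET.fromstring(xml_text)
    # Collect GPU slots in document order (direct children of the root only).
    gpu_slots = [s for s in root.findall("slot")
                 if s.get("type", "").upper() == "GPU"]
    if not gpu_slots:
        return xml_text
    root.remove(gpu_slots[-1])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(remove_last_gpu_slot(SAMPLE_CONFIG))
```

The function works on a string rather than on the live file, so you can run it on a copy and compare the result before replacing anything. Note that `ElementTree` drops XML comments when parsing, so any comments in the original file will not survive the round trip.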
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: GPU error.

Post by foldy »

Yes, I can reproduce the issue on Windows 7. This is a FAHControl bug: https://github.com/FoldingAtHome/fah-issues/issues/1274
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: GPU error.

Post by MeeLee »

@foldy,
in Windows, is the issue resolved by reinstalling the software?
Because in Linux, the configuration file is left behind somewhere.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: GPU error.

Post by bruce »

Allowing the installer to re-create a configuration whose slot(s) match the hardware is a straightforward way of dealing with the problem.

We do not recommend manually editing config.xml, but it's possible to delete the offending slot and replace it with a slot configured for whatever supportable hardware is on your system.
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: GPU error.

Post by MeeLee »

It would be easier to have a simple option to start from scratch, meaning zero slots in use, or to access the config, even without internet.
I don't know why an internet connection is required to change the config and slots, or to continue running a WU that's still good to fold...
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: GPU error.

Post by bruce »

I'm not sure why having an internet connection is a concern. Whenever you reconfigure your hardware (like removing a GPU and adding a new one), you have to re-validate that the drivers are current, and then you'll immediately need to download a new WU. The WUs are closely tied to whatever hardware is going to process them. A WU that was obtained to be processed by GPU type X will need to be returned or dumped, and you can't expect to be able to continue processing it on GPU Y. An internet connection will be required.
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: GPU error.

Post by foldy »

On my Ubuntu Linux the FAH config file is here: /etc/init.d/config.xml
bollix47
Posts: 2941
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: GPU error.

Post by bollix47 »

foldy wrote: On my Linux Ubuntu the fah config file is here /etc/init.d/config.xml
Interesting ... all my Ubuntu setups show my configuration file is in /etc/fahclient/config.xml
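Since the location seems to vary between installs, a small Python sketch like the one below can check both paths reported in this thread. The `find_config` helper and the `CANDIDATE_PATHS` tuple are hypothetical names used for illustration, and the list covers only the two locations mentioned here; other installs may keep the file elsewhere.

```python
from pathlib import Path
from typing import Iterable, Optional

# The two locations reported in this thread; other installs may differ.
CANDIDATE_PATHS = ("/etc/fahclient/config.xml", "/etc/init.d/config.xml")


def find_config(candidates: Iterable[str] = CANDIDATE_PATHS) -> Optional[Path]:
    """Return the first candidate that exists as a regular file, else None."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_config()
    print(found if found else "config.xml not found in the listed locations")
```

You can pass your own list of paths to `find_config` if your distribution or install method puts the file somewhere else.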