99.99% and stuck

Moderators: Site Moderators, FAHC Science Team

Nuitari
Posts: 80
Joined: Sun Jun 09, 2019 4:03 am
Hardware configuration: 1x Nvidia 1050ti
1x Nvidia 1660Super
1x Nvidia GTX 660
1x Nvidia 1060 3gb
1x AMD rx570
2x AMD rx560
1x AMD Ryzen 7 PRO 1700
1x AMD Ryzen 7 3700X
1x AMD Phenom II
1x AMD A8-9600
1x Intel i5-4590S

Re: 99.99% and stuck

Post by Nuitari »

You should plan for 2 GB of RAM per GPU slot, plus whatever else is running, as some projects will use that much.
The upload is essentially the client telling the server that the work got dumped.

It is hard to say what is going on from the information given. You could install atop, which keeps a snapshot of system usage every 10 minutes, so you could spot any unusual situation.
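The 2 GB per GPU slot rule of thumb can be sketched as a quick calculation. This is only an illustration: the function name `ram_budget_gb`, its parameters, and the example rig are hypothetical, not part of the FAH client.

```python
def ram_budget_gb(gpu_slots, other_gb=0.0, per_slot_gb=2.0):
    """Rough RAM budget in GB: per_slot_gb for each GPU slot plus other usage.

    per_slot_gb defaults to the 2 GB rule of thumb from this post;
    other_gb is an assumed figure for the OS and anything else running.
    """
    if gpu_slots < 0 or other_gb < 0 or per_slot_gb < 0:
        raise ValueError("inputs must be non-negative")
    return gpu_slots * per_slot_gb + other_gb


if __name__ == "__main__":
    # Hypothetical rig: 4 GPU slots, about 3 GB for the OS and other programs.
    print(ram_budget_gb(4, other_gb=3.0))
```

If the result is close to or above the installed RAM, memory pressure is worth ruling out before blaming the GPUs.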
CoronaCrusher
Posts: 1
Joined: Thu Apr 30, 2020 3:52 pm

Re: 99.99% and stuck

Post by CoronaCrusher »

I'm finding that, for me, it is always one particular type of work unit that stalls and gets stuck at 99.99%. It's from Project 14436 (578, 4, 11) with a base credit of 53000, although I can't verify that the numbers in parentheses are the same each time. I've had at least 10 WUs like that stall. Sometimes I can restart the computer and get it to revert to an old checkpoint, often around 35 - 45%, but half the time I have to delete the WU to force my GPU to move on to a new one. Otherwise it will sit at 99.99% for days.

That type of WU doesn't always fail to complete on my system, but when I do get one that's stuck, it's ALWAYS that WU. I'm running an i5-8600K @ 4.7GHz and dual liquid-cooled GTX 1080s in SLI running at 2050MHz. I've run into this issue at base clock as well, and with different drivers, so there has to be something different about that particular WU that causes it to fail fairly often.
qe4
Posts: 6
Joined: Mon Apr 27, 2020 12:22 pm

Re: 99.99% and stuck

Post by qe4 »

A little update... I tried underclocking all GPUs, but that did not really help. I then disabled my Hawaii GPU and one 8GB RX480 that I suspected might be causing issues. Since then it has been running fine. There is still one failed WU from the Hawaii GPU that the client is attempting (but failing) to upload. But I am pretty happy that I got 2 of my RX480s running stable.
Joe_H
Site Admin
Posts: 7870
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: 99.99% and stuck

Post by Joe_H »

@CoronaCrusher - please post the log for one of these WUs that reaches 99.99%, along with the beginning section of the log that shows the system info and client configuration.

The Welcome topic - viewtopic.php?f=66&t=26036 - has directions on finding and posting your log file.

However, most likely Project 14436 WUs push your GPU harder than those from other projects, and any overclock, factory or otherwise, needs reducing.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
1TM
Posts: 12
Joined: Sat Mar 28, 2020 5:22 am

Re: 99.99% and stuck

Post by 1TM »

qe4 wrote: issue: They get stuck at 99.99% and don't upload.
- How do I cancel/remove a WU that is stuck?

I also had a unit stuck at 99.99%. What seems to have helped was:
1. Press Finish.
2. Wait for the other WUs (if running several) to reach the nearest save point, such as 10%, 20%, ... 90%.
3. Shut down the PC and switch off the power supply if it has a switch.
4. Restart. FAHControl was able to find the checkpoint files and resume all WU runs (the one that had been at 99% resumed at 30%).