
Bad State Detected

Posted: Mon Apr 27, 2020 7:10 am
by iceman1992
Is this just system instability? Or something with the WU?
Seems to continue just fine, now at 91%
Update: now has been uploaded, WORK_ACK.

Code:

05:56:58:WU00:FS01:0x22:Completed 3600000 out of 8000000 steps (45%)
05:58:31:WU00:FS01:0x22:Completed 3680000 out of 8000000 steps (46%)
06:00:04:WU00:FS01:0x22:Completed 3760000 out of 8000000 steps (47%)
06:00:40:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
06:00:40:WU00:FS01:0x22:Following exception occured: Force RMSE error of 8.13081 with threshold of 5
06:01:50:WU00:FS01:0x22:Completed 3840000 out of 8000000 steps (48%)
06:03:24:WU00:FS01:0x22:Completed 3920000 out of 8000000 steps (49%)
06:04:57:WU00:FS01:0x22:Completed 4000000 out of 8000000 steps (50%)
06:06:30:WU00:FS01:0x22:Completed 4080000 out of 8000000 steps (51%)

Re: Bad State Detected

Posted: Mon Apr 27, 2020 7:37 am
by PantherX
It's a tough one... it could be a WU that's almost bad but not there yet (potentially the next one could be a bad WU). It could also be hardware related. If you haven't seen this message before and you have been folding for a while, then it is safe to ignore it. If you commonly see that message, then it could be related to your hardware.
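To judge whether the message is a one-off or something you "commonly see", it can help to count how often it shows up in your log. Below is a minimal sketch, assuming the log lines follow the same `HH:MM:SS:WUxx:FSxx:0xNN:message` layout as the excerpt in the first post. The function name `count_bad_states` and the regular expression are my own illustration, not part of the client.

```python
import re
from collections import Counter

# Assumed line layout, taken from the excerpt in the first post:
#   HH:MM:SS:WUxx:FSxx:0xNN:message
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):(0x[0-9a-fA-F]+):(.*)$"
)

def count_bad_states(lines):
    """Count 'Bad State detected' messages per folding slot (e.g. 'FS01')."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and "Bad State detected" in match.group(5):
            counts[match.group(3)] += 1
    return counts

# Three lines copied from the log excerpt in the first post.
sample = [
    "06:00:04:WU00:FS01:0x22:Completed 3760000 out of 8000000 steps (47%)",
    "06:00:40:WU00:FS01:0x22:Bad State detected... attempting to resume "
    "from last good checkpoint. Is your system overclocked?",
    "06:00:40:WU00:FS01:0x22:Following exception occured: "
    "Force RMSE error of 8.13081 with threshold of 5",
]

for slot, n in count_bad_states(sample).items():
    print(f"{slot}: {n} bad state(s)")
```

Feed it the lines of your own log instead of `sample`. One hit over a long folding history matches the "safe to ignore" case; repeated hits on the same slot point more toward hardware.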

Re: Bad State Detected

Posted: Mon Apr 27, 2020 8:21 am
by iceman1992
Okay, I'll ignore it. I don't remember ever seeing that before. Just thought I'd post in case the researchers need to be notified.

Re: Bad State Detected

Posted: Mon Apr 27, 2020 2:58 pm
by Joe_H
Could be that this particular project/WU pushed your GPU a bit harder, and that was enough to cause an error. If you see this again and you have an overclock, factory or otherwise, try reducing it by a little.

Re: Bad State Detected

Posted: Mon Apr 27, 2020 3:15 pm
by iceman1992
Joe_H wrote: Could be that this particular project/WU pushed your GPU a bit harder, and that was enough to cause an error. If you see this again and you have an overclock, factory or otherwise, try reducing it by a little.
Okay. Not an overclock, but a slight undervolt: GTX 1660 at 1860 MHz and 981 mV. It hasn't happened again though, so I'll keep monitoring.

Re: Bad State Detected

Posted: Sat Jul 04, 2020 7:13 am
by bruce
Every project has a range of acceptable errors. In this case, the value 5 was considered acceptable, based on the project owner's estimate. He guessed wrong.

He could have allowed 10 and might have gotten good answers. He could also have reconfigured the project to run with mixed precision, in which case the sum would have been computed in FP64: the error would have been much smaller, and the project would have run somewhat slower.

A lot depends on how many atoms contribute to that error sum.
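
To make the threshold idea concrete, here is a rough sketch of a check of this kind. It is not the core's actual code: the per-atom RMSE definition, the function names `force_rmse` and `exceeds_threshold`, and the use of `math.fsum` as a stand-in for a higher-precision sum are all assumptions for illustration. Only the numbers 8.13081, 5 and 10 come from this thread.

```python
import math

def force_rmse(forces_a, forces_b):
    """RMSE between two force sets, each a list of (fx, fy, fz) per atom.

    Assumed definition: square root of the mean, over atoms, of the
    squared magnitude of the per-atom force difference.
    """
    if not forces_a or len(forces_a) != len(forces_b):
        raise ValueError("force lists must be non-empty and equal in length")
    squared = [
        sum((a - b) ** 2 for a, b in zip(fa, fb))
        for fa, fb in zip(forces_a, forces_b)
    ]
    # math.fsum keeps rounding error in the sum small, standing in here
    # for accumulating in higher precision.
    return math.sqrt(math.fsum(squared) / len(squared))

def exceeds_threshold(rmse, threshold):
    """True when the error is above the project's accepted limit."""
    return rmse > threshold

# Values from the log in the first post and from the discussion above.
print(exceeds_threshold(8.13081, 5.0))   # True: reported as a bad state
print(exceeds_threshold(8.13081, 10.0))  # False: would have passed at 10
```

Every atom adds a term to that sum, so on a large system the rounding error of a low-precision sum has more room to build up.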