Bad State Detected

Moderators: Site Moderators, FAHC Science Team

iceman1992
Posts: 527
Joined: Fri Mar 23, 2012 5:16 pm

Bad State Detected

Post by iceman1992 »

Is this just system instability? Or something with the WU?
Seems to continue just fine, now at 91%
Update: it has now been uploaded (WORK_ACK).

Code:

05:56:58:WU00:FS01:0x22:Completed 3600000 out of 8000000 steps (45%)
05:58:31:WU00:FS01:0x22:Completed 3680000 out of 8000000 steps (46%)
06:00:04:WU00:FS01:0x22:Completed 3760000 out of 8000000 steps (47%)
06:00:40:WU00:FS01:0x22:Bad State detected... attempting to resume from last good checkpoint. Is your system overclocked?
06:00:40:WU00:FS01:0x22:Following exception occured: Force RMSE error of 8.13081 with threshold of 5
06:01:50:WU00:FS01:0x22:Completed 3840000 out of 8000000 steps (48%)
06:03:24:WU00:FS01:0x22:Completed 3920000 out of 8000000 steps (49%)
06:04:57:WU00:FS01:0x22:Completed 4000000 out of 8000000 steps (50%)
06:06:30:WU00:FS01:0x22:Completed 4080000 out of 8000000 steps (51%)
PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: Bad State Detected

Post by PantherX »

It's a tough one... it could be a WU that's almost bad but not there yet (potentially the next one could be a bad WU). It could also be hardware related. If you haven't seen this message before and you have been folding for a while, then it is safe to ignore it. If you commonly see that message, then it could be related to your hardware.
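To judge whether the message is a one-off or a recurring problem, it can help to count how often it appears in the client log. The following is only a rough sketch: the two patterns are inferred from the log excerpt posted above, and `summarize_bad_states` is a made-up helper name, not part of the FAH client. Other client or core versions may format these lines differently.

```python
import re

# Patterns assumed from the log lines quoted in this thread.
BAD_STATE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):\S+:Bad State detected"
)
RMSE = re.compile(r"Force RMSE error of ([\d.]+) with threshold of ([\d.]+)")


def summarize_bad_states(lines):
    """Return (events, rmse_values) found in an iterable of log lines.

    events      -> list of (time, work unit, folding slot) tuples
    rmse_values -> list of (reported error, threshold) tuples
    """
    events = []
    rmse_values = []
    for line in lines:
        hit = BAD_STATE.match(line)
        if hit:
            events.append(hit.groups())
        err = RMSE.search(line)
        if err:
            rmse_values.append((float(err.group(1)), float(err.group(2))))
    return events, rmse_values


# Sample lines copied from the first post.
sample_log = [
    "06:00:04:WU00:FS01:0x22:Completed 3760000 out of 8000000 steps (47%)",
    "06:00:40:WU00:FS01:0x22:Bad State detected... attempting to resume "
    "from last good checkpoint. Is your system overclocked?",
    "06:00:40:WU00:FS01:0x22:Following exception occured: "
    "Force RMSE error of 8.13081 with threshold of 5",
    "06:01:50:WU00:FS01:0x22:Completed 3840000 out of 8000000 steps (48%)",
]

events, rmse_values = summarize_bad_states(sample_log)
print(len(events), "bad state event(s):", events)
print("reported errors:", rmse_values)
```

A single hit over weeks of folding fits the "safe to ignore" case. Repeated hits across different projects would point more towards hardware.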
iceman1992
Posts: 527
Joined: Fri Mar 23, 2012 5:16 pm

Re: Bad State Detected

Post by iceman1992 »

Okay, I'll ignore it. I don't remember ever seeing that before. Just thought I'd post in case the researchers need to be notified.
Joe_H
Site Admin
Posts: 7867
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Bad State Detected

Post by Joe_H »

Could be that this particular project/WU pushed your GPU a bit harder, and that was enough to cause an error. If you see this again and you have an overclock, factory or otherwise, try reducing it a little.

iceman1992
Posts: 527
Joined: Fri Mar 23, 2012 5:16 pm

Re: Bad State Detected

Post by iceman1992 »

Joe_H wrote: Could be that this particular project/WU pushed your GPU a bit harder, and that was enough to cause an error. If you see this again and you have an overclock, factory or otherwise, try reducing it a little.
Okay. Not an overclock, but a slight undervolt: GTX 1660 at 1860 MHz and 981 mV. It hasn't happened again though; I'll keep monitoring.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Bad State Detected

Post by bruce »

Every project has a range of acceptable errors. In this case, a value of 5 was considered acceptable, based on the project owner's estimate. He guessed wrong.

He could have allowed 10 and might still have gotten good answers. Alternatively, he could have reconfigured the project to run with mixed precision: the sum would then have been computed in FP64, so the error would have been much smaller, but the project would have run somewhat slower.

A lot depends on how many atoms contribute to that error sum.
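To illustrate why the precision of the sum and the number of terms matter, here is a small sketch that adds up a million identical FP32 values, once with an FP32 running total and once with an FP64 running total. It is not how the FAH core computes its Force RMSE. The helper names (`to_f32`, `sum_fp32`, `sum_fp64`) and the test values are invented for the demonstration. Only the final comparison uses the two numbers from the log above.

```python
import math
import struct


def to_f32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_fp32(values):
    """Accumulate with the running total rounded to FP32 after every add."""
    total = 0.0
    for v in values:
        total = to_f32(total + v)
    return total


def sum_fp64(values):
    """Accumulate with an ordinary FP64 running total."""
    total = 0.0
    for v in values:
        total += v
    return total


# One million identical terms, each already an FP32 value (illustrative only).
terms = [to_f32(0.1)] * 1_000_000

reference = math.fsum(terms)  # correctly rounded sum, used as ground truth
err32 = abs(sum_fp32(terms) - reference)
err64 = abs(sum_fp64(terms) - reference)

print("FP32 accumulation error:", err32)
print("FP64 accumulation error:", err64)

# The check reported in the log: an error of 8.13081 against a threshold of 5.
reported_error, threshold = 8.13081, 5.0
print("exceeds threshold:", reported_error > threshold)
```

The FP32 running total drifts by far more than the FP64 one, and the drift grows with the number of terms. That is the sense in which the number of atoms contributing to the sum matters.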