astrorob wrote:without shutting down rosetta a 13416 finished OK on the RX5500. another 13416 is now running on the RX5500 and the frame times are back into the 6 minute range. i don't know if they tweaked it or it's just the natural variation in these WUs. clearly though looking back thru HFM i can see that i've had a lot of 13416 failures on the ATI GPU (and actually some failures on nvidia too...)
14:43:37:WU01:FS01:0x22:Completed 250000 out of 1000000 steps (25%)
14:46:17:WU01:FS01:0x22:Completed 260000 out of 1000000 steps (26%)
14:48:57:WU01:FS01:0x22:Completed 270000 out of 1000000 steps (27%)
14:51:41:WU01:FS01:0x22:Completed 280000 out of 1000000 steps (28%)
14:52:19:WU01:FS01:0x22:An exception occurred at step 282123: Particle coordinate is nan
14:52:19:WU01:FS01:0x22:Max number of attempts to resume from last checkpoint (2) reached. Aborting.
14:52:19:WU01:FS01:0x22:ERROR:114: Max number of attempts to resume from last checkpoint reached.
14:52:19:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
14:52:19:WU01:FS01:0x22:Saving result file globals.csv
14:52:19:WU01:FS01:0x22:Saving result file science.log
14:52:19:WU01:FS01:0x22:Saving result file state.xml.bz2
14:52:19:WU01:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT
14:52:20:WARNING:WU01:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
14:52:20:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13416 run:717 clone:92 gen:1 core:0x22 unit:0x0000000712bc7d9a5f02af9cd79c7560
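For anyone who wants to read frame times straight from a log excerpt like the one above rather than from HFM, here is a rough Python sketch. It is not part of the FAH client or HFM. The function name `frame_times` and the regular expression are illustrative assumptions based only on the line format shown in this log. It also assumes the lines all come from a single WU and slot.

```python
import re

# Pattern inferred from the log lines above (an assumption, not an official format spec):
# HH:MM:SS:WUnn:FSnn:0xNN:Completed X out of Y steps (P%)
PROGRESS = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}):WU\d+:FS\d+:0x[0-9a-fA-F]+:"
    r"Completed (\d+) out of (\d+) steps \((\d+)%\)"
)

def frame_times(lines):
    """Return the seconds elapsed between consecutive 'Completed ...' lines."""
    stamps = []
    for line in lines:
        m = PROGRESS.match(line.strip())
        if m:
            h, mi, s = (int(m.group(i)) for i in (1, 2, 3))
            stamps.append(h * 3600 + mi * 60 + s)
    # Modulo one day so a log that crosses midnight still gives positive gaps.
    return [(b - a) % 86400 for a, b in zip(stamps, stamps[1:])]

log = """\
14:43:37:WU01:FS01:0x22:Completed 250000 out of 1000000 steps (25%)
14:46:17:WU01:FS01:0x22:Completed 260000 out of 1000000 steps (26%)
14:48:57:WU01:FS01:0x22:Completed 270000 out of 1000000 steps (27%)
14:51:41:WU01:FS01:0x22:Completed 280000 out of 1000000 steps (28%)
14:52:19:WU01:FS01:0x22:An exception occurred at step 282123: Particle coordinate is nan
"""

times = frame_times(log.splitlines())
print(times)  # [160, 160, 164]
```

On this excerpt the gaps come out at roughly 2 minutes 40 seconds per 1% frame. Lines that are not progress reports, such as the NaN exception, are simply skipped.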
LazyDev wrote:I have SMT on my Ryzen 7 3700X disabled, only pushing 900K PPD on an RTX 2080 Super, with the WU having access to a full CPU core. Something is wrong with this WU.
kiore wrote:astrorob wrote:without shutting down rosetta a 13416 finished OK on the RX5500. another 13416 is now running on the RX5500 and the frame times are back into the 6 minute range. i don't know if they tweaked it or it's just the natural variation in these WUs. clearly though looking back thru HFM i can see that i've had a lot of 13416 failures on the ATI GPU (and actually some failures on nvidia too...)
There is still a large variation between units on 13416, so I think that variation is what you are seeing.
bruce wrote:FAHCore_22 does sometimes use more than one CPU thread per GPU. Studies have shown that FAHCore_22 can speed up the throughput of its processing by using more CPU resources.
Nobody has yet studied how much a CPU thread that's freed up from CPU assignments will add to the GPU PPD compared to the loss of PPD for the CPU processing.