ETA 10 days on RX550: Project 13416 run 1140 clone 293 gen 1


GregC
Posts: 24
Joined: Wed May 20, 2020 12:36 am
Location: Houston, TX

Re: ETA 10 days on RX550: Project 13416 run 1140 clone 293 g

Post by GregC »

1) If a WU times out or expires, shouldn't that show on the WU status page, the same as when it fails?

2) I'd be interested to know whether I'm the only one with this issue: the GPU stops and fails to make any progress toward finishing the WU.
- if so, should I RMA the GPU card, or report this as a software bug to AMD or OpenMM?
- if not:
a) does it follow manufacturer, GPU generation, or model, or something else, such as OS?
b) can these (multiple) WUs be inspected for the reason why they stalled?
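
For anyone who wants to check their own log for the kind of stall described above, here is a rough sketch in Python. It assumes progress lines look like "06:45:13:WU01:FS01:0x22:Completed 20000 out of 2000000 steps (1%)"; that line format, the slot name FS01, and the function name find_stalls are assumptions for illustration, not something confirmed in this thread. It only flags long gaps between two progress lines, so a WU that never reports again will not show up until a later line exists.

```python
import re

# Assumed progress-line shape (hypothetical, adjust to your own log):
# "06:45:13:WU01:FS01:0x22:Completed 20000 out of 2000000 steps (1%)"
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:(FS\d+):0x\w+:Completed (\d+) out of (\d+) steps"
)


def to_seconds(hms):
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def find_stalls(lines, slot="FS01", max_gap_s=3600):
    """Return (previous_time, next_time, gap_seconds) for each gap between
    consecutive progress lines of one slot that exceeds max_gap_s."""
    stalls = []
    prev_hms = None
    for line in lines:
        match = PROGRESS.match(line.strip())
        if not match or match.group(2) != slot:
            continue  # skip non-progress lines and other slots (e.g. the CPU slot)
        hms = match.group(1)
        if prev_hms is not None:
            # Timestamps carry no date, so wrap at midnight; assumes gaps under 24 h.
            gap = (to_seconds(hms) - to_seconds(prev_hms)) % 86400
            if gap > max_gap_s:
                stalls.append((prev_hms, hms, gap))
        prev_hms = hms
    return stalls


sample = [
    "06:45:13:WU01:FS01:0x22:Completed 20000 out of 2000000 steps (1%)",
    "06:50:13:WU01:FS01:0x22:Completed 40000 out of 2000000 steps (2%)",
    "09:10:00:WU01:FS01:0x22:Completed 60000 out of 2000000 steps (3%)",
]
for start, end, gap in find_stalls(sample):
    print(f"No progress between {start} and {end} ({gap} s)")
```

To run it against a real log, read the file into a list of lines and pass that in place of sample; the one-hour threshold is arbitrary and would need tuning for a slow card.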
_r2w_ben
Posts: 285
Joined: Wed Apr 23, 2008 3:11 pm

Re: ETA 10 days on RX550: Project 13416 run 1140 clone 293 g

Post by _r2w_ben »

GregC wrote: I have FAHCore_A22 at below normal priority, and FAHCore_A07 at idle. Only 6 of 8 threads are assigned.
Good, that should leave enough CPU time available for FahCore_22.exe.

13416 writes to disk 10x more frequently than 13418. Is the FAH work folder on an SSD? Do you observe any CPU time used by antivirus/antimalware processes that could be causing FahCore_22.exe to pause while scans are happening?
GregC
Posts: 24
Joined: Wed May 20, 2020 12:36 am
Location: Houston, TX

Re: ETA 10 days on RX550: Project 13416 run 1140 clone 293 g

Post by GregC »

The work folder is on an SSD with plenty of free space. That computer has a fresh install of Windows with Microsoft's default AV settings, and it isn't doing much besides folding. Process Explorer/System Information shows one free CPU thread when the GPU is actively folding, and two free CPU threads when this stalled WU is sitting there.
_r2w_ben
Posts: 285
Joined: Wed Apr 23, 2008 3:11 pm

Re: ETA 10 days on RX550: Project 13416 run 1140 clone 293 g

Post by _r2w_ben »

Did you try pausing CPU folding when you had one of these slow work units?
GregC
Posts: 24
Joined: Wed May 20, 2020 12:36 am
Location: Houston, TX

Re: ETA 10 days on RX550: Project 13416 run 1140 clone 293 g

Post by GregC »

At night I run both CPU and GPU folding, and during the day GPU only. I don't think that matters, but I can't be 100% sure, so I'll try pausing CPU folding and pay attention to the result.
GregC
Posts: 24
Joined: Wed May 20, 2020 12:36 am
Location: Houston, TX

Re: ETA 10 days on RX550: Project 13416 run 1140 clone 293 g

Post by GregC »

The Ryzen 3 3100-based PC was unstable even after I replaced the AMD RX 550 with an Nvidia GTX 1660 Super. The instability showed up as an unexpected reset, once a day. I root-caused it to a high FCLK: 1800 MHz was unstable, 1766 MHz was better but still unstable, and 1733 MHz is stable.

I am now running the AMD RX 550 card in an Intel-based PC without issues.

I will try the AMD RX 550 back in the Ryzen 3 3100-based PC to confirm the issue is resolved, and will keep you posted.
GregC
Posts: 24
Joined: Wed May 20, 2020 12:36 am
Location: Houston, TX

Re: ETA 10 days on RX550: Project 13416 run 1140 clone 293 g

Post by GregC »

All of this was due to hardware instability. Case closed.