are GPU WUs universal for NVidia & AMD?

Moderators: Site Moderators, FAHC Science Team

alxbelu
Posts: 109
Joined: Sat Mar 14, 2020 6:28 pm

Re: are GPU WUs universal for NVidia & AMD?

Post by alxbelu »

Ah, thanks for clarifying!
Official F@H Twitter (frequently updated): https://twitter.com/foldingathome
Official F@H Facebook: https://www.facebook.com/Foldinghome-136059519794607/

(I'm not affiliated with the F@H Team, just promoting these channels for official updates)
astrorob
Posts: 43
Joined: Sun Mar 15, 2020 7:59 pm

Re: are GPU WUs universal for NVidia & AMD?

Post by astrorob »

looks like since it started up again, i've had one BAD_WORK_UNIT on the AMD gpu but 2-3 good WUs. the BAD_WORK_UNIT error happened before any actual computation was launched (as far as i can tell) so maybe it was a corrupted download or something.
MrFrizzy
Posts: 123
Joined: Fri Feb 14, 2020 4:48 am

Re: are GPU WUs universal for NVidia & AMD?

Post by MrFrizzy »

astrorob wrote:looks like since it started up again, i've had one BAD_WORK_UNIT on the AMD gpu but 2-3 good WUs. the BAD_WORK_UNIT error happened before any actual computation was launched (as far as i can tell) so maybe it was a corrupted download or something.
If you check the logs, you will likely see an error message about "Error invoking kernel sortShortList: clEnqueueNDRangeKernel (-5)". That is the error that has been hitting a lot of GCN based cards lately and it always hits immediately before computation really begins. If you don't see that message, you'll have to bump up the log verbosity to 3 by going into Configure - Advanced - Verbosity (at the bottom).
S1: AMD R5 3600 & Sapphire RX 5700 XT Reference @2.1GHz under water
S2: Intel Xeon E5-2620v3 & MSI GTX 1650

RX 5700 XT Project & PPD Tracking Spreadsheet

bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: are GPU WUs universal for NVidia & AMD?

Post by bruce »

The problem with "Error invoking kernel sortShortList: clEnqueueNDRangeKernel (-5)" has been identified. It's a resource allocation problem which has only appeared with proteins of certain sizes. Those protein studies won't be delivered to AMD GPUs until a more complete solution can be distributed.
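For anyone wondering what the (-5) means: in the standard OpenCL header (CL/cl.h) that status code is CL_OUT_OF_RESOURCES, which fits the resource allocation explanation. A minimal sketch for decoding the code from the error text follows. The function name `decode_opencl_error` is made up for this example and is not part of FAHClient or the core.

```python
import re

# Low-numbered OpenCL runtime status codes as defined in the Khronos CL/cl.h header.
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

# Matches "<OpenCL call> (<code>)" as it appears in the error text quoted above.
_CALL_AND_CODE = re.compile(r"(cl[A-Za-z]+) \((-?\d+)\)")


def decode_opencl_error(message):
    """Hypothetical helper: return (api_call, code, symbolic_name), or None."""
    match = _CALL_AND_CODE.search(message)
    if match is None:
        return None
    code = int(match.group(2))
    # Codes outside the small table above are reported as UNKNOWN.
    return match.group(1), code, OPENCL_ERRORS.get(code, "UNKNOWN")


msg = "Error invoking kernel sortShortList: clEnqueueNDRangeKernel (-5)"
print(decode_opencl_error(msg))
# -> ('clEnqueueNDRangeKernel', -5, 'CL_OUT_OF_RESOURCES')
```

The table only covers the first few codes; anything else comes back as UNKNOWN rather than being guessed.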
IkkeDus
Posts: 14
Joined: Wed Jun 18, 2008 10:42 am
Hardware configuration: Q9550 @ 2.8 GHz
WIN10 x64
2x Radeon R9 280X-3GB
1x Radeon R9 7950-3GB
Location: Amsterdam, The Netherlands

Re: are GPU WUs universal for NVidia & AMD?

Post by IkkeDus »

I still see these errors on my AMD R9 280X-3GB
Project 11776

14:43:44:WU02:FS02:0x22:*********************** Log Started 2020-03-30T14:43:43Z ***********************
14:43:44:WU02:FS02:0x22:*************************** Core22 Folding@home Core ***************************
14:43:44:WU02:FS02:0x22:       Type: 0x22
14:43:44:WU02:FS02:0x22:       Core: Core22
14:43:44:WU02:FS02:0x22:    Website: https://foldingathome.org/
14:43:44:WU02:FS02:0x22:  Copyright: (c) 2009-2018 foldingathome.org
14:43:44:WU02:FS02:0x22:     Author: John Chodera <john.chodera@choderalab.org> and Rafal Wiewiora
14:43:44:WU02:FS02:0x22:             <rafal.wiewiora@choderalab.org>
14:43:44:WU02:FS02:0x22:       Args: -dir 02 -suffix 01 -version 705 -lifeline 8512 -checkpoint 15
14:43:44:WU02:FS02:0x22:             -gpu-vendor amd -opencl-platform 0 -opencl-device 2 -gpu 2
14:43:44:WU02:FS02:0x22:     Config: <none>
14:43:44:WU02:FS02:0x22:************************************ Build *************************************
14:43:44:WU02:FS02:0x22:    Version: 0.0.2
14:43:44:WU02:FS02:0x22:       Date: Dec 6 2019
14:43:44:WU02:FS02:0x22:       Time: 21:30:31
14:43:44:WU02:FS02:0x22: Repository: Git
14:43:44:WU02:FS02:0x22:   Revision: abeb39247cc72df5af0f63723edafadb23d5dfbe
14:43:44:WU02:FS02:0x22:     Branch: HEAD
14:43:44:WU02:FS02:0x22:   Compiler: Visual C++ 2008
14:43:44:WU02:FS02:0x22:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
14:43:44:WU02:FS02:0x22:   Platform: win32 10
14:43:44:WU02:FS02:0x22:       Bits: 64
14:43:44:WU02:FS02:0x22:       Mode: Release
14:43:44:WU02:FS02:0x22:************************************ System ************************************
14:43:44:WU02:FS02:0x22:        CPU: Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz
14:43:44:WU02:FS02:0x22:     CPU ID: GenuineIntel Family 6 Model 23 Stepping 10
14:43:44:WU02:FS02:0x22:       CPUs: 4
14:43:44:WU02:FS02:0x22:     Memory: 4.00GiB
14:43:44:WU02:FS02:0x22:Free Memory: 2.01GiB
14:43:44:WU02:FS02:0x22:    Threads: WINDOWS_THREADS
14:43:44:WU02:FS02:0x22: OS Version: 6.2
14:43:44:WU02:FS02:0x22:Has Battery: false
14:43:44:WU02:FS02:0x22: On Battery: false
14:43:44:WU02:FS02:0x22: UTC Offset: 2
14:43:44:WU02:FS02:0x22:        PID: 948
14:43:44:WU02:FS02:0x22:        CWD: C:\Users\\AppData\Roaming\FAHClient\work
14:43:44:WU02:FS02:0x22:         OS: Windows 10 Pro
14:43:44:WU02:FS02:0x22:    OS Arch: AMD64
14:43:44:WU02:FS02:0x22:********************************************************************************
14:43:44:WU02:FS02:0x22:Project: 11776 (Run 0, Clone 12075, Gen 5)
14:43:44:WU02:FS02:0x22:Unit: 0x00000011287234c95e743340b7368a72
14:43:44:WU02:FS02:0x22:Reading tar file core.xml
14:43:44:WU02:FS02:0x22:Reading tar file integrator.xml
14:43:44:WU02:FS02:0x22:Reading tar file state.xml
14:43:45:WU02:FS02:0x22:Reading tar file system.xml
14:43:47:WU02:FS02:0x22:Digital signatures verified
14:43:47:WU02:FS02:0x22:Folding@home GPU Core22 Folding@home Core
14:43:47:WU02:FS02:0x22:Version 0.0.2
14:44:17:WU02:FS02:0x22:ERROR:exception: Error invoking kernel sortShortList: clEnqueueNDRangeKernel (-5)
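If you have a lot of log to go through, something like the sketch below can pull out the project number, core version and any ERROR lines. It assumes every line has the `HH:MM:SS:WUnn:FSnn:0xNN:` prefix seen in the log above; the function name `summarize` and the returned dictionary keys are hypothetical, chosen only for this example.

```python
import re

# Prefix layout taken from the log above: time, work unit, folding slot, core type.
LINE = re.compile(r"^(\d\d:\d\d:\d\d):WU(\d+):FS(\d+):0x([0-9a-fA-F]+):(.*)$")
PROJECT = re.compile(r"^Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
VERSION = re.compile(r"^\s*Version:? (\d+(?:\.\d+)+)")


def summarize(lines):
    """Hypothetical helper: collect project, core version and errors from core log lines."""
    summary = {"project": None, "core_version": None, "errors": []}
    for raw in lines:
        match = LINE.match(raw.strip())
        if match is None:
            continue  # not a core log line (signature text, blank line, etc.)
        timestamp, _wu, slot, _core, text = match.groups()
        project = PROJECT.match(text)
        version = VERSION.match(text)
        if project:
            summary["project"] = int(project.group(1))
        elif version:
            summary["core_version"] = version.group(1)
        elif text.startswith("ERROR:"):
            summary["errors"].append((timestamp, int(slot), text[len("ERROR:"):]))
    return summary


sample = [
    "14:43:44:WU02:FS02:0x22:    Version: 0.0.2",
    "14:43:44:WU02:FS02:0x22:Project: 11776 (Run 0, Clone 12075, Gen 5)",
    "14:44:17:WU02:FS02:0x22:ERROR:exception: Error invoking kernel "
    "sortShortList: clEnqueueNDRangeKernel (-5)",
]
print(summarize(sample))
```

Lines that don't match the prefix are skipped, so pasting a whole log file in should be harmless.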
Q9550 @ 2.8 GHz | 2x R9 280X-3GB | HD 7950-3GB | Win10 x64

Joe_H
Site Admin
Posts: 7870
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: are GPU WUs universal for NVidia & AMD?

Post by Joe_H »

IkkeDus wrote:I still see these errors on my AMD R9 280X-3GB
Project 11776
Reported to manager for this project.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
wuchzael
Posts: 8
Joined: Sun Mar 22, 2020 1:31 pm

Re: are GPU WUs universal for NVidia & AMD?

Post by wuchzael »

It's kinda stupid that Nvidia GPUs are favoured so heavily. In fact my undervolted Vega 64 crushes some of the bigger WUs faster than my 2060S and massively faster than my GTX 970. But the Vega gets the small WUs with low credits most of the time while even the slow 970 always gets the big ones. That's just not efficient...
Joe_H
Site Admin
Posts: 7870
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: are GPU WUs universal for NVidia & AMD?

Post by Joe_H »

There are a lot of small WUs out at the moment in connection with F@H's participation in the COVID Moonshot project. Your AMD card is not the only one getting these; there are plenty of comments posted here by nVidia card users about the impact these have had on their credits.

As part of this they are testing to see what works best on what, and possibly to manage the assignments better. But there are limits with the current assignment software on the servers as to how far they can go in that direction. With so many WUs involved, higher end cards will still get these WUs at times.

There was also an issue with some projects with atom counts within certain ranges. Any GCN based GPU, and that includes your Vega 64, would fail with a "sortShortList" error, so for a time assignment of those projects was restricted to nVidia and AMD Navi based GPUs. The latest release of Core_22, version 0.0.10, is supposed to fix that, and the assignment restrictions should have been removed.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: are GPU WUs universal for NVidia & AMD?

Post by bruce »

The clEnqueueNDRangeKernel (-5) error is a known problem which has been fixed in the latest version of FAHCore_22. I'm not sure which version is being distributed, though. You're running v0.0.2, which is pretty old. FAH should update the core automatically if you happen to be assigned a WU that requires a new version.

I would force an update to FAHCore_22 and see what you get.
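One thing to watch when comparing core versions by eye or in a script: the log earlier in the thread shows 0.0.2, and 0.0.10 was named as the release with the fix. Compared as plain text, "0.0.2" sorts after "0.0.10", so the numbers have to be compared part by part. A rough sketch, with the helper names `parse_version` and `needs_update` invented for this example:

```python
def parse_version(text):
    """Split a dotted version such as '0.0.10' into a tuple of ints: (0, 0, 10)."""
    return tuple(int(part) for part in text.split("."))


def needs_update(installed, fixed_in):
    """Hypothetical helper: True when the installed core predates the fixed release."""
    return parse_version(installed) < parse_version(fixed_in)


print(needs_update("0.0.2", "0.0.10"))  # True: 2 < 10 when compared as numbers
print("0.0.2" < "0.0.10")               # False: as strings, '2' sorts after '1'
```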