Am I using the GPU?

It seems that a lot of GPU problems revolve around specific versions of drivers. Though AMD has their own support structure, you can often learn from information reported by others who fold.

Moderators: Site Moderators, FAHC Science Team

fa2001
Posts: 7
Joined: Sun Sep 20, 2015 9:49 pm

Am I using the GPU?

Post by fa2001 »

Hi, I think there may be a problem with the GPU folding on my system. When only the GPU slot is folding, it uses about 700% CPU on my 4-core HT CPU. It also seems quite slow; I don't have much slack before the expiration date even if running 24x7.

I have an AMD GPU and I'm running Scientific Linux 7 with the proprietary AMD drivers. How can I tell if it's using the GPU or if OpenCL is falling back on the CPU? Here are the log messages after unpausing:

21:48:18:WU00:FS01:Starting
21:48:18:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/ATI/R600/Core_17.fah/FahCore_17 -dir 00 -suffix 01 -version 704 -lifeline 4913 -checkpoint 15 -gpu 0 -gpu-vendor ati
21:48:18:WU00:FS01:Started FahCore on PID 26535
21:48:18:WU00:FS01:Core PID:26539
21:48:18:WU00:FS01:FahCore 0x17 started
21:48:19:WU00:FS01:0x17:*********************** Log Started 2015-09-20T21:48:18Z ***********************
21:48:19:WU00:FS01:0x17:Project: 10467 (Run 0, Clone 277, Gen 197)
21:48:19:WU00:FS01:0x17:Unit: 0x00000179538b3db9538bc0ae5ea9affa
21:48:19:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
21:48:19:WU00:FS01:0x17:Machine: 1
21:48:19:WU00:FS01:0x17:Digital signatures verified
21:49:09:Removing old file 'configs/config-20150915-160353.xml'
21:49:09:Saving configuration to /etc/fahclient/config.xml
21:49:09:<config>
21:49:09:  <!-- Network -->
21:49:09:  <proxy v=':8080'/>
21:49:09:
21:49:09:  <!-- User Information -->
21:49:09:  <team v='37651'/>
21:49:09:  <user v='fa2001'/>
21:49:09:
21:49:09:  <!-- Folding Slots -->
21:49:09:  <slot id='0' type='CPU'>
21:49:09:    <cpus v='3'/>
21:49:09:    <paused v='true'/>
21:49:09:  </slot>
21:49:09:  <slot id='1' type='GPU'/>
21:49:09:</config>
21:49:27:WU00:FS01:0x17:Completed 0 out of 5000000 steps (0%)
davidcoton
Posts: 1102
Joined: Wed Nov 05, 2008 3:19 pm
Location: Cambridge, UK

Re: Am I using the GPU?

Post by davidcoton »

There's not enough of the log to see what progress is being made. But the GPU core (Core17 in this bit of log) should not use more than one CPU thread. If it is, then the drivers are probably using the CPU as an OpenCL device. As to how to stop it, that is a different question -- possibly manually setting the index values for the slot (search the forum for something like "Setting GPU index").
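
One rough way to check the "no more than one CPU thread" point from the command line is to look at how much CPU the FahCore process consumes. The sketch below is Python and is not part of FAHClient; the helper names `parse_ps` and `gpu_cores_over_threshold` and the 150% threshold are assumptions chosen for illustration. It parses text in the format produced by `ps -eo pid,pcpu,comm` and flags any process whose name starts with `FahCore_` and whose CPU usage is well above one thread's worth.

```python
def parse_ps(ps_output):
    """Parse `ps -eo pid,pcpu,comm` text into (pid, cpu_percent, command) tuples."""
    rows = []
    for line in ps_output.strip().splitlines():
        parts = line.split(None, 2)
        # Skip the header row and anything that does not look like a process line.
        if len(parts) != 3 or not parts[0].isdigit():
            continue
        pid, pcpu, comm = parts
        rows.append((int(pid), float(pcpu), comm.strip()))
    return rows


def gpu_cores_over_threshold(ps_output, name_prefix="FahCore_", threshold=150.0):
    """Return FahCore processes using clearly more than one CPU thread.

    `threshold` is a hypothetical cut-off in percent: a GPU core that keeps
    one thread busy should sit near 100, so anything far above that suggests
    the work may be running on the CPU instead.
    Note that ps reports %CPU averaged over the life of the process.
    """
    return [
        (pid, cpu, comm)
        for pid, cpu, comm in parse_ps(ps_output)
        if comm.startswith(name_prefix) and cpu > threshold
    ]


# Example with made-up numbers resembling the situation described above.
sample = """\
  PID %CPU COMMAND
26539  698 FahCore_17
 4913  0.3 FAHClient
"""
for pid, cpu, comm in gpu_cores_over_threshold(sample):
    print(f"{comm} (PID {pid}) is using {cpu:.0f}% CPU")
```

To run it against a live system, pass in the real output of `ps -eo pid,pcpu,comm` instead of the sample text. A hit is only a hint that OpenCL may be running on the CPU, not proof.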
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Am I using the GPU?

Post by bruce »

When your system seems "quite slow" this is almost always a GPU problem. Your CPU slot may be running 700% (all but one of your CPUs) but there's a pretty good chance that pausing the CPU slot will remove the CPU load that you're worried about but will not fix the problem. On the other hand if you pause the GPU slot(s) the sluggishness will disappear but the 700% CPU will remain. Feel free to perform these suggested tests and decide what you want to do.

The explanation is as follows:
1) The operating system manages the CPU tasks by priority. Since FAHCore_XX runs at a very low priority, anything else that wants to use the CPU gets priority, so there is almost no sluggishness introduced.
2) GPUs have no operating system and no concept of priority. Therefore, if you interact with the keyboard or with the mouse while the GPU is busy folding, the OS will submit a request to update the screen, and that simple request will be queued behind whatever GPU tasks are already queued up -- introducing a noticeable lag.

The recommended setting for the GPU is to fold only when your system is idle, suspending folding until you're no longer interacting with the system. A lot depends on how powerful your GPU is.
fa2001
Posts: 7
Joined: Sun Sep 20, 2015 9:49 pm

Re: Am I using the GPU?

Post by fa2001 »

davidcoton wrote: There's not enough of the log to see what progress is being made. But the GPU core (Core17 in this bit of log) should not use more than one CPU thread.
This is useful info, thanks. I will try to update the drivers this evening, and if that doesn't work I'll have to look at setting the GPU index (currently at -1).
fa2001
Posts: 7
Joined: Sun Sep 20, 2015 9:49 pm

Re: Am I using the GPU?

Post by fa2001 »

Seems like I've found the problem: When running FAHClient as my user I get the GPU going, but when running as the fahclient user it just won't happen. No messages in the FAH logs or the SELinux audit.log. And my user is not a member of any special groups, so I have no idea what's going on. I did start FAHClient as my user first, after installing it, but I don't see how that could make a difference.

I've hacked the init script to start FAHClient as my user; this works fine, I guess.

[Anyway, this is great, the ETA just went from 3 days to 9 hours... I wasn't planning to run it all the time, but now I'll have no problems completing the WU on time]
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Am I using the GPU?

Post by bruce »

When you start FAHClient as various users, you will often have a different current directory. That means the working files and even the settings may change, depending mostly on how you start the client. FAHClient is designed to work continuously as a background service running as its own user. All changes that you need to make as the main user are managed through FAHControl ... or if you're a strict no-GUI devotee, through the telnet interface. Either way, Linux permissions and files in the local directory are NOT managed by your user, but rather by the service, running in a predefined world.
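
To see what actually differs between the two accounts, one option is to capture the runtime context under each user and compare the results. The following Python sketch is only an illustration and is not part of FAHClient; the function names `runtime_context` and `diff_contexts` are hypothetical, and the choice of environment variables (`HOME`, `DISPLAY`) is a guess at what might differ, not a confirmed cause.

```python
import os


def runtime_context(env_keys=("HOME", "DISPLAY")):
    """Collect the effective user, groups, working directory and selected
    environment variables of the current process."""
    ctx = {
        "euid": os.geteuid(),
        "groups": sorted(os.getgroups()),
        "cwd": os.getcwd(),
    }
    for key in env_keys:
        # Missing variables are recorded as empty strings so they still compare.
        ctx["env:" + key] = os.environ.get(key, "")
    return ctx


def diff_contexts(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


if __name__ == "__main__":
    # Print one "key=value" line per entry so two runs can be compared by eye.
    for key, value in sorted(runtime_context().items()):
        print(f"{key}={value}")
```

Running the script once as your own user and once as the service user, then comparing the two outputs, shows whether the working directory, group membership, or environment differs between the two cases.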