Am I using the GPU?

Posted: Sun Sep 20, 2015 9:56 pm
by fa2001
Hi, I think there may be a problem with GPU folding on my system. When only the GPU slot is folding, it uses about 700% CPU on my 4-core HT CPU. It also seems quite slow: I don't have much slack before the expiration date, even when running 24x7.

I have an AMD GPU and I'm running Scientific Linux 7 with the proprietary AMD drivers. How can I tell whether it's using the GPU or whether OpenCL is falling back on the CPU? Here are the log messages after unpausing:

21:48:18:WU00:FS01:Starting
21:48:18:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/web.stanford.edu/~pande/Linux/AMD64/ATI/R600/Core_17.fah/FahCore_17 -dir 00 -suffix 01 -version 704 -lifeline 4913 -checkpoint 15 -gpu 0 -gpu-vendor ati
21:48:18:WU00:FS01:Started FahCore on PID 26535
21:48:18:WU00:FS01:Core PID:26539
21:48:18:WU00:FS01:FahCore 0x17 started
21:48:19:WU00:FS01:0x17:*********************** Log Started 2015-09-20T21:48:18Z ***********************
21:48:19:WU00:FS01:0x17:Project: 10467 (Run 0, Clone 277, Gen 197)
21:48:19:WU00:FS01:0x17:Unit: 0x00000179538b3db9538bc0ae5ea9affa
21:48:19:WU00:FS01:0x17:CPU: 0x00000000000000000000000000000000
21:48:19:WU00:FS01:0x17:Machine: 1
21:48:19:WU00:FS01:0x17:Digital signatures verified
21:49:09:Removing old file 'configs/config-20150915-160353.xml'
21:49:09:Saving configuration to /etc/fahclient/config.xml
21:49:09:<config>
21:49:09:  <!-- Network -->
21:49:09:  <proxy v=':8080'/>
21:49:09:
21:49:09:  <!-- User Information -->
21:49:09:  <team v='37651'/>
21:49:09:  <user v='fa2001'/>
21:49:09:
21:49:09:  <!-- Folding Slots -->
21:49:09:  <slot id='0' type='CPU'>
21:49:09:    <cpus v='3'/>
21:49:09:    <paused v='true'/>
21:49:09:  </slot>
21:49:09:  <slot id='1' type='GPU'/>
21:49:09:</config>
21:49:27:WU00:FS01:0x17:Completed 0 out of 5000000 steps (0%)

Re: Am I using the GPU?

Posted: Sun Sep 20, 2015 11:38 pm
by davidcoton
There's not enough of the log to see what progress is being made. But the GPU core (Core17 in this bit of log) should not use more than one CPU thread. If it is, then the drivers are probably using the CPU as an OpenCL device. As to how to stop it, that is a different question -- possibly manually setting the index values for the slot (search the forum for something like "Setting GPU index").
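
One way to check whether the core really is using more than one thread's worth of CPU is to sample its CPU time from /proc. The following is only a minimal Python sketch, not part of FAH: the function names and the 150% threshold are assumptions made for illustration, and the PID to pass in would be the "Core PID" reported in the log (26539 in the log above).

```python
import os
import time


def cpu_ticks_from_stat(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before tokenising the remaining fields.
    fields = stat_line.rpartition(")")[2].split()
    # fields[0] is the process state (field 3 in proc(5)); utime and stime
    # are fields 14 and 15, which are indexes 11 and 12 in this list.
    return int(fields[11]) + int(fields[12])


def cpu_percent(pid, interval=5.0):
    """Average CPU usage of a process over `interval` seconds (100 = one thread)."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")

    def read_ticks():
        with open("/proc/%d/stat" % pid) as f:
            return cpu_ticks_from_stat(f.read())

    before = read_ticks()
    time.sleep(interval)
    after = read_ticks()
    return 100.0 * (after - before) / ticks_per_second / interval


def looks_like_cpu_fallback(percent, threshold=150.0):
    """Heuristic (assumed threshold): clearly more than one busy thread."""
    return percent > threshold


# Example use, with the Core PID taken from the log:
#   usage = cpu_percent(26539)
#   print(usage, looks_like_cpu_fallback(usage))
```

A reading near 100% or below would be consistent with a GPU core using a single CPU thread; a reading of several hundred percent would point towards the CPU doing the OpenCL work.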

Re: Am I using the GPU?

Posted: Mon Sep 21, 2015 4:18 am
by bruce
When your system seems "quite slow", this is almost always a GPU problem. Your CPU slot may be running at 700% (all but one of your CPUs), but there's a pretty good chance that pausing the CPU slot will remove the CPU load that you're worried about without fixing the problem. On the other hand, if you pause the GPU slot(s), the sluggishness will disappear but the 700% CPU will remain. Feel free to perform these tests and decide what you want to do.

The explanation is as follows:
1) The operating system manages the CPU tasks by priority. Since FAHCore_XX runs at a very low priority, anything else that wants to use the CPU gets priority, so there is almost no sluggishness introduced.
2) GPUs have no operating system and no concept of priority. Therefore, if you interact with the keyboard or the mouse while the GPU is busy folding, the OS will submit a request to update the screen, and that simple request will be queued behind whatever GPU tasks are already queued up -- introducing a noticeable lag.

The recommended setting for the GPU is to fold only when your system is idle, suspending folding until you're no longer interacting with the system. A lot depends on how powerful your GPU is.
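
The CPU priority mechanism in point 1 can be illustrated with Unix niceness. This Python sketch only shows the general idea and is not how FAHClient actually launches its cores; `run_at_low_priority` is a hypothetical helper, and the increment of 19 (the lowest priority on Linux) is an assumption.

```python
import os
import subprocess
import sys


def run_at_low_priority(cmd, increment=19):
    """Run cmd with its niceness raised by `increment` and return its stdout."""
    result = subprocess.run(
        cmd,
        # Runs in the child just before exec; raising niceness needs no privileges.
        preexec_fn=lambda: os.nice(increment),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # The child simply reports its own niceness; os.nice(0) reads it unchanged.
    child = [sys.executable, "-c", "import os; print(os.nice(0))"]
    print("parent niceness:", os.nice(0))
    print("child niceness: ", run_at_low_priority(child).strip())
```

A process started this way only gets CPU time that nothing else wants, which is why a low-priority CPU slot rarely makes the desktop feel sluggish. There is no equivalent knob in this sketch for work queued on the GPU.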

Re: Am I using the GPU?

Posted: Mon Sep 21, 2015 6:19 am
by fa2001
davidcoton wrote: There's not enough of the log to see what progress is being made. But the GPU core (Core17 in this bit of log) should not use more than one CPU thread.
This is useful info, thanks. I will try to update the drivers this evening, and if that doesn't work I'll have to look at setting the GPU index (currently at -1).

Re: Am I using the GPU?

Posted: Mon Sep 21, 2015 8:03 pm
by fa2001
Seems like I've found the problem: when running FAHClient as my user I get the GPU going, but when running as the fahclient user it just won't happen. There are no messages in the FAH logs or the SELinux audit.log. And my user is not a member of any special groups, so I have no idea what's going on. I did start FAHClient as my user first, after installing it, but I don't see how that could make a difference.

I've hacked the init script to start FAHClient as my user; this works fine, I guess.

[Anyway, this is great, the ETA just went from 3 days to 9 hours... I wasn't planning to run it all the time, but now I'll have no problems completing the WU on time]

Re: Am I using the GPU?

Posted: Tue Sep 22, 2015 12:27 am
by bruce
When you start FAHClient as various users, you will often have a different current directory. That means the working files and even the settings may change, depending mostly on how you start the client. FAHClient is designed to work continuously as a background service running as its own user. All changes that you need to make as the main user are managed through FAHControl ... or if you're a strict no-GUI devotee, through the telnet interface. Either way, Linux permissions and files in the local directory are NOT managed by your user, but rather by the service, running in a predefined world.
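
To see how that "predefined world" differs between two accounts, it can help to print the identity, groups, current directory and file access of a process started as each user. The Python sketch below is a hypothetical diagnostic, not an FAH tool: `describe_environment` is an invented name, and the default paths are simply the ones that appear in the log earlier in this thread. Any device nodes worth checking depend on the driver and would have to be passed on the command line.

```python
import grp
import os
import pwd
import sys


def _name_or_id(lookup, ident, attr):
    """Resolve a numeric ID to a name, falling back to the number itself."""
    try:
        return getattr(lookup(ident), attr)
    except KeyError:
        return str(ident)


def describe_environment(paths=()):
    """Collect the identity, working directory and path access of this process."""
    return {
        "user": _name_or_id(pwd.getpwuid, os.geteuid(), "pw_name"),
        "groups": sorted(
            _name_or_id(grp.getgrgid, gid, "gr_name") for gid in os.getgroups()
        ),
        "cwd": os.getcwd(),
        # True only if the path exists and is both readable and writable.
        "access": {p: os.access(p, os.R_OK | os.W_OK) for p in paths},
    }


if __name__ == "__main__":
    # Paths from the log above; pass others as arguments to override them.
    paths = sys.argv[1:] or ["/etc/fahclient/config.xml", "/var/lib/fahclient"]
    for key, value in describe_environment(paths).items():
        print(key, "=", value)
```

Running the script once as your own user and once as the service account (for example with `sudo -u fahclient`) and comparing the two outputs shows which of these differ between the working and non-working cases.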