a7 core on gpu slot?


Re: a7 core on gpu slot?

Postby Joe_H » Fri Jan 26, 2018 7:51 pm

I and some of the others are interested in seeing how this happened, and possibly getting it fixed in the client or server software so this misconfiguration will be caught in the future. JimboPalmer's advice has its merits, and you have followed that approach now that the WU has completed.

As I mentioned, there have been a few reports by OS X folders in connection with GPU folding. The ones I recall were on the official reddit, and they did not include information on how the slot came to be configured that way or the configuration information from the client log. Your report on a Windows system suggests that the bug is in code common to all client platforms, or possibly at the server level.

You did mention that you set the GPU Index value to 0, which might be important in tracking this down. Normally you don't have to change it from the default value except in some cases involving multiple GPUs. In any case, the slot is identified in the configuration info as a GPU slot.

Looking at the first log in your most recent post, I see that the first WU processed on this "GPU" slot was from an A4 project. Its download would appear in an earlier log, since both it and the WU for the other slot were already on the system when this log started. It failed with an IO_Error. I can't tell whether that is related to the "GPU" slot running a CPU folding core or to something else, but the WU did go on to be processed successfully by someone else.

Then the slot downloaded the A7 WU and the processing core to use for it. For whatever reason the slot was treated as a CPU:1 for the request, so once assigned at that setting it would not use any more CPU threads.

Finally, you asked about the availability of A7 WUs. They currently appear to be in low supply, and I don't know of a specific reason for that. I could guess that older A4 projects are getting a higher priority so the research on them can be finished, or that the people working on the newer projects have the results they need for now and may be back later with more work. But those are just a couple of possible reasons, and they might not apply.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Site Admin
 
Posts: 4556
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: a7 core on gpu slot?

Postby bruce » Fri Jan 26, 2018 8:00 pm

I do know that I've seen an error message saying there are no more GPUs when I add a GPU slot. If I have 1 GPU and try to add a second GPU slot, or have 2 GPUs and try to add a third, it makes sense that the client is unable to allocate another GPU.

I have not tried adding a GPU slot to a system that reports it has 0 GPUs. I think the right solution to the problem is to issue that message rather than creating a CPU:1 slot. Is that what we're looking for?

I suppose it's possible that having no GPUs and having one Unsupported GPU might be treated differently. Can somebody do some testing for me?
bruce
 
Posts: 22698
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: a7 core on gpu slot?

Postby Joe_H » Fri Jan 26, 2018 8:12 pm

I meant to include this in my previous reply, but I tried to reproduce this setup on my MacBook Pro. With no GPU listed in the client, creation of a "GPU" slot failed with the "no available GPU" message. This was on the 7.4.16 beta client. I will have to see if I can find a system with a detected, unsupported GPU.

Re: a7 core on gpu slot?

Postby ChristianVirtual » Fri Jan 26, 2018 9:49 pm

Is this reclassification of a slot a feature or a bug, and is it documented anywhere?
Please contribute your logs to http://ppd.fahmm.net
ChristianVirtual
 
Posts: 1540
Joined: Tue May 28, 2013 12:14 pm
Location: 日本 東京

Re: a7 core on gpu slot?

Postby Joe_H » Fri Jan 26, 2018 10:54 pm

I am assuming this is a bug, but I can't tell whether it is in the client or somewhere on the server side. If you look at the log, the slot is identified as a GPU slot in the configuration section:
Code: Select all
16:07:11:  <!-- Folding Slots -->
16:07:11:  <slot id='0' type='CPU'>
16:07:11:    <paused v='true'/>
16:07:11:  </slot>
16:07:11:  <slot id='1' type='GPU'>
16:07:11:    <gpu-index v='0'/>
16:07:11:    <paused v='true'/>
16:07:11:  </slot>


After it returns the failed A4 WU on the "GPU" slot, this is what the assignment and download look like:
Code: Select all
16:08:30:WU02:FS01:Connecting to 171.67.108.45:80
16:08:31:WU02:FS01:Assigned to work server 155.247.166.219
16:08:31:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GT216 [Quadro FX 880M] from 155.247.166.219
16:08:31:WU02:FS01:Connecting to 155.247.166.219:8080
16:08:32:WU02:FS01:Downloading 142.01KiB
16:08:32:WU02:FS01:Download complete
16:08:32:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:13749 run:112 clone:18 gen:28 core:0xa7 unit:0x0000001d0002894b59d54c4cf186156b
16:08:32:WU02:FS01:Downloading core from http://fahwebx.stanford.edu/cores/Win32/AMD64/Core_a7.fah
16:08:32:WU02:FS01:Connecting to fahwebx.stanford.edu:80
16:08:33:WU02:FS01:FahCore a7: Downloading 6.63MiB
16:08:35:WU00:FS00:0xa4:Mapping NT from 6 to 6
16:08:35:WU00:FS00:0xa4:Completed 0 out of 250000 steps  (0%)
16:08:38:WU02:FS01:FahCore a7: Download complete
16:08:39:WU02:FS01:Valid core signature
16:08:39:WU02:FS01:Unpacked 18.35MiB to cores/fahwebx.stanford.edu/cores/Win32/AMD64/Core_a7.fah/FahCore_a7.exe
16:08:41:WU02:FS01:Unpacked 2.64MiB to cores/fahwebx.stanford.edu/cores/Win32/AMD64/Core_a7.fah/libfftw3f-3.dll
16:08:41:WU02:FS01:Starting
16:08:41:WU02:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/Nick/AppData/Roaming/FAHClient/cores/fahwebx.stanford.edu/cores/Win32/AMD64/Core_a7.fah/FahCore_a7.exe -dir 02 -suffix 01 -version 704 -lifeline 9244 -checkpoint 15 -gpu 0
16:08:41:WU02:FS01:Started FahCore on PID 9636
16:08:41:WU02:FS01:Core PID:9080
16:08:41:WU02:FS01:FahCore 0xa7 started
16:08:42:WU02:FS01:0xa7:*********************** Log Started 2018-01-25T16:08:41Z ***********************
16:08:42:WU02:FS01:0xa7:************************** Gromacs Folding@home Core ***************************
16:08:42:WU02:FS01:0xa7:       Type: 0xa7
16:08:42:WU02:FS01:0xa7:       Core: Gromacs
16:08:42:WU02:FS01:0xa7:    Website: http://folding.stanford.edu/
16:08:42:WU02:FS01:0xa7:  Copyright: (c) 2009-2016 Stanford University
16:08:42:WU02:FS01:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:08:42:WU02:FS01:0xa7:       Args: -dir 02 -suffix 01 -version 704 -lifeline 9636 -checkpoint 15 -gpu
16:08:42:WU02:FS01:0xa7:             0
16:08:42:WU02:FS01:0xa7:     Config: <none>
16:08:42:WU02:FS01:0xa7:************************************ Build *************************************
16:08:42:WU02:FS01:0xa7:    Version: 0.0.16
16:08:42:WU02:FS01:0xa7:       Date: Oct 31 2017
16:08:42:WU02:FS01:0xa7:       Time: 14:04:33
16:08:42:WU02:FS01:0xa7: Repository: Git
16:08:42:WU02:FS01:0xa7:   Revision: 2f0a8a3d0b0698be48154fe99a0216f289060932
16:08:42:WU02:FS01:0xa7:     Branch: master
16:08:42:WU02:FS01:0xa7:   Compiler: Visual C++ 2008
16:08:42:WU02:FS01:0xa7:    Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
16:08:42:WU02:FS01:0xa7:   Platform: win32 10
16:08:42:WU02:FS01:0xa7:       Bits: 64
16:08:42:WU02:FS01:0xa7:       Mode: Release
16:08:42:WU02:FS01:0xa7:       SIMD: sse2
16:08:42:WU02:FS01:0xa7:************************************ System ************************************
16:08:42:WU02:FS01:0xa7:        CPU: Unknown
16:08:42:WU02:FS01:0xa7:     CPU ID:
16:08:42:WU02:FS01:0xa7:       CPUs: 8
16:08:42:WU02:FS01:0xa7:     Memory: 11.93GiB
16:08:42:WU02:FS01:0xa7:Free Memory: 8.67GiB
16:08:42:WU02:FS01:0xa7:    Threads: WINDOWS_THREADS
16:08:42:WU02:FS01:0xa7: OS Version: 6.2
16:08:42:WU02:FS01:0xa7:Has Battery: true
16:08:42:WU02:FS01:0xa7: On Battery: false
16:08:42:WU02:FS01:0xa7: UTC Offset: 1
16:08:42:WU02:FS01:0xa7:        PID: 9080
16:08:42:WU02:FS01:0xa7:        CWD: C:\Users\Nick\AppData\Roaming\FAHClient\work
16:08:42:WU02:FS01:0xa7:         OS: Windows 10 Pro
16:08:42:WU02:FS01:0xa7:    OS Arch: AMD64
16:08:42:WU02:FS01:0xa7:********************************************************************************
16:08:42:WU02:FS01:0xa7:Project: 13749 (Run 112, Clone 18, Gen 28)
16:08:42:WU02:FS01:0xa7:Unit: 0x0000001d0002894b59d54c4cf186156b
16:08:42:WU02:FS01:0xa7:Reading tar file core.xml
16:08:42:WU02:FS01:0xa7:Reading tar file frame28.tpr
16:08:42:WU02:FS01:0xa7:Digital signatures verified
16:08:42:WU02:FS01:0xa7:Calling: mdrun -s frame28.tpr -o frame28.trr -cpt 15 -nt 1

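The assignment server (AS) acts on it as a CPU:1 request and connects the client to the work server (WS) at 155.247.166.219 to get a WU. Note that the request line still reports the slot as gpu:0, yet the core that runs is the Gromacs a7 core with "-nt 1".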
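
For anyone who wants to check their own logs for the same symptom, here is a minimal Python sketch. It is not part of the client. It assumes the log contains the slot configuration lines and the "FahCore 0x.. started" lines in the form shown above, and it assumes that 0xa4 and 0xa7 are the only CPU cores of interest (those are the only two that appear in this thread). The function name and sample text are made up for illustration.

```python
import re

# Assumption: these core types are CPU (Gromacs) cores. Only 0xa4 and 0xa7
# appear in this thread; extend the set if other CPU cores matter to you.
CPU_CORES = {"0xa4", "0xa7"}

# Matches the slot configuration lines, e.g. "<slot id='1' type='GPU'>".
SLOT_RE = re.compile(r"<slot id='(\d+)' type='(\w+)'")
# Matches core start lines, e.g. ":WU02:FS01:FahCore 0xa7 started".
CORE_RE = re.compile(r":WU\d+:FS(\d+):FahCore (0x[0-9a-f]+) started")


def find_mismatched_slots(log_text):
    """Return (slot_id, core) pairs where a GPU-typed slot started a CPU core."""
    slot_types = {}
    mismatches = []
    for line in log_text.splitlines():
        slot = SLOT_RE.search(line)
        if slot:
            slot_types[int(slot.group(1))] = slot.group(2)
            continue
        core = CORE_RE.search(line)
        if core:
            slot_id, core_type = int(core.group(1)), core.group(2)
            if slot_types.get(slot_id) == "GPU" and core_type in CPU_CORES:
                mismatches.append((slot_id, core_type))
    return mismatches


# Three lines taken from the log above.
SAMPLE_LOG = """\
16:07:11:  <slot id='0' type='CPU'>
16:07:11:  <slot id='1' type='GPU'>
16:08:41:WU02:FS01:FahCore 0xa7 started
"""

print(find_mismatched_slots(SAMPLE_LOG))  # [(1, '0xa7')]
```

A non-empty result means a slot configured as GPU ended up running a CPU core, which is the situation reported in this thread.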

Re: a7 core on gpu slot?

Postby bruce » Sat Jan 27, 2018 5:17 pm

When a new slot is created, whether it's called a GPU slot or a CPU slot, there are a number of critical steps. First, a GPU must be found and identified. Clearly in this case, that step will fail. At that point, the process of creating a slot should abort with the "no available GPU" message but that's not happening. [Bug: proper error processing not happening]

Instead, it proceeds to generate a new slot, and the only possible type of slot that can be generated is a CPU slot since that's the only folding hardware that's present. It's not surprising that it doesn't go back and correct the slot description (which wouldn't be visible if the slot-creation process had aborted).
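
To make the expected behaviour concrete, here is a small Python sketch of the slot-creation check described above. This is only an illustration of the logic, not the client's actual code: the function, the exception class, and the shape of the GPU list are all hypothetical. The point is that a GPU request with no usable GPU should stop with the "no available GPU" message instead of falling through to a CPU slot.

```python
class NoAvailableGPU(Exception):
    """Raised when a GPU slot is requested but no usable GPU is free."""


def create_slot(slot_type, detected_gpus, used_gpu_indexes):
    """Return a slot description, or raise instead of falling back to CPU.

    detected_gpus: list of dicts such as {"index": 0, "supported": False}
                   (a hypothetical structure for this sketch).
    used_gpu_indexes: set of GPU indexes already claimed by other slots.
    """
    if slot_type == "CPU":
        return {"type": "CPU"}
    if slot_type != "GPU":
        raise ValueError(f"unknown slot type: {slot_type}")
    for gpu in detected_gpus:
        # A detected but unsupported GPU must not count as available.
        if gpu["supported"] and gpu["index"] not in used_gpu_indexes:
            return {"type": "GPU", "gpu-index": gpu["index"]}
    # Abort here rather than silently creating a CPU slot.
    raise NoAvailableGPU("no available GPU")


# One detected, unsupported GPU, as with the old Quadro in this thread.
try:
    create_slot("GPU", [{"index": 0, "supported": False}], set())
except NoAvailableGPU as err:
    print(f"slot not created: {err}")
```

With a check like this, the three cases (no GPU, one unsupported GPU, all GPUs already in use) all end the same way, with no slot created.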

Re: a7 core on gpu slot?

Postby nickjd » Sat Jan 27, 2018 5:50 pm

Ah, thanks for figuring it out. I had no idea it would do that. I just assumed that it simply wouldn't work if the GPU wasn't supported. When I tried these GPUs again, I didn't realize that they weren't supported until later since the GPUs.txt had them listed in the file.
nickjd
 
Posts: 45
Joined: Mon Oct 20, 2008 2:05 pm

Re: a7 core on gpu slot?

Postby bruce » Sat Jan 27, 2018 6:11 pm

https://github.com/FoldingAtHome/fah-issues/issues/1220

GPUs that were once supported (with a now-deprecated FAHCore) were changed to UNSUPPORTED (GPU subtype=0). I don't remember why they weren't just deleted from GPUs.txt.

Re: a7 core on gpu slot?

Postby nickjd » Sat Jan 27, 2018 7:19 pm

If there is something that you would like me to test to help troubleshoot any of these errors, please let me know. I've been contributing (slowly but surely) to the community on all my computers (work and personal) basically 24/7 since late 2005, and I wish there was some other way I could help. I imagine my electricity costs for the cause over the last decade have been thousands of dollars though :)

I'm sorry I had to be the one to find a problem but I'm glad you guys are on the way to figuring out what caused it.

One thing that I have tested is that on my tablet with some crappy Intel GPU, if I add a GPU slot in the same way as on the other two computers that had the problem (GPU index set to 0), unlike those computers it actually gives an error (GPU not found). It doesn't behave in the same way on the other computers with those old Quadro GPUs.

Re: a7 core on gpu slot?

Postby bruce » Sat Jan 27, 2018 9:30 pm

In the top section of the log, does that system say
> GPUS:1
> ...UNSUPPORTED...
or does it say
> GPUS:0
?

Re: a7 core on gpu slot?

Postby nickjd » Sat Jan 27, 2018 9:56 pm

GPUS:0
This must be the difference.

Code: Select all
*********************** Log Started 2018-01-09T17:55:15Z ***********************
17:55:15:************************* Folding@home Client *************************
17:55:15:      Website: http://folding.stanford.edu/
17:55:15:    Copyright: (c) 2009-2014 Stanford University
17:55:15:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:55:15:         Args: --open-web-control
17:55:15:       Config: C:/Users/nditt/AppData/Roaming/FAHClient/config.xml
17:55:15:******************************** Build ********************************
17:55:15:      Version: 7.4.4
17:55:15:         Date: Mar 4 2014
17:55:15:         Time: 20:26:54
17:55:15:      SVN Rev: 4130
17:55:15:       Branch: fah/trunk/client
17:55:15:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
17:55:15:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
17:55:15:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
17:55:15:     Platform: win32 XP
17:55:15:         Bits: 32
17:55:15:         Mode: Release
17:55:15:******************************* System ********************************
17:55:15:          CPU: Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz
17:55:15:       CPU ID: GenuineIntel Family 6 Model 78 Stepping 3
17:55:15:         CPUs: 4
17:55:15:       Memory: 7.89GiB
17:55:15:  Free Memory: 3.77GiB
17:55:15:      Threads: WINDOWS_THREADS
17:55:15:   OS Version: 6.2
17:55:15:  Has Battery: true
17:55:15:   On Battery: false
17:55:15:   UTC Offset: 1
17:55:15:          PID: 13868
17:55:15:          CWD: C:/Users/nditt/AppData/Roaming/FAHClient
17:55:15:           OS: Windows 10 Pro
17:55:15:      OS Arch: AMD64
17:55:15:         GPUs: 0
17:55:15:         CUDA: Not detected
17:55:15:Win32 Service: false
17:55:15:***********************************************************************

Re: a7 core on gpu slot?

Postby Joe_H » Sat Jan 27, 2018 10:00 pm

nickjd wrote: I'm sorry I had to be the one to find a problem but I'm glad you guys are on the way to figuring out what caused it.

Actually, as I mentioned, you are not the first to report the issue. But you have contributed more by providing enough detail to possibly get this fixed at some point.
