NV 760&470 LinuxMint14 Bad PlatformID Size

It seems that a lot of GPU problems revolve around specific versions of drivers. Though NVidia has their own support structure, you can often learn from information reported by others who fold.

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by davidcoton » Sat Nov 30, 2013 11:28 pm

I run Ubuntu, not Mint, and only a single GPU, but here are some thoughts:

1) Core17 needs a CPU to feed it. You have all 16 CPUs allocated to the CPU slot. Try running the CPU slot without GPUs and get that working first. Then reduce the CPUs allocated to the CPU slot (in advanced control, configure|slots|cpu|edit and set CPUs allocated to 12). Then get the GPU slots working, one at a time. Finally, you could set up a second CPU slot to use the last two CPUs, or you could leave them for non-folding use. [The number of CPUs allocated to a slot must have no prime factors above 5, or some WUs will fail. So 16 is good, but it leaves no CPU free for Core17 to run without interference. 15 will work with no more than one Core17 GPU. 14 is bad (factor 7), 13 awful, 12 good.] If this fails, try running each GPU singly, with CPU folding paused.

2) 86C is high, but not alarming for a GPU (usually safe to 100C -- but most folders prefer cooler), so that is not likely the cause of failure. I would recommend improved cooling, but there are probably other more serious issues.

3) What PSU do you have? Can it supply enough current on the 12V lines for 2 GPUs? Or even one??
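
The rule of thumb in point 1 (a CPU count should have no prime factor larger than 5) can be checked with a few lines of Python. This is only a sketch of the arithmetic as stated above, not code from the FAH client; the function names `largest_prime_factor` and `is_good_cpu_count` are made up for illustration.

```python
def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n (n must be at least 2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        # Whatever remains after trial division is itself prime.
        largest = n
    return largest


def is_good_cpu_count(cpus: int) -> bool:
    """True if cpus has no prime factor above 5 (rule of thumb from the post)."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    if cpus == 1:
        return True
    return largest_prime_factor(cpus) <= 5


if __name__ == "__main__":
    for cpus in (16, 15, 14, 13, 12):
        verdict = "ok" if is_good_cpu_count(cpus) else "avoid"
        print(f"{cpus} CPUs: {verdict}")
```

Under this check 16, 15 and 12 pass, while 14 and 13 are rejected, which matches the examples given in point 1.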

David

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by apaseall » Sun Dec 01, 2013 12:26 am

@davidcoton
Hi and thanks for your post.
I have changed down to 12 cores :)
The CPU slot with 16 cores works fine; I tried that first. I wanted to get the GPU going. Then I bought a better card for some other reason and thought I would move the old one over and fold with it.

The temp problem is rather silly really. The 470 is the card that has nothing plugged into it. I want to fold on it. Fair enough. When I fold on it temps look reasonable.
Start up the 760 and the temp soars. Gets quite hot. Thing is though that I was not reading the temps correctly.
It is the 470 that gets toasty NOT the 760.
Turns out that the exhaust from the 760 blows directly onto the back of the 470.

Some form of duct will happen real soon :D

PSU? Corsair AX1200i, so there should be plenty to spare for those two cards after feeding the 2 Xeons :D

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by apaseall » Sun Dec 01, 2013 9:38 pm

Well this is annoying.
I have 2 GPU slots. Both have the triple -1 settings. FAHControl provides a description for each slot which states which gpu it is.

It is WRONG.

I have 2 different GPUs. 470 & 760. FAHControl shows them as GK104 [GeForce GTX 760] & GK100 [GeForce GTX 470]

Both GPU paused.

Say I fold with GK104 [GeForce GTX 760]
Psensor GPU1 temp rises.
Nvidia X Server Settings shows a rise in temperature. It also shows a change in performance level from 0 to 3.
But for GPU1 (Geforce GTX 470).

Manual check - stick hand in each exhaust to see which one is hot.
Yes the 470 is toasty.

So FAHControl lies when it describes the GPU as a 760.

If I pause GK104 [GeForce GTX 760] and fold with GK100 [GeForce GTX 470] ...
Psensor shows the temp rise and fall as one GPU cools under no load whilst the other heats up as it munches.
Same with Nvidia X Server Settings.

FAHControl tells lies :( naughty FAHControl.

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by P5-133XL » Sun Dec 01, 2013 10:02 pm

This is a case where manually adjusting opencl-index and gpu-index (values start at 0; -1 means automatic) to force the slot descriptions to match the observed temps/video card has some value. Work with one slot at a time until that slot matches what you observe, then move to the next. I agree that it is a pain to do this and I know it shouldn't be necessary, but I know of no better way. Be methodical and you will solve it.
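
One way to stay methodical is to write out every combination before you start and tick them off as you test. The Python sketch below just enumerates the (gpu-index, opencl-index) pairs for a given number of cards. It does not talk to FAHClient or change any settings, and the helper names `index_combinations` and `checklist` are hypothetical.

```python
from itertools import product


def index_combinations(gpu_count: int):
    """Yield every (gpu-index, opencl-index) pair for gpu_count cards.

    Index values start at 0, as described in the post; -1 (automatic)
    is deliberately left out because the point is to force the values.
    """
    yield from product(range(gpu_count), repeat=2)


def checklist(slot_label: str, gpu_count: int) -> list:
    """Build a human-readable list of settings to try for one slot."""
    return [
        f"{slot_label}: gpu-index={gpu}, opencl-index={opencl}"
        for gpu, opencl in index_combinations(gpu_count)
    ]


if __name__ == "__main__":
    # Two cards, as in this thread: four combinations per slot.
    for line in checklist("GPU slot 1", 2):
        print(line)
```

For each line, set the values on that one slot, fold with it alone, and note which physical card heats up before moving on.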

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by bruce » Mon Dec 02, 2013 12:23 am

Either GPU can be detected first (gpu 0) and the other one will be detected second (gpu 1). FAH may be detecting them in the opposite order from what you want, but that's not the same as a lie. For more information, search for the open-source utility "lspci", which is used by FAH.

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by 7im » Mon Dec 02, 2013 1:08 am

Did you add the 2nd GPU after the client was already installed?

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by apaseall » Mon Dec 02, 2013 6:25 pm

To be clear about this, the lie is the description, not the gpu number.
Both cards were present when the app was installed.

Found another lie. Installed the app on my laptop; it wrongly reports that the GPU is a GT 555M, whereas in reality it is a GT 540M.

I will try using the index values for the 470 & 760 to see if they are described correctly.

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by apaseall » Mon Dec 02, 2013 6:39 pm

lspci reports [among the big list]
0c:00.0 VGA compatible controller: NVIDIA Corporation Device 1187 (rev a1)
08:00.0 VGA compatible controller: NVIDIA Corporation GF100 [GeForce GTX 470] (rev a3)

470 is before the 760.
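
To compare orderings without reading through the big list by eye, the two lines above can be parsed and sorted by PCI address. This is a rough Python sketch that works on pasted text only; the regular expression assumes the default lspci line layout shown above, and `parse_vga_lines` is a made-up helper name. Sorting by bus address is just one possible ordering and need not be the one the driver or FAH uses.

```python
import re

# Matches lines such as "08:00.0 VGA compatible controller: NVIDIA ...".
LSPCI_LINE = re.compile(
    r"^(?P<bus>[0-9a-f]{2}):(?P<dev>[0-9a-f]{2})\.(?P<fn>[0-7])\s+"
    r"(?P<cls>[^:]+):\s+(?P<desc>.+)$"
)

SAMPLE = """\
0c:00.0 VGA compatible controller: NVIDIA Corporation Device 1187 (rev a1)
08:00.0 VGA compatible controller: NVIDIA Corporation GF100 [GeForce GTX 470] (rev a3)
"""


def parse_vga_lines(text: str) -> list:
    """Return (address, description) for VGA devices, sorted by PCI address."""
    devices = []
    for line in text.splitlines():
        match = LSPCI_LINE.match(line.strip())
        if match and "VGA compatible controller" in match["cls"]:
            # Bus and device numbers are hexadecimal in lspci output.
            address = (int(match["bus"], 16), int(match["dev"], 16), int(match["fn"]))
            devices.append((address, match["desc"]))
    return sorted(devices)


if __name__ == "__main__":
    for position, (address, description) in enumerate(parse_vga_lines(SAMPLE)):
        print(position, "%02x:%02x.%d" % address, description)
```

With the sample text, the GTX 470 at 08:00.0 sorts ahead of the device at 0c:00.0.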

Just deleted the existing GPU slots and made new ones. gpu0 is still reported as the 760, with gpu1 as the 470.
Pausing both and folding with one at a time continues to behave incorrectly.
Namely, running the 470 slot actually loads the 760, i.e. its temperature rises and nvidia-smi reports memory usage.

So I stand by my comment, the descriptions are wrong, FAHControl is telling lies.

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by 7im » Mon Dec 02, 2013 7:30 pm

apaseall wrote:
Found another lie. Installed app on laptop, it wrongly reports that the gpu is a GT 555M whereas in reality it is a GT 540M.



The GPU vendors used the same Device ID for multiple models of GPUs. The GPUs.txt file is based on the GPU description as listed inside the OEM driver. If the device is listed multiple times, the first example is typically used. And the description in the GPUs.txt file is purely cosmetic, so there is no functionality difference. I wouldn't call that a lie, but to each their own. It is at best a misnomer. ;)
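
The "first example wins" behaviour is easy to picture with a small lookup. The Python sketch below is an illustration only: the device IDs are placeholders, the list is not taken from the real GPUs.txt, and `first_match_names` is a hypothetical helper, not the client's actual code.

```python
# Placeholder entries: (device ID, description). The IDs are invented for
# this example and do not come from GPUs.txt or from any driver.
ENTRIES = [
    ("0xAAAA", "GeForce GT 555M"),
    ("0xAAAA", "GeForce GT 540M"),  # same ID listed again under another name
    ("0xBBBB", "GeForce GTX 760"),
]


def first_match_names(entries) -> dict:
    """Map each device ID to the first description listed for it."""
    names = {}
    for device_id, description in entries:
        # setdefault keeps the existing value, so later duplicates are ignored.
        names.setdefault(device_id, description)
    return names


if __name__ == "__main__":
    names = first_match_names(ENTRIES)
    print(names["0xAAAA"])
```

Here a card that shares the placeholder ID is labelled with whichever name appears first in the list, while the ID used to drive the hardware is unchanged.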

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by jimerickson » Tue Dec 03, 2013 2:14 am

No, the indexes are merely assigned wrong. Like 7im said, it's a misnomer.

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by bruce » Tue Dec 03, 2013 4:57 am

jimerickson wrote: No, the indexes are merely assigned wrong.


As I already said, wrong compared to what?

Most likely they're 0 and 1, so that makes them right.

If they're in the order preferred by lspci rather than the order preferred by you, all that means is that you two don't agree.

lspci does not follow the order of the slots on your PCI bus.

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by PantherX » Tue Dec 03, 2013 1:55 pm

apaseall wrote:...Pausing both and folding with one at a time continues to behave incorrectly.
Namely 470 running actually loads the 760 ie temp rise with nvidia-smi reporting memory usage.

So I stand by my comment, the descriptions are wrong, FAHControl is telling lies.

To clarify, this is what you are seeing:
Physical GTX 470 maps to F@H GPU Slot 760
Physical GTX 760 maps to F@H GPU Slot 470

If yes, then it is a bug, but not a very serious one, because WUs assigned to either GPU would still fold successfully. However, Keplers are more efficient using FahCore_17 while Fermis are better on FahCore_15.

Furthermore, please note that this is the first time that GPU folding is natively supported on Linux, so a few bugs can be expected.

Please note that the physical layout of the GPUs in the PCI-E slots may not match the numbering in FAHControl, as stated above.

Re: NV 760&470 LinuxMint14 Bad PlatformID Size

Post by 7im » Tue Dec 03, 2013 2:06 pm

It's an easy fix. Follow the procedure I linked to on page one of this thread. It will straighten out the indexes.
