Folding with Tesla K20C

Postby N0OA » Thu Jun 13, 2013 7:12 pm

Does anyone have experience folding with the Tesla K20C? How should it be set up, and what PPD are folks seeing with it? I am currently getting a 37101 on a 7626 work unit configured as "client-type=BIGADV". Is it configured right? What can I expect for PPD out of this card?

Thanks in advance for any advice or thoughts...

N0OA
N0OA
 
Posts: 41
Joined: Wed Feb 13, 2013 6:55 am
Location: Minnesota

Re: Folding with Tesla K20C

Postby 7im » Thu Jun 13, 2013 7:28 pm

bigadv only applies to CPU work units, not GPU.

Not many have Teslas and I've seen no PPD numbers posted, but there is this thread showing FAHBench performance against many other GPUs, so you can see relative performance. http://foldingforum.org/viewtopic.php?f=38&t=23440
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
7im
 
Posts: 14648
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: Folding with Tesla K20C

Postby N0OA » Fri Jun 14, 2013 4:55 pm

Thanks for the client-type correction. I didn't catch that in the V7 documentation. I will take a look at the link to see what to expect. The Tesla isn't the fastest card I have - but it runs very cool and has a nice compact form factor for the performance it seems to return.

-N0OA
N0OA
 
Posts: 41
Joined: Wed Feb 13, 2013 6:55 am
Location: Minnesota

Re: Folding with Tesla K20C

Postby Quisarious » Fri Jun 14, 2013 7:25 pm

As far as FAH is concerned, a K20C is a detuned GTX 780. There are posted PPD estimates for the 780; just divide its TPFs by ~0.65 (the K20C runs at a ~700 MHz core clock, while the 780 will boost to 1000-1200 MHz) to get a good estimate for the Tesla.
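
As a rough sketch of that arithmetic in Python: the ~0.65 ratio is the approximate clock ratio quoted above, while the function name and the 780 TPF used in the example are made up for illustration, not measured figures.

```python
def estimate_k20c_tpf(gtx780_tpf_seconds, clock_ratio=0.65):
    """Scale a GTX 780 time per frame (TPF) to a rough K20C estimate.

    clock_ratio is the approximate K20C/780 core clock ratio (~0.65 above).
    A slower clock means a longer TPF, so the 780 figure is divided by it.
    """
    if not 0 < clock_ratio <= 1:
        raise ValueError("clock_ratio must be in (0, 1]")
    return gtx780_tpf_seconds / clock_ratio


# Hypothetical input: a 780 TPF of 65 seconds on some project.
print(round(estimate_k20c_tpf(65.0)))  # 100
```

This only scales TPF. If the project awards bonus points for fast returns, PPD will probably not scale by the same factor, so treat the result as a ballpark.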
Quisarious
 
Posts: 54
Joined: Thu Dec 13, 2012 6:16 pm

Re: Folding with Tesla K20C

Postby jaysenw » Sat Aug 10, 2013 4:52 pm

Hello;

I have 2 Tesla C1060s. I have not been able to get them to work on FAH. The Tesla card begins folding, then it eventually stops at 99.99%. I have tried removing and re-adding the slot, and deleting work units and restarting. The same problem occurs. It always gets to 99.99% in the GUI, but never to the same percentage in the log.

I have attached the most recent snippet of the system log for that slot below. Does anyone know what this issue is and possibly how to fix it?
Code:
07:07:34:WU02:FS01:Cleaning up
07:07:34:WU01:FS01:Connecting to assign-GPU.stanford.edu:80
07:07:34:WU01:FS01:News: Welcome to Folding@Home
07:07:34:WU01:FS01:Assigned to work server 171.67.108.21
07:07:34:WU01:FS01:Requesting new work unit for slot 01: READY gpu:1:GT200 [Tesla C1060] from 171.67.108.21
07:07:34:WU01:FS01:Connecting to 171.67.108.21:8080
07:07:35:WU01:FS01:Downloading 61.92KiB
07:07:35:WU01:FS01:Download complete
07:07:35:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:10501 run:162 clone:1 gen:1340 core:0x11 unit:0x00000b466652eda54b6ea7a700003f4b
07:07:35:WU01:FS01:Starting
07:07:35:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/NVIDIA/G80/Core_11.fah/FahCore_11.exe -dir 01 -suffix 01 -version 703 -lifeline 2692 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
07:07:35:WU01:FS01:Started FahCore on PID 1524
07:07:35:WU01:FS01:Core PID:3524
07:07:35:WU01:FS01:FahCore 0x11 started
07:07:36:WU01:FS01:0x11:
07:07:36:WU01:FS01:0x11:*------------------------------*
07:07:36:WU01:FS01:0x11:Folding@Home GPU Core
07:07:36:WU01:FS01:0x11:Version 1.31 (Tue Sep 15 10:57:42 PDT 2009)
07:07:36:WU01:FS01:0x11:
07:07:36:WU01:FS01:0x11:Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
07:07:36:WU01:FS01:0x11:Build host: amoeba
07:07:36:WU01:FS01:0x11:Board Type: Nvidia
07:07:36:WU01:FS01:0x11:Core      :
07:07:36:WU01:FS01:0x11:Preparing to commence simulation
07:07:36:WU01:FS01:0x11:- Looking at optimizations...
07:07:36:WU01:FS01:0x11:DeleteFrameFiles: successfully deleted file=01/wudata_01.ckp
07:07:36:WU01:FS01:0x11:- Created dyn
07:07:36:WU01:FS01:0x11:- Files status OK
07:07:36:WU01:FS01:0x11:- Expanded 62895 -> 336763 (decompressed 535.4 percent)
07:07:36:WU01:FS01:0x11:Called DecompressByteArray: compressed_data_size=62895 data_size=336763, decompressed_data_size=336763 diff=0
07:07:36:WU01:FS01:0x11:- Digital signature verified
07:07:36:WU01:FS01:0x11:
07:07:36:WU01:FS01:0x11:Project: 10501 (Run 162, Clone 1, Gen 1340)
07:07:36:WU01:FS01:0x11:
07:07:36:WU01:FS01:0x11:Assembly optimizations on if available.
07:07:36:WU01:FS01:0x11:Entering M.D.
07:07:41:WU01:FS01:0x11:Tpr hash 01/wudata_01.tpr:  3117068995 2761589855 2641796126 1345936202 2531672404
07:07:41:WU01:FS01:0x11:
07:07:41:WU01:FS01:0x11:Calling fah_main args: 14 usage=100
07:07:41:WU01:FS01:0x11:
07:07:42:WU01:FS01:0x11:Working on Protein
07:07:43:WU01:FS01:0x11:Client config unavailable.
07:07:43:WU01:FS01:0x11:Starting GUI Server
07:08:48:WU01:FS01:0x11:Completed 1%
07:09:53:WU01:FS01:0x11:Completed 2%

Thanks for any input you may have...


Jaysen
jaysenw
 
Posts: 6
Joined: Sat Aug 10, 2013 4:45 pm

Re: Folding with Tesla K20C

Postby bruce » Sat Aug 10, 2013 5:40 pm

When FAH's Control application eventually stops at 99.99%, there's a definite problem with your GPU or its drivers. If you look at the log, you will find that folding stopped before reaching 99.99%. You'll also find that there was a driver reset error logged by Windows (if you're running Windows) at that same time, and you may have seen the message. Once a driver reset occurs, FAH cannot continue processing, but FAH's Control application continues to (incorrectly) report progress until it reaches 99.99%.

Driver resets are caused by a GPU that has hung. That hang may be due to overclocking, due to overheating, due to flaky drivers or due to defective hardware. You need to figure out what's wrong with your system.
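
To see where folding really stopped, one option is to pull the last "Completed N%" line for each slot out of the log and compare its timestamp with the driver reset in the Windows event log. The Python sketch below assumes progress lines look like the FahCore_11 ones posted above; other cores may format progress differently, and the function name is only illustrative.

```python
import re

# Matches progress lines such as "07:09:53:WU01:FS01:0x11:Completed 2%"
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):0x[0-9a-fA-F]+:Completed (\d+)%"
)


def last_progress(log_lines):
    """Return {(work unit, slot): (timestamp, percent)} for the last progress line seen."""
    latest = {}
    for line in log_lines:
        match = PROGRESS.match(line.strip())
        if match:
            timestamp, work_unit, slot, percent = match.groups()
            latest[(work_unit, slot)] = (timestamp, int(percent))
    return latest


sample = [
    "07:07:43:WU01:FS01:0x11:Starting GUI Server",
    "07:08:48:WU01:FS01:0x11:Completed 1%",
    "07:09:53:WU01:FS01:0x11:Completed 2%",
]
print(last_progress(sample))  # {('WU01', 'FS01'): ('07:09:53', 2)}
```

If the last percentage in the log is well short of what the GUI shows, the timestamp on that line is roughly when the GPU hung.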
bruce
 
Posts: 22616
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Folding with Tesla K20C

Postby GreyWhiskers » Sat Aug 10, 2013 10:10 pm

If it just stopped because of a GPU/driver reset, the system should be able to recover by rebooting the computer. Upon restart, the FAH software should resume from the last checkpoint. If the root cause was thermal, it wouldn't be a bad idea to let the system "rest" for a while (unspecified duration) before restarting.

If this doesn't work, then there was something else going on that caused the client to purge the files.

Was the log snippet posted above the bottom of the log when the FAH software quit? If not, it would be interesting to see from the very end of the log any warnings or errors that had been logged.

07:07:36:WU01:FS01:0x11:Project: 10501 (Run 162, Clone 1, Gen 1340)
GreyWhiskers
 
Posts: 767
Joined: Mon Oct 25, 2010 5:57 am
Location: Saratoga, California USA

Re: Folding with Tesla K20C

Postby jaysenw » Fri Aug 16, 2013 1:22 am

Hmmm. I have tried the rebooting thing, but it is still unable to recover. I'll install some updated drivers and see what the dealio is. If it IS flaky hardware, do you guys recommend software that I can use to test the load and use of the card? I'm thinking like a CPU torture test but for Teslas instead...

I'll post when I find out my next step. Thanks for the recommendations so far.

:)

Jaysen
jaysenw
 
Posts: 6
Joined: Sat Aug 10, 2013 4:45 pm

Re: Folding with Tesla K20C

Postby N0OA » Fri Aug 16, 2013 4:53 am

Hi Jaysenw,

It looks from your log like you are downloading Core_11. What client-type do you have defined for the slot with the Tesla in it? I would suggest that you set the client-type to advanced so that you use Core_17, which will run much better on the Tesla cards. I am running Core_17 on my Tesla K20C without any issues at all.

N0OA
N0OA
 
Posts: 41
Joined: Wed Feb 13, 2013 6:55 am
Location: Minnesota

Re: Folding with Tesla K20C

Postby AndyE » Fri Aug 16, 2013 4:27 pm

N0OA,
would you mind sharing some perf numbers of your K20C?

thanks,
Andy
AndyE
 
Posts: 34
Joined: Tue Mar 19, 2013 10:52 pm

Re: Folding with Tesla K20C

Postby bruce » Sat Aug 17, 2013 5:58 am

That GPU uses the GK110, which really shouldn't be getting assignments for FahCore_11. I would have expected assignments for FahCore_15 prior to setting the client-type to advanced and FahCore_17 after.

You didn't answer my (implied) question: Is this Windows or Linux?

Is the file GPUs.txt present, and if so when was it created?
bruce
 
Posts: 22616
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Folding with Tesla K20C

Postby Joe_H » Sat Aug 17, 2013 5:20 pm

@jaysenw - Could you post the beginning of your log that shows the system configuration, etc.?

Your questions are about folding with a Tesla C1060, which is based on a different GPU than the Tesla K20C. From what I read on Wikipedia, it appears to be based on the same GPU as the GTX 285, so the recommended settings and the projects it will fold well will be different.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Site Admin
 
Posts: 4533
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: Folding with Tesla K20C

Postby jaysenw » Tue Aug 20, 2013 5:15 am

Sure thing! Once I power down the system and reboot, I'll send the entire log.

With the hardware recommendations made earlier, I custom-built a reducer fan and have been able to get the card balanced at 89 degrees under full load. With this, it folds a work unit in about 65 minutes. Is this a reasonable speed given the processor?

I'll get the logs tomorrow and post them when I get a chance after school. No rest for the wicked med students...
jaysenw
 
Posts: 6
Joined: Sat Aug 10, 2013 4:45 pm

Re: Folding with Tesla K20C

Postby Jesse_V » Thu Aug 22, 2013 4:51 am

Some projects consist of workunits that take longer than workunits from other projects. This can be due to different protein sizes or complexity, or for other reasons. Points Per Day is often a much more accurate yardstick.
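
To put numbers on that, here is a minimal Python sketch of base PPD, assuming a fixed credit per work unit and ignoring any bonus points. The 65 minutes is the figure reported above; the credit values and the second project are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def points_per_day(credit_per_wu, minutes_per_wu):
    """Base PPD: credit per work unit times work units completed per day."""
    if minutes_per_wu <= 0:
        raise ValueError("minutes_per_wu must be positive")
    return credit_per_wu * MINUTES_PER_DAY / minutes_per_wu


# Hypothetical credits: two projects with very different run times
# can still come out at a similar PPD.
print(round(points_per_day(1000, 65)))   # 22154
print(round(points_per_day(4000, 260)))  # 22154
```

A 65-minute work unit and a 260-minute one look very different on the clock, yet they can be worth about the same per day, which is why time per work unit alone says little about how well a card is doing.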
Pen tester at Cigital/Synopsys
Jesse_V
 
Posts: 2773
Joined: Mon Jul 18, 2011 4:44 am
Location: USA

Re: Folding with Tesla K20C

Postby n_w95482 » Thu Aug 22, 2013 7:14 am

jaysenw wrote: Sure thing! Once I power down the system and reboot, I'll send the entire log.

With the hardware recommendations made earlier, I custom-built a reducer fan and have been able to get the card balanced at 89 degrees under full load. With this, it folds a work unit in about 65 minutes. Is this a reasonable speed given the processor?

I'll get the logs tomorrow and post them when I get a chance after school. No rest for the wicked med students...

Hmm, that's a bit warm for that card to be running. From what I've noticed, Tesla cards are usually underclocked compared to their GeForce equivalent, presumably to maximize stability. That, in turn, should lower temperatures.

How is airflow in the case/around the card? I'd also check to see if the card's heatsink and fan need to be cleaned out, and possibly apply fresh thermal paste.
Folding since December 2003. In memory of my mother, who lost her battle with cancer.
n_w95482
 
Posts: 64
Joined: Tue May 01, 2012 12:46 am
Location: California
