new core download killed two GPU's

Postby dschief » Sat Dec 08, 2012 5:56 pm

My Fedora rig has been rock-solid stable since 2010, but today both GPUs errored out! FahCore_15, dated 12-8-2012, crapped out looking for
cudart32_30_14.dll & cufft32_30_14.dll

Has there been a rewrite of the core in the last few days? This box has not errored in months.
dschief
 
Posts: 245
Joined: Tue Dec 04, 2007 5:56 am

Re: new core download killed two GPU's

Postby kiore » Sat Dec 08, 2012 6:00 pm

dschief wrote:My Fedora rig has been rock-solid stable since 2010, but today both GPUs errored out! FahCore_15, dated 12-8-2012, crapped out looking for
cudart32_30_14.dll & cufft32_30_14.dll

Has there been a rewrite of the core in the last few days? This box has not errored in months.


There has been a server error causing non-Fermi/Kepler cards to get the wrong core and work units. If this is your issue, have a read around; there are already threads on this with a fix posted. If you are using a Fermi or Kepler card, please post your log so we can see what "killing two GPUs" means.

Rebuild underway.
kiore
 
Posts: 945
Joined: Fri Jan 16, 2009 5:45 pm
Location: USA

Re: new core download killed two GPU's

Postby mmonnin » Sat Dec 08, 2012 6:01 pm

Follow these threads for updates:
viewtopic.php?f=85&t=23183
viewtopic.php?f=85&t=23185
mmonnin
 
Posts: 331
Joined: Wed Dec 05, 2007 1:27 am

Re: new core download killed two GPU's

Postby dschief » Sat Dec 08, 2012 6:04 pm

The cards are a 9800GTX+ and a GTX275, both EVGA.

Code:
Launch directory: Z:\home\jim\gtx1
Executable: Z:\home\jim\gtx1\fah6.exe
Arguments: -gpu 0 -forcegpu nvidia_g80

[17:53:54] - Ask before connecting: No
[17:53:54] - User name: dschief (Team 13761)
[17:53:54] - User ID: xxxxxxxxxxxxx
[17:53:54] - Machine ID: 2
[17:53:54]
[17:53:55] Loaded queue successfully.
[17:53:55]
[17:53:55] + Processing work unit
[17:53:55] Core required: FahCore_15.exe
[17:53:55] Core found.
[17:53:55] Working on queue slot 01 [December 8 17:53:55 UTC]
[17:53:55] + Working ...
err:module:import_dll Library cudart32_30_14.dll (which is needed by L"Z:\\home\\jim\\gtx1\\FahCore_15.exe") not found
err:module:import_dll Library cufft32_30_14.dll (which is needed by L"Z:\\home\\jim\\gtx1\\FahCore_15.exe") not found
err:module:LdrInitializeThunk Main exe initialization for L"Z:\\home\\jim\\gtx1\\FahCore_15.exe" failed, status c0000135
[17:53:59] CoreStatus = C0000135 (-1073741515)
[17:53:59] Client-core communications error: ERROR 0xc0000135
[17:53:59] This is a sign of more serious problems, shutting down.
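
For anyone reading a log like this one, a small sketch of how it can be unpacked may help. It pulls the missing library names out of the Wine `err:module:import_dll` lines and shows that the two numbers on the CoreStatus line are the same 32-bit value, one printed in hex and one as a signed integer. 0xC0000135 is the Windows NTSTATUS code STATUS_DLL_NOT_FOUND. The function names here are made up for illustration and are not part of the client.

```python
import re

# Sample lines copied from the log above.
LOG = r'''
err:module:import_dll Library cudart32_30_14.dll (which is needed by L"Z:\\home\\jim\\gtx1\\FahCore_15.exe") not found
err:module:import_dll Library cufft32_30_14.dll (which is needed by L"Z:\\home\\jim\\gtx1\\FahCore_15.exe") not found
[17:53:59] CoreStatus = C0000135 (-1073741515)
'''

# Wine reports each unresolved import on its own line.
IMPORT_ERR = re.compile(r"err:module:import_dll Library (\S+) \(which is needed by")


def missing_dlls(log_text):
    """Return the sorted, de-duplicated library names Wine could not find."""
    return sorted(set(IMPORT_ERR.findall(log_text)))


def to_signed32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - (1 << 32) if value & 0x80000000 else value


print(missing_dlls(LOG))        # ['cudart32_30_14.dll', 'cufft32_30_14.dll']
print(to_signed32(0xC0000135))  # -1073741515
```

So the "communications error" is really just the core failing to start because two DLLs could not be loaded.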
dschief
 
Posts: 245
Joined: Tue Dec 04, 2007 5:56 am

Re: new core download killed two GPU's

Postby Sluxor » Sat Dec 08, 2012 7:52 pm

I had the same problem today (but on Windows).

The DLLs required by FahCore_15 are not included with the GPU2 (6.23) console client I'm running, so I had to download the GPU3 (6.41) console client and copy the missing DLLs (not the .exe) into my client's folding directory.
Now it seems to work just fine. The same fix should work for you.
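
A rough sketch of that copy step, in case it is useful to script it. It copies only the two DLLs named in the error log from an unpacked GPU3 client directory into the folding directory, leaves the .exe alone, and skips anything already present. The directory paths in the usage comment and the function name are placeholders, not real client settings, and copying the files may not be enough on every setup (a CUDA version mismatch was reported later in this thread).

```python
import shutil
from pathlib import Path

# The two libraries named in the error log.
NEEDED = ("cudart32_30_14.dll", "cufft32_30_14.dll")


def copy_missing_dlls(src_dir, dst_dir, names=NEEDED):
    """Copy the named DLLs from src_dir to dst_dir if dst_dir lacks them.

    Names are compared case-insensitively, since the files may be stored
    with different capitalisation. Returns the file names that were copied.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    wanted = {n.lower() for n in names}
    present = {p.name.lower() for p in dst_dir.iterdir()}
    copied = []
    for path in sorted(src_dir.iterdir()):
        name = path.name.lower()
        if path.is_file() and name in wanted and name not in present:
            shutil.copy2(path, dst_dir / path.name)
            copied.append(path.name)
    return copied


# Hypothetical usage (both paths are placeholders):
# copy_missing_dlls("/path/to/unpacked-gpu3-client", "/path/to/folding-dir")
```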
Sluxor
 
Posts: 1
Joined: Sat Dec 08, 2012 5:37 pm

Re: new core download killed two GPU's

Postby kiore » Sat Dec 08, 2012 9:30 pm

dschief wrote:cards are a 9800GTX+ & GTX275 both EVGA



This is the same Core 15 issue affecting non-Fermi/Kepler cards. Follow the links in the 3rd post to see how to resolve it.
kiore
 
Posts: 945
Joined: Fri Jan 16, 2009 5:45 pm
Location: USA

Re: new core download killed two GPU's

Postby codysluder » Sun Dec 09, 2012 7:11 am

mmonnin wrote:Follow these threads for updates:
viewtopic.php?f=85&t=23183
viewtopic.php?f=85&t=23185


Also viewtopic.php?f=85&t=23200
codysluder
 
Posts: 2128
Joined: Sun Dec 02, 2007 12:43 pm

Re: new core download killed two GPU's

Postby dschief » Sun Dec 09, 2012 4:11 pm

Long status update:

1. Tried using the two DLLs from 6.41; got a CUDA version mismatch error.
2. Tried to update Fedora with the CUDA 3.0 Toolkit & wrapper, which totally trashed my system.
3. Went the nuclear option: wiped the system and loaded a copy of Vista I had lying around. 10+ hrs later:

Now I'm seeing the memtest error on the 9800GTX+, on an 8054 WU (Core 15).
The GTX275 in the same box is fine, also on an 8054 WU (Core 15).

Side note: a 9800GTX+ in another box under XP is running fine, but it has a Core 11 WU.

My gut feeling is that something in Core 15 needs review?

I will now revisit the other posts and try to get a better handle on the big picture.
dschief
 
Posts: 245
Joined: Tue Dec 04, 2007 5:56 am

Re: new core download killed two GPU's

Postby mmonnin » Mon Dec 10, 2012 3:14 am

Here is a thread on the memtest errors:
viewtopic.php?f=85&t=23200
mmonnin
 
Posts: 331
Joined: Wed Dec 05, 2007 1:27 am

Re: new core download killed two GPU's

Postby bruce » Mon Dec 10, 2012 3:44 am

dschief wrote:side note : a 9800GTX+ in another box under XP is running fine, but it has a core 11 wu.

My gut feeling is something in core 15 needs review?


FahCore_11 works on G80/Tesla GPUs (and used to work on early ATI GPUs before they were deprecated.) FahCore_15 works on Fermi and Kepler GPUs.

If you read all the discussions from Saturday, the Assignment Server was messing up and it assigned WUs for Fermi/Kepler when it should have been assigning WUs for G80/Tesla. It's not FahCore_15 that's the problem, it's a WU that was improperly assigned that doesn't work on your hardware.
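
To make the mismatch concrete, here is a minimal sketch of the pairing described above. The table only restates which GPU families each core is said to work on in this post; the names `SUPPORTED` and `assignment_ok` are invented for illustration and have nothing to do with how the Assignment Server is actually implemented.

```python
# Core -> GPU families, as described in this post.
SUPPORTED = {
    "FahCore_11": {"G80", "Tesla"},
    "FahCore_15": {"Fermi", "Kepler"},
}


def assignment_ok(core, gpu_family):
    """Return True if the assigned core is listed for this GPU family."""
    return gpu_family in SUPPORTED.get(core, set())


# A client started with -forcegpu nvidia_g80 that receives a FahCore_15 WU:
print(assignment_ok("FahCore_15", "G80"))  # False: wrong WU for this hardware
print(assignment_ok("FahCore_11", "G80"))  # True: what it should have received
```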
bruce
 
Posts: 20806
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

