Linux GPU3 (fermi) Corestatus = 63 (99)


Postby Xaero » Thu Mar 10, 2011 6:33 am

Looks like I may not be out of the woods yet - my secondary GPU just failed another WU. The primary GPU and the SMP client are still chomping away just fine. The secondary GPU is in the same wineprefix as the primary GPU, so the configuration is identical; they even share the same executables, just with different conf files and working directories.
log:
Code:
[06:30:06] - Ask before connecting: No
[06:30:06] - User name: zero-x[49036] (Team 86565)
[06:30:06] - User ID: 1AEE385E55B8B045
[06:30:06] - Machine ID: 3
[06:30:06]
[06:30:06] Gpu species not recognized.
[06:30:06] Loaded queue successfully.
[06:30:06]
[06:30:06] + Processing work unit
[06:30:06] Core required: FahCore_15.exe
[06:30:06] Core found.
[06:30:06] Working on queue slot 00 [March 10 06:30:06 UTC]
[06:30:06] + Working ...
[06:30:06] - Calling '.\FahCore_15.exe -dir work/ -suffix 00 -nice 19 -checkpoint 15 -verbose -lifeline 38 -version 630'

[06:30:06] - Autosending finished units... [March 10 06:30:06 UTC]
[06:30:06] Trying to send all finished work units
[06:30:06] + No unsent completed units remaining.
[06:30:06] - Autosend completed
[06:30:06]
[06:30:06] *------------------------------*
[06:30:06] Folding@Home GPU Core
[06:30:06] Version 2.15 (Tue Nov 16 09:05:18 PST 2010)
[06:30:06]
[06:30:06] Build host: SimbiosNvdWin7
[06:30:06] Board Type: NVIDIA/CUDA
[06:30:06] Core      : x=15
[06:30:06]  Window's signal control handler registered.
[06:30:06] Preparing to commence simulation
[06:30:06] - Ensuring status. Please wait.
[06:30:15] - Looking at optimizations...
[06:30:15] - Working with standard loops on this execution.
[06:30:15] Examination of work files indicates 8 consecutive improper terminations of core.
[06:30:15] sizeof(CORE_PACKET_HDR) = 512 file=<>
[06:30:15] - Expanded 43062 -> 169787 (decompressed 394.2 percent)
[06:30:15] Called DecompressByteArray: compressed_data_size=43062 data_size=169787, decompressed_data_size=169787 diff=0
[06:30:15] - Digital signature verified
[06:30:15]
[06:30:15] Project: 6800 (Run 6808, Clone 0, Gen 25)
[06:30:15]
[06:30:15] Entering M.D.
[06:30:18] Tpr hash work/wudata_00.tpr:  2326219967 4063451721 1478549370 762046317 33964083
[06:30:18] Working on PEPTIDE (1-42)
[06:30:18] Client config found, loading data.
[06:30:20] CoreStatus = 63 (99)
[06:30:20] + Error starting Folding@home core.
[06:30:25]

Back to square one with this GPU?
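For anyone triaging a log like this one, here is a minimal Python sketch that pulls every CoreStatus line out of a FAHlog.txt dump. The regular expression and the `find_core_status` name are my own and not part of the client; the sketch only assumes the line layout seen in the log above. The two numbers look like the same code in hex and decimal (0x63 is 99).

```python
import re

# Matches lines such as "[06:30:20] CoreStatus = 63 (99)".
# Assumption: the layout is the one shown in the log above.
CORE_STATUS = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\] CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)"
)


def find_core_status(log_text):
    """Return (timestamp, hex_code, decimal_code) for each CoreStatus line."""
    hits = []
    for line in log_text.splitlines():
        match = CORE_STATUS.match(line.strip())
        if match:
            hits.append((match.group(1), match.group(2), int(match.group(3))))
    return hits


# Sample lines copied from the log above.
sample = """\
[06:30:18] Client config found, loading data.
[06:30:20] CoreStatus = 63 (99)
[06:30:20] + Error starting Folding@home core.
"""

for stamp, hex_code, dec_code in find_core_status(sample):
    print(stamp, hex_code, dec_code)
```

To run it against a real file, read FAHlog.txt into a string and pass that to `find_core_status` instead of `sample`.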

Postby HendricksSA » Thu Mar 10, 2011 6:53 am

What can you tell us about the line at 06:30:15 with "Examination of work files indicates 8 ..."? I've not seen that before. The error is the same as the other GPU but your other log did not have that line. Thoughts?
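To check whether that line shows up in other logs, a small Python sketch along these lines could help. It assumes the wording is exactly as in the log above; `improper_terminations` is a hypothetical helper name, and what the client actually counts is not documented in this thread.

```python
import re

# Assumption: the message wording matches the log line quoted above.
IMPROPER = re.compile(r"indicates (\d+) consecutive improper terminations of core")


def improper_terminations(log_text):
    """Return the largest count reported in the log, or 0 if the line is absent."""
    counts = [int(n) for n in IMPROPER.findall(log_text)]
    return max(counts) if counts else 0


# Sample line copied from the log above.
sample = (
    "[06:30:15] Examination of work files indicates 8 consecutive "
    "improper terminations of core.\n"
)
print(improper_terminations(sample))
```

A result of 0 for the primary GPU's log and a non-zero result for the secondary would at least confirm that the line is specific to the failing card.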

Postby Xaero » Thu Mar 10, 2011 1:31 pm

I have no idea. It fixed itself after a reboot, running the same cores. It didn't even download fresh ones, it just started folding the ones it had already downloaded. Just weird.

Postby Sidicas » Thu Mar 10, 2011 1:45 pm

After it starts failing several times in a row like that, can you try running the memtest_g80 on the card that's failing?
http://folding.stanford.edu/English/DownloadUtils

I'm thinking it might be dropping off as a CUDA device. And you're sure the card isn't overheating?
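Both questions (is the card still attached, and how hot is it) can be read from the text report of `nvidia-smi -a -q`. Below is a rough Python sketch that does so. It assumes the report layout shown in the nvidia-smi output posted further down in this thread; other driver versions may format the report differently, and `summarize_smi` is a made-up helper name.

```python
import re


def summarize_smi(report):
    """Extract the attached-GPU count and the per-GPU temperatures in Celsius."""
    attached = re.search(r"Attached GPUs\s*:\s*(\d+)", report)
    # Only lines of the form "Gpu : <number> C" count as temperatures;
    # "Gpu : N/A" under Utilization is skipped because it has no digits.
    temps = re.findall(r"^\s*Gpu\s*:\s*(\d+) C\s*$", report, flags=re.MULTILINE)
    return {
        "attached": int(attached.group(1)) if attached else 0,
        "temps_c": [int(t) for t in temps],
    }


# Excerpt in the layout of the report posted later in this thread.
sample = """\
Attached GPUs                   : 1

GPU 0:6:0
    Utilization
        Gpu                     : N/A
    Temperature
        Gpu                     : 46 C
"""
print(summarize_smi(sample))
```

If the attached count drops below the number of cards installed while the client is failing, that would point at the card falling off the bus rather than at the work unit.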

Postby Xaero » Thu Mar 10, 2011 6:16 pm

Both cards eventually stopped folding, and that was when I rebooted. I'm not sure what changed, but I am 100% certain neither card is overheating: the maximum temperature on them was 40C (liquid cooled). The NVIDIA driver may have goofed internally... it has not done it since.

Postby kromberg » Wed Jun 15, 2011 11:52 pm

I just swapped out a GTX 275 for a GTX 460 and now I am getting the CoreStatus = 63 (99) error on each WU/core it tries to run. I am running driver 260.19.44 and version 2.3 of the CUDA toolkit. Do I need to upgrade the driver and CUDA toolkit? If so, what versions should I run?

OK, I finished running the memtestG80 app and got no errors. So I am guessing it is the driver version, right?

OK, I updated to the latest NVIDIA drivers and CUDA toolkit:
Code:
# nvidia-smi -a -q

==============NVSMI LOG==============

Timestamp                       : Wed Jun 15 18:53:12 2011

Driver Version                  : 275.09.07

Attached GPUs                   : 1

GPU 0:6:0
    Product Name                : GeForce GTX 460
    Display Mode                : N/A
    Persistence Mode            : Disabled
    Driver Model
        Current                 : N/A
        Pending                 : N/A
    Serial Number               : N/A
    GPU UUID                    : N/A
    Inforom Version
        OEM Object              : N/A
        ECC Object              : N/A
        Power Management Object : N/A
    PCI
        Bus                     : 6
        Device                  : 0
        Domain                  : 0
        Device Id               : E2210DE
        Bus Id                  : 0:6:0
    Fan Speed                   : 40 %
    Memory Usage
        Total                   : 767 Mb
        Used                    : 2 Mb
        Free                    : 764 Mb
    Compute Mode                : Default
    Utilization
        Gpu                     : N/A
        Memory                  : N/A
    Ecc Mode
        Current                 : N/A
        Pending                 : N/A
    ECC Errors
        Volatile
            Single Bit           
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
            Double Bit           
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
        Aggregate
            Single Bit           
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
            Double Bit           
                Device Memory   : N/A
                Register File   : N/A
                L1 Cache        : N/A
                L2 Cache        : N/A
                Total           : N/A
    Temperature
        Gpu                     : 46 C
    Power Readings
        Power State             : N/A
        Power Management        : N/A
        Power Draw              : N/A
        Power Limit             : N/A
    Clocks
        Graphics                : N/A
        SM                      : N/A
        Memory                  : N/A


CUDA toolkit:
Code:
cudatoolkit_3.2.16_linux_32_fedora13.run


GPU client:
Code:
wget http://www.stanford.edu/~friedrim/.Folding@home-Win32-GPU_XP-631.zip -O Folding@home-Win32-GPU_XP-631.zip


Version of Wine I am running:
Code:
[root@glaurung gpu]# which wine
/usr/bin/wine
[root@glaurung gpu]# wine --version
wine-1.3.19
[root@glaurung gpu]# file /usr/bin/wine
/usr/bin/wine: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, stripped


FAHlog.txt output:
Code:
wine Folding@home-Win32-GPU.exe -forcegpu nvidia_fermi -gpu 0

Note: Please read the license agreement (Folding@home-Win32-GPU.exe -license). Further
use of this software requires that you have read and accepted this agreement.

[00:59:31] cudaRuntime lib not found.
[00:59:31] Gpu species not recognized.


--- Opening Log file [June 16 00:59:31 UTC]


# Windows GPU Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.30r1

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: Z:\root\fah\gpu
Executable: Z:\root\fah\gpu\Folding@home-Win32-GPU.exe
Arguments: -forcegpu nvidia_fermi -gpu 0

[00:59:31] - Ask before connecting: No
[00:59:31] - User name: glaurung_dg_gpu (Team 122202)
[00:59:31] - User ID: 32747246509E9759
[00:59:31] - Machine ID: 2
[00:59:31]
[00:59:31] Gpu species not recognized.
[00:59:31] Work directory not found. Creating...
[00:59:31] Could not open work queue, generating new queue...
[00:59:31] - Preparing to get new work unit...
[00:59:31] Cleaning up work directory
[00:59:31] + Attempting to get work packet
[00:59:31] Passkey found
[00:59:31] Gpu species not recognized.
[00:59:31] - Connecting to assignment server
[00:59:31] - Successful: assigned to (171.64.65.64).
[00:59:31] + News From Folding@Home: Welcome to Folding@Home
[00:59:31] Loaded queue successfully.
[00:59:31] Gpu species not recognized.
[00:59:32] + Closed connections
[00:59:32]
[00:59:32] + Processing work unit
[00:59:32] Core required: FahCore_15.exe
[00:59:32] Core not found.
[00:59:32] - Core is not present or corrupted.
[00:59:32] - Attempting to download new core...
[00:59:32] + Downloading new core: FahCore_15.exe
[00:59:32] + 10240 bytes downloaded
[00:59:32] + 20480 bytes downloaded
[00:59:32] + 30720 bytes downloaded
[00:59:32] + 40960 bytes downloaded
[00:59:32] + 51200 bytes downloaded
[00:59:32] + 61440 bytes downloaded
[00:59:32] + 71680 bytes downloaded
[00:59:32] + 81920 bytes downloaded
[00:59:32] + 92160 bytes downloaded
[00:59:32] + 102400 bytes downloaded
[00:59:32] + 112640 bytes downloaded
[00:59:32] + 122880 bytes downloaded
[00:59:32] + 133120 bytes downloaded
[00:59:32] + 143360 bytes downloaded
[00:59:32] + 153600 bytes downloaded
[00:59:33] + 163840 bytes downloaded
[00:59:33] + 174080 bytes downloaded
[00:59:33] + 184320 bytes downloaded
[00:59:33] + 194560 bytes downloaded
[00:59:33] + 204800 bytes downloaded
[00:59:33] + 215040 bytes downloaded
[00:59:33] + 225280 bytes downloaded
[00:59:33] + 235520 bytes downloaded
[00:59:33] + 245760 bytes downloaded
[00:59:33] + 256000 bytes downloaded
[00:59:33] + 266240 bytes downloaded
[00:59:33] + 276480 bytes downloaded
[00:59:33] + 286720 bytes downloaded
[00:59:33] + 296960 bytes downloaded
[00:59:33] + 307200 bytes downloaded
[00:59:33] + 317440 bytes downloaded
[00:59:33] + 327680 bytes downloaded
[00:59:33] + 337920 bytes downloaded
[00:59:33] + 348160 bytes downloaded
[00:59:33] + 358400 bytes downloaded
[00:59:33] + 368640 bytes downloaded
[00:59:33] + 378880 bytes downloaded
[00:59:33] + 389120 bytes downloaded
[00:59:33] + 399360 bytes downloaded
[00:59:33] + 409600 bytes downloaded
[00:59:33] + 419840 bytes downloaded
[00:59:33] + 430080 bytes downloaded
[00:59:33] + 440320 bytes downloaded
[00:59:33] + 450560 bytes downloaded
[00:59:33] + 460800 bytes downloaded
[00:59:33] + 471040 bytes downloaded
[00:59:33] + 481280 bytes downloaded
[00:59:33] + 491520 bytes downloaded
[00:59:33] + 501760 bytes downloaded
[00:59:33] + 512000 bytes downloaded
[00:59:33] + 522240 bytes downloaded
[00:59:33] + 532480 bytes downloaded
[00:59:33] + 542720 bytes downloaded
[00:59:33] + 552960 bytes downloaded
[00:59:33] + 563200 bytes downloaded
[00:59:33] + 573440 bytes downloaded
[00:59:33] + 583680 bytes downloaded
[00:59:33] + 593920 bytes downloaded
[00:59:34] + 604160 bytes downloaded
[00:59:34] + 614400 bytes downloaded
[00:59:34] + 624640 bytes downloaded
[00:59:34] + 634880 bytes downloaded
[00:59:34] + 645120 bytes downloaded
[00:59:34] + 655360 bytes downloaded
[00:59:34] + 665600 bytes downloaded
[00:59:34] + 675840 bytes downloaded
[00:59:34] + 686080 bytes downloaded
[00:59:34] + 696320 bytes downloaded
[00:59:34] + 706560 bytes downloaded
[00:59:34] + 716800 bytes downloaded
[00:59:34] + 727040 bytes downloaded
[00:59:34] + 737280 bytes downloaded
[00:59:34] + 747520 bytes downloaded
[00:59:34] + 757760 bytes downloaded
[00:59:34] + 768000 bytes downloaded
[00:59:34] + 778240 bytes downloaded
[00:59:34] + 788480 bytes downloaded
[00:59:34] + 798720 bytes downloaded
[00:59:34] + 808960 bytes downloaded
[00:59:34] + 819200 bytes downloaded
[00:59:34] + 829440 bytes downloaded
[00:59:34] + 839680 bytes downloaded
[00:59:34] + 849920 bytes downloaded
[00:59:34] + 860160 bytes downloaded
[00:59:34] + 870400 bytes downloaded
[00:59:34] + 880640 bytes downloaded
[00:59:34] + 890880 bytes downloaded
[00:59:34] + 901120 bytes downloaded
[00:59:34] + 911360 bytes downloaded
[00:59:34] + 921600 bytes downloaded
[00:59:34] + 931840 bytes downloaded
[00:59:34] + 942080 bytes downloaded
[00:59:34] + 952320 bytes downloaded
[00:59:34] + 962560 bytes downloaded
[00:59:34] + 972800 bytes downloaded
[00:59:34] + 983040 bytes downloaded
[00:59:34] + 993280 bytes downloaded
[00:59:34] + 1003520 bytes downloaded
[00:59:34] + 1013760 bytes downloaded
[00:59:34] + 1024000 bytes downloaded
[00:59:34] + 1034240 bytes downloaded
[00:59:35] + 1044480 bytes downloaded
[00:59:35] + 1054720 bytes downloaded
[00:59:35] + 1064960 bytes downloaded
[00:59:35] + 1075200 bytes downloaded
[00:59:35] + 1085440 bytes downloaded
[00:59:35] + 1095680 bytes downloaded
[00:59:35] + 1105920 bytes downloaded
[00:59:35] + 1116160 bytes downloaded
[00:59:35] + 1126400 bytes downloaded
[00:59:35] + 1136640 bytes downloaded
[00:59:35] + 1146880 bytes downloaded
[00:59:35] + 1157120 bytes downloaded
[00:59:35] + 1167360 bytes downloaded
[00:59:35] + 1177600 bytes downloaded
[00:59:35] + 1187840 bytes downloaded
[00:59:35] + 1198080 bytes downloaded
[00:59:35] + 1208320 bytes downloaded
[00:59:35] + 1218560 bytes downloaded
[00:59:35] + 1228800 bytes downloaded
[00:59:35] + 1239040 bytes downloaded
[00:59:35] + 1249280 bytes downloaded
[00:59:35] + 1259520 bytes downloaded
[00:59:35] + 1269760 bytes downloaded
[00:59:35] + 1280000 bytes downloaded
[00:59:35] + 1290240 bytes downloaded
[00:59:35] + 1300480 bytes downloaded
[00:59:35] + 1310720 bytes downloaded
[00:59:35] + 1320960 bytes downloaded
[00:59:35] + 1331200 bytes downloaded
[00:59:35] + 1341440 bytes downloaded
[00:59:35] + 1351680 bytes downloaded
[00:59:35] + 1361920 bytes downloaded
[00:59:35] + 1372160 bytes downloaded
[00:59:35] + 1382400 bytes downloaded
[00:59:35] + 1392640 bytes downloaded
[00:59:35] + 1402880 bytes downloaded
[00:59:35] + 1413120 bytes downloaded
[00:59:35] + 1422551 bytes downloaded
[00:59:35] Verifying core Core_15.fah...
[00:59:35] Signature is VALID
[00:59:35]
[00:59:35] Trying to unzip core FahCore_15.exe
[00:59:36] Decompressed FahCore_15.exe (3903488 bytes) successfully
[00:59:41] + Core successfully engaged
[00:59:46]
[00:59:46] + Processing work unit
[00:59:46] Core required: FahCore_15.exe
[00:59:46] Core found.
[00:59:46] Working on queue slot 01 [June 16 00:59:46 UTC]
[00:59:46] + Working ...
[00:59:46]
[00:59:46] *------------------------------*
[00:59:46] Folding@Home GPU Core
[00:59:46] Version 2.15 (Tue Nov 16 09:05:18 PST 2010)
[00:59:46]
[00:59:46] Build host: SimbiosNvdWin7
[00:59:46] Board Type: NVIDIA/CUDA
[00:59:46] Core      : x=15
[00:59:46]  Window's signal control handler registered.
[00:59:46] Preparing to commence simulation
[00:59:46] - Looking at optimizations...
[00:59:46] DeleteFrameFiles: successfully deleted file=work/wudata_01.ckp
[00:59:46] - Created dyn
[00:59:46] - Files status OK
[00:59:46] sizeof(CORE_PACKET_HDR) = 512 file=<>
[00:59:46] - Expanded 42033 -> 162639 (decompressed 386.9 percent)
[00:59:46] Called DecompressByteArray: compressed_data_size=42033 data_size=162639, decompressed_data_size=162639 diff=0
[00:59:46] - Digital signature verified
[00:59:46]
[00:59:46] Project: 6805 (Run 6345, Clone 3, Gen 34)
[00:59:46]
[00:59:46] Assembly optimizations on if available.
[00:59:46] Entering M.D.
[00:59:48] Tpr hash work/wudata_01.tpr:  1091224314 2153210205 496246769 1627704551 1467455123
[00:59:48] Working on ALZHEIMER'S DISEASE AMYLOID
[00:59:48] Client config found, loading data.
[00:59:52] CoreStatus = 63 (99)
[00:59:52] + Error starting Folding@home core.
[00:59:57]
[00:59:57] + Processing work unit
[00:59:57] Core required: FahCore_15.exe
[00:59:57] Core found.
[00:59:57] Working on queue slot 01 [June 16 00:59:57 UTC]
[00:59:57] + Working ...
[00:59:57]
[00:59:57] *------------------------------*
[00:59:57] Folding@Home GPU Core
[00:59:57] Version 2.15 (Tue Nov 16 09:05:18 PST 2010)
[00:59:57]
[00:59:57] Build host: SimbiosNvdWin7
[00:59:57] Board Type: NVIDIA/CUDA
[00:59:57] Core      : x=15
[00:59:57]  Window's signal control handler registered.
[00:59:57] Preparing to commence simulation
[00:59:57] - Ensuring status. Please wait.
^C
Folding@Home Client Shutdown.


What am I missing or doing wrong here?

Postby kromberg » Thu Jun 16, 2011 10:51 am

Got it working! I had missed getting the CUDA wrappers that match the version of the CUDA toolkit. I found this setup/config page very helpful:

http://linuxfah.info/index.php?title=Fo ... a13_x86-64

The whole process for getting an NVIDIA card to fold under Linux is way too complicated. For the love of God, I wish the effort could be made to write a native Linux GPU client. Given that we donate the use of our hardware for free, it could at least be made easier.
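For anyone who lands here with the same symptom: the failing log earlier in this thread printed `cudaRuntime lib not found.` right at startup, before the core ever ran. Whether that line is always tied to a missing or mismatched wrapper is my assumption, not something confirmed in this thread, but it is cheap to check for. A minimal Python sketch follows, with `startup_warnings` as a hypothetical helper name. It deliberately ignores `Gpu species not recognized.`, since that line also appears in the other poster's log where the primary GPU folds fine.

```python
# Assumption: this startup message hints at a missing or mismatched CUDA wrapper.
SUSPECT = "cudaRuntime lib not found"


def startup_warnings(log_text):
    """Return every log line that mentions the missing CUDA runtime library."""
    return [line for line in log_text.splitlines() if SUSPECT in line]


# Sample lines copied from the FAHlog.txt output earlier in this thread.
sample = """\
[00:59:31] cudaRuntime lib not found.
[00:59:31] Gpu species not recognized.
[00:59:31] - Ask before connecting: No
"""

hits = startup_warnings(sample)
if hits:
    print("Check that the CUDA wrappers match the installed toolkit:")
    for line in hits:
        print("  " + line)
```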