FAULTY project:13816

Postby Yavanius » Sun Apr 14, 2019 2:08 am

I've been getting failures since the last two nVidia driver updates. I'm not sure it's driver related, because I fired up BOINC and its GPU work seems to run fine.

Work units keep failing moments after starting, although I did once see a work unit run.

Here's the log:

Code:
01:57:10:WU01:FS00:Connecting to 140.163.4.231:8080
01:57:10:WU00:FS00:Connecting to 65.254.110.245:8080
01:57:11:WU01:FS00:Upload complete
01:57:11:WU01:FS00:Server responded WORK_ACK (400)
01:57:11:WU01:FS00:Cleaning up
01:57:12:WU00:FS00:Assigned to work server 128.252.203.4
01:57:12:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:GM107 [GeForce GTX 960M] from 128.252.203.4
01:57:12:WU00:FS00:Connecting to 128.252.203.4:8080
01:57:13:WU00:FS00:Downloading 11.66MiB
...
01:58:23:WU00:FS00:Download complete
01:58:23:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13816 run:0 clone:2408 gen:185 core:0x21 unit:0x000000f480fccb045b3f7910546ffc49
01:58:23:WU00:FS00:Starting
01:58:23:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\David\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 00 -suffix 01 -version 705 -lifeline 9092 -checkpoint 14 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
01:58:23:WU00:FS00:Started FahCore on PID 1696
01:58:23:WU00:FS00:Core PID:2124
01:58:23:WU00:FS00:FahCore 0x21 started
01:58:25:WU00:FS00:0x21:*********************** Log Started 2019-04-14T01:58:24Z ***********************
01:58:25:WU00:FS00:0x21:Project: 13816 (Run 0, Clone 2408, Gen 185)
01:58:25:WU00:FS00:0x21:Unit: 0x000000f480fccb045b3f7910546ffc49
01:58:25:WU00:FS00:0x21:CPU: 0x00000000000000000000000000000000
01:58:25:WU00:FS00:0x21:Machine: 0
01:58:25:WU00:FS00:0x21:Reading tar file core.xml
01:58:25:WU00:FS00:0x21:Reading tar file integrator.xml
01:58:25:WU00:FS00:0x21:Reading tar file state.xml
01:58:25:WU00:FS00:0x21:Reading tar file system.xml
01:58:25:WU00:FS00:0x21:Digital signatures verified
01:58:25:WU00:FS00:0x21:Folding@home GPU Core21 Folding@home Core
01:58:25:WU00:FS00:0x21:Version 0.0.18
01:58:28:WU00:FS00:0x21:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)
01:58:28:WU00:FS00:0x21:Saving result file logfile_01.txt
01:58:28:WU00:FS00:0x21:Saving result file log.txt
01:58:28:WU00:FS00:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
01:58:28:WARNING:WU00:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
01:58:28:WU00:FS00:Sending unit results: id:00 state:SEND error:FAULTY project:13816 run:0 clone:2408 gen:185 core:0x21 unit:0x000000f480fccb045b3f7910546ffc49
01:58:28:WU00:FS00:Uploading 7.50KiB to 128.252.203.4
01:58:28:WU00:FS00:Connecting to 128.252.203.4:8080
01:58:29:WU01:FS00:Connecting to 65.254.110.245:8080
01:58:29:WU00:FS00:Upload complete
01:58:29:WU00:FS00:Server responded WORK_ACK (400)
01:58:29:WU00:FS00:Cleaning up
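
When a log like the one above gets long, it can help to pull out only the work units that were returned with an error. The following is a minimal sketch, not part of the FAH client: the function name `summarize_failures` and both regular expressions are my own, written against the two line formats visible in this log, and it assumes a successful upload is reported as `error:NO_ERROR`.

```python
import re

# Pattern for the client's "Sending unit results" line, as it appears in the log above.
RESULT_RE = re.compile(
    r"Sending unit results: id:(?P<id>\d+) state:(?P<state>\w+) "
    r"error:(?P<error>\w+) project:(?P<project>\d+) run:(?P<run>\d+) "
    r"clone:(?P<clone>\d+) gen:(?P<gen>\d+)"
)
# Pattern for the core's exit status, e.g. "BAD_WORK_UNIT (114 = 0x72)".
RETURN_RE = re.compile(
    r"FahCore returned: (?P<name>\w+) \((?P<dec>\d+) = (?P<hex>0x[0-9a-fA-F]+)\)"
)


def summarize_failures(log_text):
    """Return one dict per work unit that was sent back with an error.

    Hypothetical helper: assumes successful uploads carry error:NO_ERROR.
    """
    failures = []
    core_status = None
    for line in log_text.splitlines():
        returned = RETURN_RE.search(line)
        if returned:
            # Remember the most recent core exit status for the next upload line.
            core_status = returned.group("name")
            continue
        sent = RESULT_RE.search(line)
        if sent and sent.group("error") != "NO_ERROR":
            failures.append({
                "project": int(sent.group("project")),
                "run": int(sent.group("run")),
                "clone": int(sent.group("clone")),
                "gen": int(sent.group("gen")),
                "error": sent.group("error"),
                "core_status": core_status,
            })
            core_status = None
    return failures


# Two lines copied from the log above.
SAMPLE = """\
01:58:28:WARNING:WU00:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
01:58:28:WU00:FS00:Sending unit results: id:00 state:SEND error:FAULTY project:13816 run:0 clone:2408 gen:185 core:0x21 unit:0x000000f480fccb045b3f7910546ffc49
"""

if __name__ == "__main__":
    for failure in summarize_failures(SAMPLE):
        print(failure)
```

Running it over the full log file instead of `SAMPLE` would list every failed project/run/clone/gen in one place, which makes it easier to see whether a single project is affected or all of them.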
Yavanius
 
Posts: 87
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Postby Joe_H » Sun Apr 14, 2019 2:22 am

Were your driver updates from MS, or directly from nVidia? This error message:
Code:
01:58:28:WU00:FS00:0x21:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)

looks to be from missing OpenCL support on your system. This often happens after Windows updates the video drivers; its downloads usually include nothing beyond the video driver itself and lack OpenCL support.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Site Admin
 
Posts: 4385
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: FAULTY project:13816

Postby Yavanius » Sun Apr 14, 2019 5:42 am

The driver came from nVidia through the nVidia GeForce Experience program. I got the first new driver about a week ago, then saw a notification today about a newer driver, and the issue persisted. This is the first time I've had an issue.

With the previous new driver I even tried reinstalling and restarting, since the nVidia driver has sometimes failed to initialize correctly in the past, although, once again, I haven't had any issues in quite a while.
Yavanius
 
Posts: 87
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Postby Yavanius » Sun Apr 14, 2019 5:53 am

Checking the status of Einstein@home nVidia OpenCL client runs, it appears that the results validated successfully. I'd expect them to not run or fail validation if OpenCL support wasn't working.
Yavanius
 
Posts: 87
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Postby foldy » Sun Apr 14, 2019 8:15 am

I guess the Einstein@home GPU client runs using CUDA while FAH runs using OpenCL. Just download the driver from nvidia.com and install it => FAH will work.
foldy
 
Posts: 1362
Joined: Sat Dec 01, 2012 3:43 pm

Re: FAULTY project:13816

Postby toTOW » Sun Apr 14, 2019 1:19 pm

Joe_H wrote:Were your driver updates from MS, or directly from nVidia? This error message:
Code:
01:58:28:WU00:FS00:0x21:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)

looks to be from missing OpenCL support on your system. This often happens after Windows updates the video drivers; its downloads usually include nothing beyond the video driver itself and lack OpenCL support.

No, you're wrong: this error code doesn't indicate that OpenCL is missing; it indicates that the GPU is resetting or is not available.

Can we see the lines showing the client startup?
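
For reference, the number in parentheses is the raw OpenCL status code returned by the call named just before it. Here is a small sketch that decodes such a line; the table is a subset of the standard codes from the Khronos `CL/cl.h` header, while the regular expression and the `decode_cl_error` helper are hypothetical names of mine, written against the one error line quoted in this thread. The symbolic name only says what the driver reported, not why it reported it.

```python
import re

# Subset of the standard OpenCL status codes (values from the Khronos CL/cl.h header).
CL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

# Pattern written against the core's error line as quoted in this thread.
ERROR_RE = re.compile(
    r"ERROR:exception: (?P<msg>.*?): (?P<call>cl\w+) \((?P<code>-?\d+)\)"
)


def decode_cl_error(line):
    """Return (api_call, numeric_code, symbolic_name), or None if the line does not match."""
    match = ERROR_RE.search(line)
    if match is None:
        return None
    code = int(match.group("code"))
    return match.group("call"), code, CL_ERRORS.get(code, "unknown")


if __name__ == "__main__":
    line = ("01:58:28:WU00:FS00:0x21:ERROR:exception: "
            "Error initializing context: clGetDeviceInfo (-5)")
    print(decode_cl_error(line))
```

For the line in this thread, the code -5 maps to `CL_OUT_OF_RESOURCES`, which is distinct from -1 (`CL_DEVICE_NOT_FOUND`) and -2 (`CL_DEVICE_NOT_AVAILABLE`).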
Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

FAH-Addict : latest news, tests and reviews about Folding@Home project.

toTOW
Site Moderator
 
Posts: 8639
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: FAULTY project:13816

Postby Yavanius » Sun Apr 14, 2019 6:20 pm

toTOW wrote:No, you're wrong: this error code doesn't indicate that OpenCL is missing; it indicates that the GPU is resetting or is not available.

Can we see the lines showing the client startup?


I shut down Folding all the way and then restarted it.


Code:
*********************** Log Started 2019-04-14T18:07:31Z ***********************
18:07:31:************************* Folding@home Client *************************
18:07:31:        Website: https://foldingathome.org/
18:07:31:      Copyright: (c) 2009-2018 foldingathome.org
18:07:31:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
18:07:31:           Args: --open-web-control
18:07:31:         Config: C:\Users\~~~\AppData\Roaming\FAHClient\config.xml
18:07:31:******************************** Build ********************************
18:07:31:        Version: 7.5.1
18:07:31:           Date: May 11 2018
18:07:31:           Time: 13:06:32
18:07:31:     Repository: Git
18:07:31:       Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
18:07:31:         Branch: master
18:07:31:       Compiler: Visual C++ 2008
18:07:31:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
18:07:31:       Platform: win32 10
18:07:31:           Bits: 32
18:07:31:           Mode: Release
18:07:31:******************************* System ********************************
18:07:31:            CPU: Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz
18:07:31:         CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
18:07:31:           CPUs: 8
18:07:31:         Memory: 7.89GiB
18:07:31:    Free Memory: 4.02GiB
18:07:31:        Threads: WINDOWS_THREADS
18:07:31:     OS Version: 6.2
18:07:31:    Has Battery: true
18:07:31:     On Battery: false
18:07:31:     UTC Offset: -7
18:07:31:            PID: 2440
18:07:31:            CWD: C:\Users\~~~\AppData\Roaming\FAHClient
18:07:31:             OS: Windows 10 Home
18:07:31:        OS Arch: AMD64
18:07:31:           GPUs: 1
18:07:31:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:4 GM107 [GeForce GTX 960M]
18:07:31:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:5.0 Driver:10.1
18:07:31:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:425.31
18:07:31:OpenCL Device 1: Platform:1 Device:0 Bus:NA Slot:NA Compute:1.2 Driver:20.19
18:07:31:  Win32 Service: false
18:07:31:***********************************************************************
18:07:31:<config>
18:07:31:  <!-- Folding Core -->
18:07:31:  <checkpoint v='14'/>
18:07:31:  <core-priority v='low'/>
18:07:31:
18:07:31:  <!-- Network -->
18:07:31:  <proxy v=':8080'/>
18:07:31:
18:07:31:  <!-- Slot Control -->
18:07:31:  <power v='medium'/>
18:07:31:
18:07:31:  <!-- User Information -->
18:07:31:  <passkey v='********************************'/>
18:07:31:  <team v='11'/>
18:07:31:  <user v='Yavanius'/>
18:07:31:
18:07:31:  <!-- Folding Slots -->
18:07:31:  <slot id='0' type='GPU'/>
18:07:31:</config>
18:07:31:Trying to access database...
18:07:31:Successfully acquired database lock
18:07:31:Enabled folding slot 00: READY gpu:0:GM107 [GeForce GTX 960M]
18:07:31:WU00:FS00:Connecting to 65.254.110.245:8080
18:07:33:WU00:FS00:Assigned to work server 128.252.203.4
18:07:33:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:GM107 [GeForce GTX 960M] from 128.252.203.4
18:07:33:WU00:FS00:Connecting to 128.252.203.4:8080
18:07:34:WU00:FS00:Downloading 11.66MiB
18:07:39:7:127.0.0.1:New Web connection
18:07:40:WU00:FS00:Download 9.11%
18:07:46:WU00:FS00:Download 26.26%
18:07:52:WU00:FS00:Download 39.13%
18:07:58:WU00:FS00:Download 73.43%
18:08:02:WU00:FS00:Download complete
18:08:02:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13816 run:0 clone:1101 gen:164 core:0x21 unit:0x000000d080fccb045b3a6568d62e05b9
18:08:02:WU00:FS00:Starting
18:08:02:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\David\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 00 -suffix 01 -version 705 -lifeline 2440 -checkpoint 14 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
18:08:02:WU00:FS00:Started FahCore on PID 11776
18:08:02:WU00:FS00:Core PID:12612
18:08:02:WU00:FS00:FahCore 0x21 started
18:08:04:WU00:FS00:0x21:*********************** Log Started 2019-04-14T18:08:04Z ***********************
18:08:04:WU00:FS00:0x21:Project: 13816 (Run 0, Clone 1101, Gen 164)
18:08:04:WU00:FS00:0x21:Unit: 0x000000d080fccb045b3a6568d62e05b9
18:08:04:WU00:FS00:0x21:CPU: 0x00000000000000000000000000000000
18:08:04:WU00:FS00:0x21:Machine: 0
18:08:04:WU00:FS00:0x21:Reading tar file core.xml
18:08:04:WU00:FS00:0x21:Reading tar file integrator.xml
18:08:04:WU00:FS00:0x21:Reading tar file state.xml
18:08:04:WU00:FS00:0x21:Reading tar file system.xml
18:08:04:WU00:FS00:0x21:Digital signatures verified
18:08:04:WU00:FS00:0x21:Folding@home GPU Core21 Folding@home Core
18:08:04:WU00:FS00:0x21:Version 0.0.18
18:08:07:WU00:FS00:0x21:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)



OpenCL Device 1, I believe, is the built-in Intel GPU. I'd have thought the CUDA version would be the same as the driver number (as with OpenCL), but I guess not? Also, so far as I know, nothing else is using the nVidia chip (I'm not running BOINC concurrently; in BOINC, I see that the Milkyway@home OpenCL client also runs).

~Yav
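
One way to check which OpenCL device the core was actually pointed at is to compare the `-opencl-platform` / `-opencl-device` arguments on the "Running FahCore" line with the device list the client prints at startup. The sketch below does that on lines copied from the log above; the helper names and both regular expressions are my own and only assume the line formats seen in this log.

```python
import re

# Pattern for the startup "OpenCL Device N" lines shown in the log above.
OPENCL_RE = re.compile(
    r"OpenCL Device (?P<index>\d+): Platform:(?P<platform>\d+) Device:(?P<device>\d+) "
    r"Bus:(?P<bus>\S+) Slot:(?P<slot>\S+) Compute:(?P<compute>\S+) Driver:(?P<driver>\S+)"
)
# Pattern for the platform/device arguments passed to the core.
ARGS_RE = re.compile(r"-opencl-platform (?P<platform>\d+) -opencl-device (?P<device>\d+)")


def opencl_devices(log_text):
    """Return every OpenCL device the client listed at startup, as dicts of strings."""
    return [m.groupdict() for m in OPENCL_RE.finditer(log_text)]


def launched_on(log_text):
    """Return (platform, device) from the first FahCore command line, or None."""
    m = ARGS_RE.search(log_text)
    return (m.group("platform"), m.group("device")) if m else None


def matching_device(log_text):
    """Return the listed device the core was launched against, or None."""
    target = launched_on(log_text)
    for dev in opencl_devices(log_text):
        if (dev["platform"], dev["device"]) == target:
            return dev
    return None


# Lines copied from the log above; the long command line is shortened with "...".
SAMPLE = """\
18:07:31:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:425.31
18:07:31:OpenCL Device 1: Platform:1 Device:0 Bus:NA Slot:NA Compute:1.2 Driver:20.19
18:08:02:WU00:FS00:Running FahCore: ... -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
"""

if __name__ == "__main__":
    print(matching_device(SAMPLE))
```

On this log the core is launched on platform 0, device 0, which is the entry on bus 1 with driver 425.31, not the platform 1 entry with driver 20.19.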
Yavanius
 
Posts: 87
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Postby foldy » Sun Apr 14, 2019 7:01 pm

Can you go to Windows Device Manager and disable the Intel iGPU? Maybe that helps FAH to access the nVidia GTX 960M.

This could be a bug in FAH, as older forum posts about laptop GPUs describe the same issue, unresolved.
foldy
 
Posts: 1362
Joined: Sat Dec 01, 2012 3:43 pm

Re: FAULTY project:13816

Postby Yavanius » Sun Apr 14, 2019 8:48 pm

foldy wrote:Can you go to Windows Device Manager and disable the Intel iGPU? Maybe that helps FAH to access the nVidia GTX 960M.

This could be a bug in FAH, as older forum posts about laptop GPUs describe the same issue, unresolved.


It's kind of funny that the nVidia Control Panel thinks the nVidia chip isn't there. However, both Folding and BOINC see it. I don't see any difference when running. It's still the same error, except the project number is now 11719.

Just out of curiosity, when I got back to the nVidia Control Panel I tried setting the FAHClient to use the nVidia chip instead of the Intel chip. No difference.

I also previously tried dumping the slot and then re-adding the GPU slot. No go either.


I might just end up having to hunt down the older version of the driver if I want to keep Folding.

~Yav
Yavanius
 
Posts: 87
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Postby Yavanius » Mon Apr 15, 2019 3:31 am

I fell back to 419.35. As far as I can see, something else is wrong, since the work units are still dying.
Yavanius
 
Posts: 87
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

