FAULTY project:13816

Post by Yavanius » Sun Apr 14, 2019 2:08 am

I've been getting failures since the last two nVidia driver updates. I'm not sure it's driver-related, because I fired up BOINC and its GPU tasks seem to run fine.

Work units keep failing moments after starting, although I did once see a work unit run.

Here's the log:

Code:
01:57:10:WU01:FS00:Connecting to 140.163.4.231:8080
01:57:10:WU00:FS00:Connecting to 65.254.110.245:8080
01:57:11:WU01:FS00:Upload complete
01:57:11:WU01:FS00:Server responded WORK_ACK (400)
01:57:11:WU01:FS00:Cleaning up
01:57:12:WU00:FS00:Assigned to work server 128.252.203.4
01:57:12:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:GM107 [GeForce GTX 960M] from 128.252.203.4
01:57:12:WU00:FS00:Connecting to 128.252.203.4:8080
01:57:13:WU00:FS00:Downloading 11.66MiB
...
01:58:23:WU00:FS00:Download complete
01:58:23:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13816 run:0 clone:2408 gen:185 core:0x21 unit:0x000000f480fccb045b3f7910546ffc49
01:58:23:WU00:FS00:Starting
01:58:23:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\David\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 00 -suffix 01 -version 705 -lifeline 9092 -checkpoint 14 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
01:58:23:WU00:FS00:Started FahCore on PID 1696
01:58:23:WU00:FS00:Core PID:2124
01:58:23:WU00:FS00:FahCore 0x21 started
01:58:25:WU00:FS00:0x21:*********************** Log Started 2019-04-14T01:58:24Z ***********************
01:58:25:WU00:FS00:0x21:Project: 13816 (Run 0, Clone 2408, Gen 185)
01:58:25:WU00:FS00:0x21:Unit: 0x000000f480fccb045b3f7910546ffc49
01:58:25:WU00:FS00:0x21:CPU: 0x00000000000000000000000000000000
01:58:25:WU00:FS00:0x21:Machine: 0
01:58:25:WU00:FS00:0x21:Reading tar file core.xml
01:58:25:WU00:FS00:0x21:Reading tar file integrator.xml
01:58:25:WU00:FS00:0x21:Reading tar file state.xml
01:58:25:WU00:FS00:0x21:Reading tar file system.xml
01:58:25:WU00:FS00:0x21:Digital signatures verified
01:58:25:WU00:FS00:0x21:Folding@home GPU Core21 Folding@home Core
01:58:25:WU00:FS00:0x21:Version 0.0.18
01:58:28:WU00:FS00:0x21:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)
01:58:28:WU00:FS00:0x21:Saving result file logfile_01.txt
01:58:28:WU00:FS00:0x21:Saving result file log.txt
01:58:28:WU00:FS00:0x21:Folding@home Core Shutdown: BAD_WORK_UNIT
01:58:28:WARNING:WU00:FS00:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
01:58:28:WU00:FS00:Sending unit results: id:00 state:SEND error:FAULTY project:13816 run:0 clone:2408 gen:185 core:0x21 unit:0x000000f480fccb045b3f7910546ffc49
01:58:28:WU00:FS00:Uploading 7.50KiB to 128.252.203.4
01:58:28:WU00:FS00:Connecting to 128.252.203.4:8080
01:58:29:WU01:FS00:Connecting to 65.254.110.245:8080
01:58:29:WU00:FS00:Upload complete
01:58:29:WU00:FS00:Server responded WORK_ACK (400)
01:58:29:WU00:FS00:Cleaning up
Yavanius
 
Posts: 100
Joined: Thu Nov 03, 2016 4:55 am
Location: 92408

Re: FAULTY project:13816

Post by Joe_H » Sun Apr 14, 2019 2:22 am

Were your driver updates from MS, or directly from nVidia? This error message:
Code:
01:58:28:WU00:FS00:0x21:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)

looks to be from missing OpenCL support on your system. This often happens after Windows updates the video drivers: its downloads usually include nothing beyond the video driver itself and lack OpenCL support.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Site Admin
 
Posts: 4443
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: FAULTY project:13816

Post by Yavanius » Sun Apr 14, 2019 5:42 am

The driver came from nVidia through the nVidia GeForce Experience program. I got the first new driver about a week ago, then saw a notification today about a newer one, and the issue persisted with it. This is the first time I've had an issue.

I even tried reinstalling the previous new driver and restarting, as the nVidia driver has sometimes failed to initialize correctly in the past, although, once again, I haven't had any issues in quite a while.

Re: FAULTY project:13816

Post by Yavanius » Sun Apr 14, 2019 5:53 am

I checked the status of my Einstein@home nVidia OpenCL client runs, and the results appear to have validated successfully. I'd expect them not to run, or to fail validation, if OpenCL support weren't working.

Re: FAULTY project:13816

Post by foldy » Sun Apr 14, 2019 8:15 am

I guess the Einstein@home GPU client runs on CUDA while FAH runs on OpenCL. Just download the driver from nvidia.com and install it, and FAH will work.
foldy
 
Posts: 1446
Joined: Sat Dec 01, 2012 3:43 pm

Re: FAULTY project:13816

Post by toTOW » Sun Apr 14, 2019 1:19 pm

Joe_H wrote: Were your driver updates from MS, or directly from nVidia? This error message:
Code:
01:58:28:WU00:FS00:0x21:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)

looks to be from missing OpenCL support on your system. This often happens after Windows updates the video drivers: its downloads usually include nothing beyond the video driver itself and lack OpenCL support.

No, you're wrong: this error code doesn't indicate that OpenCL is missing; it indicates that the GPU is resetting or is not available.

Can we see the lines showing the client startup?
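
For reference, the number in parentheses on that log line is the raw OpenCL status code returned by the failing call. In the Khronos `cl.h` header, -5 is `CL_OUT_OF_RESOURCES`, which is a separate code from `CL_DEVICE_NOT_FOUND` (-1) and `CL_DEVICE_NOT_AVAILABLE` (-2). Below is a minimal Python sketch for decoding it from a log line. The helper name `decode_cl_error` and the regex are illustrative assumptions, not part of FAH, and the table covers only a handful of codes.

```python
import re

# Subset of status codes from the Khronos OpenCL header (cl.h).
CL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

# Matches e.g. "clGetDeviceInfo (-5)" at the end of a FahCore error line.
# (Hypothetical pattern, based only on the log line quoted in this thread.)
_CL_CALL = re.compile(r"(cl[A-Za-z]+) \((-?\d+)\)\s*$")


def decode_cl_error(line):
    """Return (call, code, name) for a log line, or None if no code is found."""
    m = _CL_CALL.search(line)
    if m is None:
        return None
    code = int(m.group(2))
    return m.group(1), code, CL_ERRORS.get(code, "UNKNOWN")


line = (
    "01:58:28:WU00:FS00:0x21:ERROR:exception: "
    "Error initializing context: clGetDeviceInfo (-5)"
)
print(decode_cl_error(line))
```

The code name alone does not settle whether the driver install or the GPU state is at fault; it only narrows down what the OpenCL runtime reported.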
Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

FAH-Addict : latest news, tests and reviews about Folding@Home project.

toTOW
Site Moderator
 
Posts: 8691
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: FAULTY project:13816

Post by Yavanius » Sun Apr 14, 2019 6:20 pm

toTOW wrote: No, you're wrong: this error code doesn't indicate that OpenCL is missing; it indicates that the GPU is resetting or is not available.

Can we see the lines showing the client startup?


I shut down Folding all the way and then restarted it.


Code:
*********************** Log Started 2019-04-14T18:07:31Z ***********************
18:07:31:************************* Folding@home Client *************************
18:07:31:        Website: https://foldingathome.org/
18:07:31:      Copyright: (c) 2009-2018 foldingathome.org
18:07:31:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
18:07:31:           Args: --open-web-control
18:07:31:         Config: C:\Users\~~~\AppData\Roaming\FAHClient\config.xml
18:07:31:******************************** Build ********************************
18:07:31:        Version: 7.5.1
18:07:31:           Date: May 11 2018
18:07:31:           Time: 13:06:32
18:07:31:     Repository: Git
18:07:31:       Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
18:07:31:         Branch: master
18:07:31:       Compiler: Visual C++ 2008
18:07:31:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox /MT
18:07:31:       Platform: win32 10
18:07:31:           Bits: 32
18:07:31:           Mode: Release
18:07:31:******************************* System ********************************
18:07:31:            CPU: Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz
18:07:31:         CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
18:07:31:           CPUs: 8
18:07:31:         Memory: 7.89GiB
18:07:31:    Free Memory: 4.02GiB
18:07:31:        Threads: WINDOWS_THREADS
18:07:31:     OS Version: 6.2
18:07:31:    Has Battery: true
18:07:31:     On Battery: false
18:07:31:     UTC Offset: -7
18:07:31:            PID: 2440
18:07:31:            CWD: C:\Users\~~~\AppData\Roaming\FAHClient
18:07:31:             OS: Windows 10 Home
18:07:31:        OS Arch: AMD64
18:07:31:           GPUs: 1
18:07:31:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:4 GM107 [GeForce GTX 960M]
18:07:31:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:5.0 Driver:10.1
18:07:31:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:425.31
18:07:31:OpenCL Device 1: Platform:1 Device:0 Bus:NA Slot:NA Compute:1.2 Driver:20.19
18:07:31:  Win32 Service: false
18:07:31:***********************************************************************
18:07:31:<config>
18:07:31:  <!-- Folding Core -->
18:07:31:  <checkpoint v='14'/>
18:07:31:  <core-priority v='low'/>
18:07:31:
18:07:31:  <!-- Network -->
18:07:31:  <proxy v=':8080'/>
18:07:31:
18:07:31:  <!-- Slot Control -->
18:07:31:  <power v='medium'/>
18:07:31:
18:07:31:  <!-- User Information -->
18:07:31:  <passkey v='********************************'/>
18:07:31:  <team v='11'/>
18:07:31:  <user v='Yavanius'/>
18:07:31:
18:07:31:  <!-- Folding Slots -->
18:07:31:  <slot id='0' type='GPU'/>
18:07:31:</config>
18:07:31:Trying to access database...
18:07:31:Successfully acquired database lock
18:07:31:Enabled folding slot 00: READY gpu:0:GM107 [GeForce GTX 960M]
18:07:31:WU00:FS00:Connecting to 65.254.110.245:8080
18:07:33:WU00:FS00:Assigned to work server 128.252.203.4
18:07:33:WU00:FS00:Requesting new work unit for slot 00: READY gpu:0:GM107 [GeForce GTX 960M] from 128.252.203.4
18:07:33:WU00:FS00:Connecting to 128.252.203.4:8080
18:07:34:WU00:FS00:Downloading 11.66MiB
18:07:39:7:127.0.0.1:New Web connection
18:07:40:WU00:FS00:Download 9.11%
18:07:46:WU00:FS00:Download 26.26%
18:07:52:WU00:FS00:Download 39.13%
18:07:58:WU00:FS00:Download 73.43%
18:08:02:WU00:FS00:Download complete
18:08:02:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13816 run:0 clone:1101 gen:164 core:0x21 unit:0x000000d080fccb045b3a6568d62e05b9
18:08:02:WU00:FS00:Starting
18:08:02:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\David\AppData\Roaming\FAHClient\cores/cores.foldingathome.org/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 00 -suffix 01 -version 705 -lifeline 2440 -checkpoint 14 -gpu-vendor nvidia -opencl-platform 0 -opencl-device 0 -cuda-device 0 -gpu 0
18:08:02:WU00:FS00:Started FahCore on PID 11776
18:08:02:WU00:FS00:Core PID:12612
18:08:02:WU00:FS00:FahCore 0x21 started
18:08:04:WU00:FS00:0x21:*********************** Log Started 2019-04-14T18:08:04Z ***********************
18:08:04:WU00:FS00:0x21:Project: 13816 (Run 0, Clone 1101, Gen 164)
18:08:04:WU00:FS00:0x21:Unit: 0x000000d080fccb045b3a6568d62e05b9
18:08:04:WU00:FS00:0x21:CPU: 0x00000000000000000000000000000000
18:08:04:WU00:FS00:0x21:Machine: 0
18:08:04:WU00:FS00:0x21:Reading tar file core.xml
18:08:04:WU00:FS00:0x21:Reading tar file integrator.xml
18:08:04:WU00:FS00:0x21:Reading tar file state.xml
18:08:04:WU00:FS00:0x21:Reading tar file system.xml
18:08:04:WU00:FS00:0x21:Digital signatures verified
18:08:04:WU00:FS00:0x21:Folding@home GPU Core21 Folding@home Core
18:08:04:WU00:FS00:0x21:Version 0.0.18
18:08:07:WU00:FS00:0x21:ERROR:exception: Error initializing context: clGetDeviceInfo (-5)



I believe OpenCL Device 1 is the built-in Intel GPU. I'd have thought the CUDA version would be the same as the driver number (as with OpenCL), but I guess not? Also, so far as I know, nothing else is using the nVidia chip. (I'm not running BOINC concurrently, though I see that its Milkyway@home OpenCL client also runs.)
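
To compare the device lines from the startup log side by side, a small parser can help. This is only a sketch: `parse_devices` is a hypothetical helper, and the regex assumes the exact line layout shown in the log above. On these lines the CUDA entry appears to report the CUDA version (10.1), while the OpenCL entry on the same bus reports the display driver version (425.31), which would explain why the two numbers differ.

```python
import re

# Assumed layout: "<API> Device <n>: Key:Value Key:Value ..."
DEVICE_LINE = re.compile(
    r"(?P<api>CUDA|OpenCL) Device (?P<index>\d+): (?P<fields>.*)$"
)


def parse_devices(log_text):
    """Collect the CUDA/OpenCL device lines of a client log into dicts."""
    devices = []
    for line in log_text.splitlines():
        m = DEVICE_LINE.search(line)
        if m is None:
            continue
        entry = {"api": m.group("api"), "index": int(m.group("index"))}
        # Each field looks like "Platform:0"; split on the first colon only.
        for field in m.group("fields").split():
            key, _, value = field.partition(":")
            entry[key] = value
        devices.append(entry)
    return devices


log = """\
18:07:31:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:5.0 Driver:10.1
18:07:31:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:425.31
18:07:31:OpenCL Device 1: Platform:1 Device:0 Bus:NA Slot:NA Compute:1.2 Driver:20.19
"""

for dev in parse_devices(log):
    print(dev["api"], dev["index"], "platform", dev["Platform"],
          "bus", dev["Bus"], "driver", dev["Driver"])
```

The entry with `Bus:NA` sits on a separate OpenCL platform (1), which is consistent with it being the integrated GPU rather than the GTX 960M on bus 1.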

~Yav

Re: FAULTY project:13816

Post by foldy » Sun Apr 14, 2019 7:01 pm

Can you go to Windows Device Manager and disable the Intel iGPU? Maybe that helps FAH to access the NVIDIA GTX 960M.

This could be a bug in FAH, as older forum posts about laptop GPUs describe the same issue without a resolution.

Re: FAULTY project:13816

Post by Yavanius » Sun Apr 14, 2019 8:48 pm

foldy wrote: Can you go to Windows Device Manager and disable the Intel iGPU? Maybe that helps FAH to access the NVIDIA GTX 960M.

This could be a bug in FAH, as older forum posts about laptop GPUs describe the same issue without a resolution.


It's kind of funny that the nVidia control panel thinks the nVidia chip isn't there. However, both Folding and BOINC see it. I don't see any difference when running: still the same error, except the project number is now 11719.

Just out of curiosity, when I got back to the nVidia Control Panel I tried setting the FAHClient to use the nVidia chip instead of the Intel chip. No difference.

I also previously tried dumping the slot and then re-adding the GPU slot. No go either.


I might just end up having to hunt down the older version of the driver if I want to keep Folding.

~Yav

Re: FAULTY project:13816

Post by Yavanius » Mon Apr 15, 2019 3:31 am

I fell back to 419.35. As far as I can see, something else is wrong, as the work units are still dying.

Re: FAULTY project:13816

Post by toTOW » Sat Apr 20, 2019 11:37 am

FAH doesn't like when there are multiple OpenCL platforms available, especially the Intel one.

Also, I see that's a laptop: does it have Optimus technology, or something similar, that would disable/enable the NV GPU on battery or under light loads? If so, I think you should disable it.
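
Since two OpenCL platforms show up in the startup log, it may be worth confirming which one the client actually hands to the core. The "Running FahCore" line in the log carries `-opencl-platform` and `-opencl-device` arguments. Here is a rough Python sketch for pulling them out; the helper name `core_gpu_args` is made up for illustration, and only the flags visible in the log earlier in this thread are handled.

```python
import re

# Numeric GPU-selection flags seen on the "Running FahCore" log line.
# "-gpu-vendor nvidia" is skipped on purpose: its value is not a number.
FLAG = re.compile(r"-(opencl-platform|opencl-device|cuda-device|gpu) (\d+)")


def core_gpu_args(command_line):
    """Map each GPU-selection flag on the command line to its integer value."""
    return {name: int(value) for name, value in FLAG.findall(command_line)}


# Tail of the command line from the log (path prefix omitted here).
cmd = (
    r"FahCore_21.exe -dir 00 -suffix 01 -version 705 -lifeline 2440 "
    r"-checkpoint 14 -gpu-vendor nvidia -opencl-platform 0 "
    r"-opencl-device 0 -cuda-device 0 -gpu 0"
)
args = core_gpu_args(cmd)
print(args)
```

In the posted log the core is started with platform 0, device 0, which matches the OpenCL entry on bus 1 (the GTX 960M), so the client does seem to be pointing the core at the NVIDIA platform.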

Re: FAULTY project:13816

Post by Yavanius » Sat Apr 20, 2019 6:07 pm

I don't see anything about Optimus in any of the nVidia settings. Actually, I've noticed that my computers won't sleep while Folding is running. Can't recall offhand if BOINC was doing this too. Last I looked in my power settings there was nothing set to idle the GPUs. The only other thing I see other than the nVidia driver updating is there was a recent Microsoft update in early April. I'm gonna try uninstalling it and see if that is messing with things...

I'd think that falling back to an earlier driver version would fix the issue if it was a driver issue. I even did a clean install just in case there was something that didn't transfer correctly to the new driver.

In the meantime, I'm running a little BOINC to keep my GPU busy and to see which projects also support the Intel GPU. ;)

Re: FAULTY project:13816

Post by foldy » Sat Apr 20, 2019 6:26 pm

What type and manufacturer is your laptop?

Re: FAULTY project:13816

Post by Yavanius » Sun Apr 21, 2019 3:44 am

I uninstalled the Security Update. And then, while I was out, it reinstalled itself...

KB4493464 (OS Build 17134.706)


HOWEVER, I noticed the GPUGrid WU on BOINC was suddenly saying 1 day plus to finish when it should have been 7-8 hours estimated to complete. I looked over in the System Tray and... Folding active. Pulled up the Advanced Control and there's a WU running.

So, it's all rather odd. Later this week I'll update the nVidia driver and see what happens.


foldy... It's an ASUS ROG.

Re: FAULTY project:13816

Post by bruce » Sun Apr 21, 2019 6:56 am

Yavanius wrote: The only other thing I see other than the nVidia driver updating is there was a recent Microsoft update in early April. I'm gonna try uninstalling it and see if that is messing with things...

There's nothing wrong with the Microsoft update EXCEPT that it installs the drivers from Microsoft, and when it does, it removes the OpenCL that you get from NV. They seem to FORCE IntelOpenCL32 instead. Once it has done that to your system, you MUST download the same driver or better from NVidia to get OpenCL back ... or you can download the OpenCL developer package. Either one should install the OpenCL runtime DLLs.
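
A quick way to see whether any OpenCL loader library is present at all is Python's standard `ctypes.util.find_library`. This is a minimal sketch with a hypothetical wrapper name, `find_opencl_loader`. It only shows that a loader can be located on the library search path; it does not prove that the NVIDIA OpenCL runtime is registered or working.

```python
import ctypes.util


def find_opencl_loader():
    """Return the name or path of the OpenCL loader library, or None."""
    # find_library searches the platform's usual locations,
    # e.g. OpenCL.dll on Windows or libOpenCL on Linux.
    return ctypes.util.find_library("OpenCL")


loader = find_opencl_loader()
print(loader if loader else "No OpenCL loader library found on the search path")
```

If nothing is found, reinstalling the full driver package from NVIDIA, as described above, would be the first thing to try.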
bruce
 
Posts: 22340
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.
