
Event Log Errors, but Client is Fine?

Posted: Mon Feb 13, 2017 1:50 pm
by Kougar
Noticed this several times now. I'll see a fault in the event logs for F@H's GPU client, but the F@H log is clean.
Faulting application name: FahCore_21.exe, version: 0.0.0.0, time stamp: 0x588257cc
Faulting module name: igdrcl64.dll, version: 20.19.15.4531, time stamp: 0x57ed260c
Exception code: 0xc0000005
Fault offset: 0x000000000007a74a
Faulting process id: 0xa58
Faulting application start time: 0x01d285f4eb75af35
Faulting application path: C:\Users\gamer\AppData\Roaming\FAHClient\cores\fahwebx.stanford.edu\cores\Win32\AMD64\NVIDIA\Fermi\Core_21.fah\FahCore_21.exe
Faulting module path: C:\WINDOWS\SYSTEM32\igdrcl64.dll
Report Id: 095db762-f64b-40ee-85a9-a51aec42ec1b
Faulting package full name:
Faulting package-relative application ID:
Faulting application name: FahCore_21.exe, version: 0.0.0.0, time stamp: 0x588257cc
Faulting module name: igdrcl64.dll, version: 20.19.15.4531, time stamp: 0x57ed260c
Exception code: 0xc0000005
Fault offset: 0x000000000007a74a
Faulting process id: 0x32dc
Faulting application start time: 0x01d285bcc022f1e8
Faulting application path: C:\Users\gamer\AppData\Roaming\FAHClient\cores\fahwebx.stanford.edu\cores\Win32\AMD64\NVIDIA\Fermi\Core_21.fah\FahCore_21.exe
Faulting module path: C:\WINDOWS\SYSTEM32\igdrcl64.dll
Report Id: f53ccbef-7d36-49d0-bc4f-5e83f252cffe
Faulting package full name:
Faulting package-relative application ID:

11:40:38:WU02:FS01:0x21:Completed 1080000 out of 2000000 steps (54%)
11:49:48:WU02:FS01:0x21:Completed 1100000 out of 2000000 steps (55%)
11:58:59:WU02:FS01:0x21:Completed 1120000 out of 2000000 steps (56%)
12:08:19:WU02:FS01:0x21:Completed 1140000 out of 2000000 steps (57%)
12:17:30:WU02:FS01:0x21:Completed 1160000 out of 2000000 steps (58%)
12:24:40:FS01:Paused
12:24:40:FS01:Shutting core down
12:24:40:WU02:FS01:0x21:WARNING:Console control signal 1 on PID 5616
12:24:40:WU02:FS01:0x21:Exiting, please wait. . .
12:24:42:WU02:FS01:0x21:Folding@home Core Shutdown: INTERRUPTED
12:24:42:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
******************************* Date: 2017-02-12 *******************************
15:10:14:FS01:Unpaused
15:10:14:WU02:FS01:Starting
15:10:14:WU02:FS01:Core PID:1548
15:10:14:WU02:FS01:FahCore 0x21 started
15:10:15:WU02:FS01:0x21:*********************** Log Started 2017-02-12T15:10:15Z ***********************
15:10:15:WU02:FS01:0x21:Project: 10496 (Run 85, Clone 16, Gen 43)
15:10:15:WU02:FS01:0x21:Unit: 0x0000003a8ca304f556bbac2c683b9adc
15:10:15:WU02:FS01:0x21:CPU: 0x00000000000000000000000000000000
15:10:15:WU02:FS01:0x21:Machine: 1
15:10:15:WU02:FS01:0x21:Digital signatures verified
15:10:15:WU02:FS01:0x21:Folding@home GPU Core21 Folding@home Core
15:10:15:WU02:FS01:0x21:Version 0.0.18
15:10:15:WU02:FS01:0x21:  Found a checkpoint file
15:10:43:WU02:FS01:0x21:Completed 1125000 out of 2000000 steps (56%)
15:10:43:WU02:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
15:17:40:WU02:FS01:0x21:Completed 1140000 out of 2000000 steps (57%)
15:26:53:WU02:FS01:0x21:Completed 1160000 out of 2000000 steps (58%)
15:36:07:WU02:FS01:0x21:Completed 1180000 out of 2000000 steps (59%)
15:42:50:FS01:Paused
15:42:50:FS01:Shutting core down
15:42:50:WU02:FS01:0x21:WARNING:Console control signal 1 on PID 1548
15:42:50:WU02:FS01:0x21:Exiting, please wait. . .
15:42:51:WU02:FS01:FahCore returned: INTERRUPTED (102 = 0x66)
******************************* Date: 2017-02-13 *******************************
05:48:07:FS01:Unpaused
05:48:07:WU02:FS01:Starting
05:48:07:WU02:FS01:Core PID:13020
05:48:07:WU02:FS01:FahCore 0x21 started
05:48:07:WU02:FS01:0x21:*********************** Log Started 2017-02-13T05:48:07Z ***********************
05:48:07:WU02:FS01:0x21:Project: 10496 (Run 85, Clone 16, Gen 43)
05:48:07:WU02:FS01:0x21:Unit: 0x0000003a8ca304f556bbac2c683b9adc
05:48:07:WU02:FS01:0x21:CPU: 0x00000000000000000000000000000000
05:48:07:WU02:FS01:0x21:Machine: 1
05:48:07:WU02:FS01:0x21:Digital signatures verified
05:48:07:WU02:FS01:0x21:Folding@home GPU Core21 Folding@home Core
05:48:07:WU02:FS01:0x21:Version 0.0.18
05:48:07:WU02:FS01:0x21:  Found a checkpoint file
05:48:36:WU02:FS01:0x21:Completed 1125000 out of 2000000 steps (56%)
05:48:36:WU02:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
05:55:33:WU02:FS01:0x21:Completed 1140000 out of 2000000 steps (57%)
06:04:45:WU02:FS01:0x21:Completed 1160000 out of 2000000 steps (58%)
06:13:54:WU02:FS01:0x21:Completed 1180000 out of 2000000 steps (59%)
06:23:04:WU02:FS01:0x21:Completed 1200000 out of 2000000 steps (60%)
06:32:12:WU02:FS01:0x21:Completed 1220000 out of 2000000 steps (61%)
06:41:21:WU02:FS01:0x21:Completed 1240000 out of 2000000 steps (62%)
06:50:38:WU02:FS01:0x21:Completed 1260000 out of 2000000 steps (63%)
06:59:47:WU02:FS01:0x21:Completed 1280000 out of 2000000 steps (64%)
07:08:54:WU02:FS01:0x21:Completed 1300000 out of 2000000 steps (65%)
07:18:04:WU02:FS01:0x21:Completed 1320000 out of 2000000 steps (66%)
07:27:12:WU02:FS01:0x21:Completed 1340000 out of 2000000 steps (67%)
07:36:20:WU02:FS01:0x21:Completed 1360000 out of 2000000 steps (68%)
07:45:36:WU02:FS01:0x21:Completed 1380000 out of 2000000 steps (69%)
07:54:46:WU02:FS01:0x21:Completed 1400000 out of 2000000 steps (70%)
08:03:54:WU02:FS01:0x21:Completed 1420000 out of 2000000 steps (71%)
08:13:04:WU02:FS01:0x21:Completed 1440000 out of 2000000 steps (72%)
08:22:13:WU02:FS01:0x21:Completed 1460000 out of 2000000 steps (73%)
08:31:23:WU02:FS01:0x21:Completed 1480000 out of 2000000 steps (74%)
08:40:33:WU02:FS01:0x21:Completed 1500000 out of 2000000 steps (75%)
08:49:49:WU02:FS01:0x21:Completed 1520000 out of 2000000 steps (76%)
08:58:58:WU02:FS01:0x21:Completed 1540000 out of 2000000 steps (77%)
09:08:07:WU02:FS01:0x21:Completed 1560000 out of 2000000 steps (78%)
09:17:17:WU02:FS01:0x21:Completed 1580000 out of 2000000 steps (79%)
09:26:27:WU02:FS01:0x21:Completed 1600000 out of 2000000 steps (80%)
09:35:36:WU02:FS01:0x21:Completed 1620000 out of 2000000 steps (81%)
09:44:53:WU02:FS01:0x21:Completed 1640000 out of 2000000 steps (82%)
09:54:02:WU02:FS01:0x21:Completed 1660000 out of 2000000 steps (83%)
10:03:11:WU02:FS01:0x21:Completed 1680000 out of 2000000 steps (84%)
10:12:20:WU02:FS01:0x21:Completed 1700000 out of 2000000 steps (85%)
10:21:29:WU02:FS01:0x21:Completed 1720000 out of 2000000 steps (86%)
10:30:39:WU02:FS01:0x21:Completed 1740000 out of 2000000 steps (87%)
10:39:57:WU02:FS01:0x21:Completed 1760000 out of 2000000 steps (88%)
10:49:05:WU02:FS01:0x21:Completed 1780000 out of 2000000 steps (89%)
10:58:15:WU02:FS01:0x21:Completed 1800000 out of 2000000 steps (90%)
11:07:23:WU02:FS01:0x21:Completed 1820000 out of 2000000 steps (91%)
11:16:33:WU02:FS01:0x21:Completed 1840000 out of 2000000 steps (92%)
11:25:42:WU02:FS01:0x21:Completed 1860000 out of 2000000 steps (93%)
******************************* Date: 2017-02-13 *******************************
11:34:59:WU02:FS01:0x21:Completed 1880000 out of 2000000 steps (94%)
11:44:06:WU02:FS01:0x21:Completed 1900000 out of 2000000 steps (95%)
11:53:15:WU02:FS01:0x21:Completed 1920000 out of 2000000 steps (96%)
12:02:25:WU02:FS01:0x21:Completed 1940000 out of 2000000 steps (97%)
12:11:34:WU02:FS01:0x21:Completed 1960000 out of 2000000 steps (98%)
12:20:44:WU02:FS01:0x21:Completed 1980000 out of 2000000 steps (99%)
12:20:45:WU01:FS01:Assigned to work server 171.64.65.84
12:20:45:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GM107 [GeForce GTX 750 Ti] from 171.64.65.84
12:20:46:WU01:FS01:Downloading 2.71MiB
12:20:49:WU01:FS01:Download complete
12:20:49:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:9191 run:2 clone:18 gen:208 core:0x21 unit:0x00000131ab40415457cb2cde51366888
12:29:55:WU02:FS01:0x21:Completed 2000000 out of 2000000 steps (100%)
12:30:03:WU02:FS01:0x21:Saving result file logfile_01.txt
12:30:03:WU02:FS01:0x21:Saving result file checkpointState.xml
12:30:08:WU02:FS01:0x21:Saving result file checkpt.crc
12:30:08:WU02:FS01:0x21:Saving result file log.txt
12:30:08:WU02:FS01:0x21:Saving result file positions.xtc
12:30:10:WU02:FS01:0x21:Folding@home Core Shutdown: FINISHED_UNIT
12:30:11:WU02:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
12:30:11:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:10496 run:85 clone:16 gen:43 core:0x21 unit:0x0000003a8ca304f556bbac2c683b9adc
12:30:11:WU02:FS01:Uploading 21.88MiB to 140.163.4.245
12:30:11:WU01:FS01:Starting
12:30:11:WU01:FS01:Core PID:2648
12:30:11:WU01:FS01:FahCore 0x21 started
12:30:12:WU01:FS01:0x21:*********************** Log Started 2017-02-13T12:30:11Z ***********************
12:30:12:WU01:FS01:0x21:Project: 9191 (Run 2, Clone 18, Gen 208)
12:30:12:WU01:FS01:0x21:Unit: 0x00000131ab40415457cb2cde51366888
12:30:12:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
12:30:12:WU01:FS01:0x21:Machine: 1
12:30:12:WU01:FS01:0x21:Reading tar file core.xml
12:30:12:WU01:FS01:0x21:Reading tar file system.xml
12:30:12:WU01:FS01:0x21:Reading tar file integrator.xml
12:30:12:WU01:FS01:0x21:Reading tar file state.xml
12:30:12:WU01:FS01:0x21:Digital signatures verified
12:30:12:WU01:FS01:0x21:Folding@home GPU Core21 Folding@home Core
12:30:12:WU01:FS01:0x21:Version 0.0.18
12:30:17:WU02:FS01:Upload 31.14%
12:30:20:WU01:FS01:0x21:Completed 0 out of 2500000 steps (0%)
12:30:20:WU01:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
12:30:23:WU02:FS01:Upload 52.85%
12:30:29:WU02:FS01:Upload 85.41%
12:30:52:WU02:FS01:Upload complete
12:30:52:WU02:FS01:Server responded WORK_ACK (400)
12:30:52:WU02:FS01:Final credit estimate, 28336.00 points
12:30:52:WU02:FS01:Cleaning up
12:33:46:WU01:FS01:0x21:Completed 25000 out of 2500000 steps (1%)
12:37:12:WU01:FS01:0x21:Completed 50000 out of 2500000 steps (2%)
12:40:38:WU01:FS01:0x21:Completed 75000 out of 2500000 steps (3%)
12:44:04:WU01:FS01:0x21:Completed 100000 out of 2500000 steps (4%)
12:47:31:WU01:FS01:0x21:Completed 125000 out of 2500000 steps (5%)
12:50:57:WU01:FS01:0x21:Completed 150000 out of 2500000 steps (6%)
12:54:23:WU01:FS01:0x21:Completed 175000 out of 2500000 steps (7%)
12:57:49:WU01:FS01:0x21:Completed 200000 out of 2500000 steps (8%)
13:01:16:WU01:FS01:0x21:Completed 225000 out of 2500000 steps (9%)
13:04:42:WU01:FS01:0x21:Completed 250000 out of 2500000 steps (10%)
13:08:07:WU01:FS01:0x21:Completed 275000 out of 2500000 steps (11%)
13:11:33:WU01:FS01:0x21:Completed 300000 out of 2500000 steps (12%)
13:15:00:WU01:FS01:0x21:Completed 325000 out of 2500000 steps (13%)
13:18:26:WU01:FS01:0x21:Completed 350000 out of 2500000 steps (14%)
13:21:52:WU01:FS01:0x21:Completed 375000 out of 2500000 steps (15%)
13:25:17:WU01:FS01:0x21:Completed 400000 out of 2500000 steps (16%)
13:28:45:WU01:FS01:0x21:Completed 425000 out of 2500000 steps (17%)
13:32:10:WU01:FS01:0x21:Completed 450000 out of 2500000 steps (18%)
13:35:36:WU01:FS01:0x21:Completed 475000 out of 2500000 steps (19%)
13:39:17:WU01:FS01:0x21:Completed 500000 out of 2500000 steps (20%)
13:42:49:WU01:FS01:0x21:Completed 525000 out of 2500000 steps (21%)
13:46:21:WU01:FS01:0x21:Completed 550000 out of 2500000 steps (22%)
Running NVIDIA 378.57 with a 750 Ti on Win 10
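
The Event Log entries and the F@H log above can be matched up by decoding two hex fields. This is a rough sketch, not a tool from the F@H project: it assumes "Faulting process id" is a plain hex PID and that "Faulting application start time" is a Windows FILETIME (100 ns ticks since 1601-01-01 UTC). The helper names `filetime_to_utc` and `core_pids` are made up for illustration. The two PIDs decode to 2648 and 13020, which are the "Core PID" values logged for WU01 and WU02 above.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: the start time field is a FILETIME (100 ns ticks since 1601-01-01 UTC).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_utc(hex_value):
    """Convert a hex FILETIME string such as '0x01d2...' to a UTC datetime."""
    ticks = int(hex_value, 16)
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)

def core_pids(fah_log):
    """Map each 'Core PID:<n>' found in F@H log text to its log timestamp."""
    pattern = r"^(\d\d:\d\d:\d\d):WU\d+:FS\d+:Core PID:(\d+)"
    return {int(m.group(2)): m.group(1)
            for m in re.finditer(pattern, fah_log, re.M)}

# (process id, application start time) copied from the two Event Log entries.
events = [("0xa58", "0x01d285f4eb75af35"),
          ("0x32dc", "0x01d285bcc022f1e8")]

# The matching "Core PID" lines copied from the F@H log.
fah_log = """\
05:48:07:WU02:FS01:Core PID:13020
12:30:11:WU01:FS01:Core PID:2648
"""

started = core_pids(fah_log)
for pid_hex, start_hex in events:
    pid = int(pid_hex, 16)
    print(pid, filetime_to_utc(start_hex).isoformat(),
          "F@H log start:", started.get(pid))
```

If the decoded start times also land next to the "Core PID" timestamps, that suggests each event belongs to a FahCore process the client itself started, even though the F@H log shows no error for it.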

Re: Event Log Errors, but Client is Fine?

Posted: Mon Feb 13, 2017 5:24 pm
by bruce
The Windows error 0xc0000005 is a memory access violation -- and you should start by looking at the usual culprits: unreliable memory hardware, overclocking, or overheating.

Re: Event Log Errors, but Client is Fine?

Posted: Mon Feb 13, 2017 7:02 pm
by Kougar
Stock everything, temps are fine. Basic DDR3-1600 kit. Will run some memory stress tests & memtest86, thanks! I presume that was RAM and not VRAM.

Re: Event Log Errors, but Client is Fine?

Posted: Mon Feb 13, 2017 8:21 pm
by bruce
RAM?
Yes... where Windows, itself, runs.

VRAM errors would be detected by the GPU driver ... or maybe not detected at all until FAHCore did a periodic sanity check.

Re: Event Log Errors, but Client is Fine?

Posted: Mon Feb 13, 2017 10:25 pm
by Kougar
So far Prime95 and Memtest are clean, however I found three other instances from last month that involved different program exe's but same event ID + exception code. Oddly there's a dozen for Fahcore21 starting back in Jan, it must've really found the sweet spot.

Going to manually tune the RAM timings since I guess there's a 0.0001% instability in there somewhere... thanks for the help!

Re: Event Log Errors, but Client is Fine?

Posted: Fri Feb 17, 2017 8:20 pm
by Kougar
I wanted to be sure before I posted again, as this appears to be a problem with the latest Fahcore 21 build and is not related to GPU driver or GPU vendor. I am seeing this error on two different stock computers. One has a 750 Ti (378.66). The second has an AMD R9 380 (17.2.1) that I just installed, and I immediately started seeing identical event logs with it.

Ran memtest and Prime95 for a few hours on my main system and didn't find anything. Just to be sure, I underclocked the RAM, and that has not decreased the frequency of the Fahcore21 events; I'm seeing multiple a day on both rigs. It doesn't appear to affect the Work Units or performance, dunno if it is just a cosmetic issue.

I was able to review the event logs for a third stock desktop using a GTX 480 and it had one instance of the same error back in November.

Re: Event Log Errors, but Client is Fine?

Posted: Sat Feb 18, 2017 6:53 pm
by bruce
My 750 Ti's aren't showing any such problems, but then they couldn't because they're on Linux. I'll move one to a Windows machine and see if I start getting those errors.

BTW, about 8 lines into the log of a new FAHCore_21 WU, there's a Version Number. Are you seeing this problem on a particular version of Core_21?

Re: Event Log Errors, but Client is Fine?

Posted: Sat Feb 18, 2017 7:09 pm
by foldy
The error in the Windows event log happens in igdrcl64.dll, which is the OpenCL runtime driver from Intel.
So maybe your FahCore tries to access the Intel OpenCL driver, which fails.
But as folding continues without a problem, you can probably ignore this Windows event log entry.

Re: Event Log Errors, but Client is Fine?

Posted: Sun Feb 19, 2017 6:13 am
by Kougar
Bruce, looks like both GPUs are using version 18 for core 21. When I installed the Radeon it downloaded a separate copy.

That is bizarre, it seems odd F@H would try to access Intel's OpenCL driver. I can confirm that I have the latest Intel HD 4600 (Haswell IGP) drivers installed on both of these affected rigs. I will uninstall them now that I'm using the R9 380 and report back if that changes anything.

Below is the event log error on the Radeon, just to have it posted. Same igdrcl64.dll as with the NVIDIA cards.
Faulting application name: FahCore_21.exe, version: 0.0.0.0, time stamp: 0x58825714
Faulting module name: igdrcl64.dll, version: 20.19.15.4531, time stamp: 0x57ed260c
Exception code: 0xc0000005
Fault offset: 0x000000000007a74a
Faulting process id: 0x35c8
Faulting application start time: 0x01d28a36773117fa
Faulting application path: C:\Users\gamer\AppData\Roaming\FAHClient\cores\fahwebx.stanford.edu\cores\Win32\AMD64\ATI\R600\Core_21.fah\FahCore_21.exe
Faulting module path: C:\WINDOWS\SYSTEM32\igdrcl64.dll
Report Id: e5d0f86d-e90a-43ad-9067-a341f1ffe795
Faulting package full name:
Faulting package-relative application ID:

Re: Event Log Errors, but Client is Fine?

Posted: Sun Feb 19, 2017 7:11 am
by bruce
When you install the drivers for Intel, for AMD and/or for NVIDIA, you should get a copy of an OpenCL run-time library that works with their low-level drivers. That probably means that your PATH needs to be different, depending on which GPU is to be accessed. Since FAH doesn't support the Intel IGP, I've only seen installations with AMD and/or NVIDIA.

What happens if that .dll is inaccessible to FAH?

Re: Event Log Errors, but Client is Fine?

Posted: Sun Feb 19, 2017 9:52 am
by Kougar
Uninstalled the Intel graphics driver, restarted, resumed F@H and it instantly created a new igdrcl64.dll event.

Went ahead and deleted the dll file, restarted again and seems fine so far. Will check on it later today.

Re: Event Log Errors, but Client is Fine?

Posted: Sun Feb 19, 2017 5:29 pm
by bruce
What I'd read into that is (1) Your CPU still wants igdrcl64 if it's going to be able to support some non-FAH OpenCL application ... and (2) FAH runs fine now that it's not finding the Intel code.

Presumably that also implies that you have successfully installed OpenCL with the other drivers distributed for your GPU but for some reason, the Intel drivers were superseding them.

This may not be 100% true, but at least it fits the symptoms.

Re: Event Log Errors, but Client is Fine?

Posted: Mon Feb 20, 2017 6:14 am
by Kougar
Only been a day, but deleting the igdrcl64.dll seems to have stopped the eventlog errors.

The AMD and NVIDIA drivers were the most recently installed drivers on both affected systems. Nothing is connected to the IGP ports so the IGP shouldn't even be active and F@H shouldn't be trying to use it. Both F@H installations are preexisting. So there's no reason F@H should be attempting to access Intel's OpenCL driver, especially after the Intel drivers were uninstalled. When the Intel driver fails F@H reverts to the appropriate GPU driver, and apparently it had been doing this since mid-January for both systems.

What I don't understand is why my systems would be affected and not others, I would presume this should be a widespread issue given it affects both GPU vendors on Windows 10.

Re: Event Log Errors, but Client is Fine?

Posted: Mon Feb 20, 2017 8:23 am
by foldy
Maybe others also have this entry in Windows event log but because it does not disturb folding nobody will even notice it.

Re: Event Log Errors, but Client is Fine?

Posted: Mon Feb 20, 2017 6:08 pm
by bruce
Kougar wrote:Nothing is connected to the IGP ports so the IGP shouldn't even be active and F@H shouldn't be trying to use it. Both F@H installations are preexisting. So there's no reason F@H should be attempting to access Intel's OpenCL driver, especially after the Intel drivers were uninstalled. When the Intel driver fails F@H reverts to the appropriate GPU driver, and apparently it had been doing this since mid-January for both systems.
This has everything to do with how (and why) Intel's OpenCL driver was installed. If that installation causes the driver to be invoked, logs the error, and then somehow proceeds to process the calls by the FAHCore, there isn't anything Stanford can do. It's probably something as simple as the order of entries in the PATH environment variable -- and likely influenced by the order of the driver installations.
Kougar wrote:What I don't understand is why my systems would be affected and not others, I would presume this should be a widespread issue given it affects both GPU vendors on Windows 10.
True, but FAH does not manage how the IGP drivers are installed or what happens to the errors.
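
For anyone who wants to check how a vendor's OpenCL driver got registered without deleting DLLs, here is a rough sketch. It rests on an assumption that nobody in this thread confirmed: that these driver versions register with the Khronos OpenCL ICD loader through the `HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\OpenCL\Vendors` registry key, where each value name is a driver DLL and DWORD data of 0 means enabled, rather than being picked up through PATH order. The helper names `enabled_icds` and `read_vendor_values` are made up for illustration.

```python
import sys

# Assumption: legacy Khronos ICD loader registry location on Windows.
VENDORS_KEY = r"SOFTWARE\Khronos\OpenCL\Vendors"

def enabled_icds(values):
    """Return the DLL names whose DWORD data is 0 (enabled per the ICD convention)."""
    return sorted(name for name, data in values if data == 0)

def read_vendor_values():
    """Read (name, data) pairs from the Vendors key; empty list if unavailable."""
    if sys.platform != "win32":
        return []
    import winreg  # Windows-only standard library module
    try:
        key = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, VENDORS_KEY)
    except OSError:
        return []  # key missing, nothing registered this way
    values = []
    with key:
        index = 0
        while True:
            try:
                name, data, _type = winreg.EnumValue(key, index)
            except OSError:
                break  # no more values under the key
            values.append((name, data))
            index += 1
    return values

if __name__ == "__main__":
    for dll in enabled_icds(read_vendor_values()):
        print(dll)
```

If igdrcl64.dll shows up in that list next to the AMD or NVIDIA entry, that would fit the symptoms described here: the loader offers every registered driver to the application, so the Intel DLL can be loaded even when FAH only folds on the discrete GPU.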