FahCore21.exe Crashing After Upload

If you think it might be a driver problem, see viewforum.php?f=79

Moderators: Site Moderators, PandeGroup


Postby ryoungblood » Sat May 06, 2017 6:51 pm

I appear to be having a recurring issue with Core 21 crashing immediately after upload. I am currently using 378.92 drivers.

FAHClient Log:

17:37:57:WU02:FS01:0x18:Completed 5000000 out of 5000000 steps (100%)
17:37:58:WU01:FS01:Connecting to 171.67.108.45:80
17:37:58:WU01:FS01:Assigned to work server 140.163.4.245
17:37:58:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:1:GP102 [GeForce GTX 1080 Ti] from 140.163.4.245
17:37:58:WU01:FS01:Connecting to 140.163.4.245:8080
17:37:59:WU01:FS01:Downloading 14.49MiB
17:38:01:WU02:FS01:0x18:Saving result file logfile_01.txt
17:38:01:WU02:FS01:0x18:Saving result file checkpointState.xml
17:38:02:WU01:FS01:Download complete
17:38:02:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:10496 run:15 clone:20 gen:39 core:0x21 unit:0x000000338ca304f558895a6e9c30172b
17:38:02:WU02:FS01:0x18:Saving result file checkpt.crc
17:38:02:WU02:FS01:0x18:Saving result file log.txt
17:38:02:WU02:FS01:0x18:Saving result file positions.xtc
17:38:03:WU02:FS01:0x18:Folding@home Core Shutdown: FINISHED_UNIT
17:38:03:WU02:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
17:38:03:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:10490 run:382 clone:0 gen:736 core:0x18 unit:0x000003698ca304f45537e916316ca4ff
17:38:03:WU02:FS01:Uploading 6.64MiB to 140.163.4.244
17:38:03:WU01:FS01:Starting
17:38:03:WU02:FS01:Connecting to 140.163.4.244:8080
17:38:03:WU01:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "C:/Users/Command Center/AppData/Roaming/FAHClient/cores/fahwebx.stanford.edu/cores/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe" -dir 01 -suffix 01 -version 704 -lifeline 6988 -checkpoint 30 -gpu 0 -gpu-vendor nvidia
17:38:03:WU01:FS01:Started FahCore on PID 7184
17:38:03:WU01:FS01:Core PID:5944
17:38:03:WU01:FS01:FahCore 0x21 started
17:38:05:WU01:FS01:0x21:*********************** Log Started 2017-05-06T17:38:05Z ***********************
17:38:05:WU01:FS01:0x21:Project: 10496 (Run 15, Clone 20, Gen 39)
17:38:05:WU01:FS01:0x21:Unit: 0x000000338ca304f558895a6e9c30172b
17:38:05:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
17:38:05:WU01:FS01:0x21:Machine: 1
17:38:05:WU01:FS01:0x21:Reading tar file core.xml
17:38:05:WU01:FS01:0x21:Reading tar file system.xml
17:38:06:WU01:FS01:0x21:Reading tar file integrator.xml
17:38:06:WU01:FS01:0x21:Reading tar file state.xml
17:38:08:WU01:FS01:0x21:Digital signatures verified
17:38:08:WU01:FS01:0x21:Folding@home GPU Core21 Folding@home Core
17:38:08:WU01:FS01:0x21:Version 0.0.18
17:38:09:WU02:FS01:Upload 93.12%
17:38:13:WU02:FS01:Upload complete
17:38:13:WU02:FS01:Server responded WORK_ACK (400)
17:38:13:WU02:FS01:Final credit estimate, 93399.00 points
17:38:13:WU02:FS01:Cleaning up
18:28:38:WARNING:WU01:FS01:FahCore returned: FAILED_1 (0 = 0x0)


Windows Event Log

Faulting application name: FahCore_21.exe, version: 0.0.0.0, time stamp: 0x588257cc
Faulting module name: ntdll.dll, version: 10.0.14393.479, time stamp: 0x5825887f
Exception code: 0xc0000374
Fault offset: 0x00000000000f8283
Faulting process id: 0x3234
Faulting application start time: 0x01d2b97bd895c739
Faulting application path: C:\Users\Command Center\AppData\Roaming\FAHClient\cores\fahwebx.stanford.edu\cores\Win32\AMD64\NVIDIA\Fermi\Core_21.fah\FahCore_21.exe
Faulting module path: C:\Windows\SYSTEM32\ntdll.dll
Report Id: 44e303f8-6a28-41f9-87fe-b213dccee1a8
Faulting package full name:
Faulting package-relative application ID:


It appears that the WU uploads and then the core crashes. Windows makes me acknowledge the crashed program in a dialog box before the failure is reported in the FAH client.

System Specs

[Images not preserved]
ryoungblood
 
Posts: 39
Joined: Wed Jan 18, 2017 10:13 pm
Location: Boulder, CO

Re: FahCore21.exe Crashing After Upload

Postby bruce » Sat May 06, 2017 7:31 pm

The failure has nothing to do with the upload. WU02 finished and uploaded successfully, and "Cleaning up" seems to have worked correctly. WU01 was downloaded, FahCore_21 was started, and then it failed. I really doubt there was any interaction between the processing of WU02 and WU01.
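One way to check that reading of the log is to split the lines by work-unit label so that WU01 and WU02 can be read separately. The sketch below is illustrative Python, not part of FAHClient. The regular expression and the name `split_by_work_unit` are assumptions based only on the line format visible in the log posted above.

```python
import re
from collections import defaultdict

# Matches FAHClient log lines such as
#   17:38:03:WU01:FS01:Starting
#   18:28:38:WARNING:WU01:FS01:FahCore returned: FAILED_1 (0 = 0x0)
# The optional WARNING:/ERROR: prefix is an assumption from the one example seen.
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):(?:WARNING:|ERROR:)?"
    r"(?P<wu>WU\d+):(?P<slot>FS\d+):(?P<msg>.*)$"
)


def split_by_work_unit(lines):
    """Group (time, message) pairs by work-unit label, keeping log order."""
    grouped = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines without a WU label are skipped
            grouped[match["wu"]].append((match["time"], match["msg"]))
    return dict(grouped)


# A few lines copied from the log in the first post.
sample = [
    "17:38:03:WU02:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)",
    "17:38:03:WU01:FS01:Starting",
    "17:38:13:WU02:FS01:Upload complete",
    "17:38:13:WU02:FS01:Cleaning up",
    "18:28:38:WARNING:WU01:FS01:FahCore returned: FAILED_1 (0 = 0x0)",
]

for wu, events in split_by_work_unit(sample).items():
    print(wu)
    for time, msg in events:
        print("  ", time, msg)
```

Read this way, the last WU02 event is "Cleaning up" at 17:38:13, while the WU01 failure is only logged at 18:28:38.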

Did the client restart WU01 successfully or was it dumped? (after the message 18:28:38:WARNING:WU01:FS01:FahCore returned: FAILED_1 (0 = 0x0))

The Windows exception code 0xc0000374 indicates a heap corruption - to determine the solution, you would need to debug the crash in a debugger to figure out who is corrupting the heap. Heap corruption can also occur spontaneously on overclocked systems.
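For reference, the exception code in the event log is a standard NTSTATUS value. Below is a minimal Python sketch that maps a few common codes to their names. The table is deliberately partial, and the helper name `describe_exception` is made up for illustration.

```python
# A small, incomplete table of well-known NTSTATUS crash codes.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC0000374: "STATUS_HEAP_CORRUPTION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def describe_exception(code_text):
    """Translate an 'Exception code' string from the event log into a name."""
    code = int(code_text, 16) & 0xFFFFFFFF  # accepts '0xc0000374' or 'c0000374'
    return KNOWN_STATUS.get(code, "unknown (0x%08X)" % code)


print(describe_exception("0xc0000374"))
```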
bruce
 
Posts: 20827
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: FahCore21.exe Crashing After Upload

Postby rwh202 » Sun May 07, 2017 8:47 am

I wonder if it's something to do with this being a very quick system - the core finishes and starts a new WU in less than a second. I normally have a delay of a few seconds. Only a guess, but maybe something isn't being cleared out fully before the core is fired up again? Anti virus still scanning the fresh WU?
rwh202
 
Posts: 295
Joined: Mon Nov 15, 2010 8:51 pm
Location: South Coast, UK

Re: FahCore21.exe Crashing After Upload

Postby Nert » Sun May 07, 2017 3:50 pm

I've encountered this problem as well in the past while running Windows. I would have been running a different version of the video drivers at the time.

Here is the thread that I started: https://foldingforum.org/viewtopic.php?f=19&t=28709

Speculation at the time was that it was a hardware problem. I believe that speculation was wrong.

All of the symptoms I noticed are described by the original poster:

1) Previous core finishes correctly
2) New core crashes on startup following upload/download.
3) Windows indicates heap corruption problem.

RWH's speculation about a timing-related bug makes sense to me: there may be some bug related to one core finishing and a new core starting up that causes heap corruption. It's intermittent and timing-related. There are now two instances of this happening. I'm just spitballing here, but thought I'd share my recollection that this particular problem has occurred previously.
Nert
 
Posts: 111
Joined: Wed Mar 26, 2014 7:46 pm

Re: FahCore21.exe Crashing After Upload

Postby bruce » Sun May 07, 2017 10:55 pm

I can't think of a relationship between closing one core and cleaning up the files of one WU and starting another copy of the (same or different) core on another WU that would corrupt the heap. Those events happen regularly for people who run more than one GPU though rarely at "exactly" the same time.

(Just because I think there's no relationship doesn't prove anything. If I happen to be wrong, somebody is going to have to figure out how to gather enough information for development to identify the problem. In the passage I quoted above, it says "you would need to debug the crash in a debugger to figure out who is corrupting the heap.")

The fact that Windows provides a popup that must be responded to before continuing means that it has temporarily preserved a lot of useful information that could be of help but somebody would need to produce a mini-dump of that task BEFORE closing that popup. Can somebody do that for us?
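A rough sketch of one way to capture such a dump while the popup is still open, assuming the Sysinternals ProcDump tool is installed and on the PATH (nobody in this thread confirmed using it). The `-ma` switch asks ProcDump for a full memory dump. The function names and the output file name are made up for illustration, and the PID is simply the "Faulting process id" from the event log above converted from hex; a new crash will have a different PID.

```python
import subprocess


def procdump_command(pid, dump_path):
    """Build a ProcDump command line that writes a full dump (-ma) of `pid`."""
    return ["procdump", "-accepteula", "-ma", str(pid), dump_path]


def capture_dump(pid, dump_path):
    """Run ProcDump and return its exit code (requires procdump on the PATH)."""
    return subprocess.run(procdump_command(pid, dump_path)).returncode


# "Faulting process id: 0x3234" from the Windows event log, as a decimal PID.
example_pid = int("0x3234", 16)

# Only print the command here; call capture_dump() while the crash dialog
# is still open to actually write the file.
print(" ".join(procdump_command(example_pid, "FahCore_21_crash.dmp")))
```

Task Manager's "Create dump file" entry in the process context menu is an alternative that needs no extra tools.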
bruce
 
Posts: 20827
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: FahCore21.exe Crashing After Upload

Postby ryoungblood » Sun May 07, 2017 11:23 pm

Nothing restarted after it crashed.

I'm not incredibly technical but next time it comes up I will go with the option that says debug. I'll play around with Visual Studio and see if I can get anything useful to post here.

It happens once every few days, so I should be expecting another one pretty soon. The CPU isn't OC'ed; the 3 GPUs (1080 Ti/1080 Ti/1070) in the system have a slight OC (+80/+80).
ryoungblood
 
Posts: 39
Joined: Wed Jan 18, 2017 10:13 pm
Location: Boulder, CO

Re: FahCore21.exe Crashing After Upload

Postby jcoffland » Sun May 07, 2017 11:35 pm

The second WU crashing during the first WU's cleanup may just be a coincidence. However, it is possible that the core is accessing the GPU driver during cleanup and that is causing the crash. This would be easy to test. Run a core 0x21 on a WU as normal. Then see if you can crash it simply by running another instance of the core manually. You could run it like this:

Code: Select all
"C:\Users\Command Center\AppData\Roaming\FAHClient\cores\fahwebx.stanford.edu\cores\Win32\AMD64\NVIDIA\Fermi\Core_21.fah\FahCore_21.exe" --info
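Since the crash is intermittent, the manual run may need repeating. The following is only a sketch of how that could be scripted in Python while a WU is folding as normal; the helper name `probe_core` and the run count are assumptions, and the path is the one from the log in the first post. On Windows a heap-corruption crash would be expected to show up as exit code 0xC0000374.

```python
import os
import subprocess

HEAP_CORRUPTION = 0xC0000374

# Core path as it appears in the original poster's log.
CORE = (
    r"C:\Users\Command Center\AppData\Roaming\FAHClient\cores"
    r"\fahwebx.stanford.edu\cores\Win32\AMD64\NVIDIA\Fermi"
    r"\Core_21.fah\FahCore_21.exe"
)


def probe_core(command, runs=3):
    """Run `command` several times and return the exit codes as unsigned 32-bit."""
    codes = []
    for _ in range(runs):
        result = subprocess.run(
            command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        codes.append(result.returncode & 0xFFFFFFFF)
    return codes


# Passing the command as a list avoids quoting problems with the space
# in "Command Center". Nothing runs if the core is not present.
if os.path.exists(CORE):
    for code in probe_core([CORE, "--info"]):
        note = "heap corruption" if code == HEAP_CORRUPTION else "other"
        print("exit code 0x%08X (%s)" % (code, note))
```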
Cauldron Development LLC
http://cauldrondevelopment.com/
jcoffland
Pande Group Member
 
Posts: 974
Joined: Fri Oct 10, 2008 6:42 pm
Location: San Jose, CA

Re: FahCore21.exe Crashing After Upload

Postby ryoungblood » Mon May 08, 2017 4:44 pm

foldy wrote:@ryoungblood: If you use the Visual Studio debugger, the most valuable part would be the call stack of the exception. Also create a dump file.


Good to know. Nothing has happened yet.
ryoungblood
 
Posts: 39
Joined: Wed Jan 18, 2017 10:13 pm
Location: Boulder, CO

Re: FahCore21.exe Crashing After Upload

Postby ryoungblood » Wed May 24, 2017 12:16 pm

This probably isn't much help.

I couldn't get the call stack on that past error.

https://i.imgur.com/LnIQQKq.png
ryoungblood
 
Posts: 39
Joined: Wed Jan 18, 2017 10:13 pm
Location: Boulder, CO

