Project:9011 run:145 clone:2 gen:75

Moderators: Site Moderators, FAHC Science Team

bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project:9011 run:145 clone:2 gen:75

Post by bruce »

You might want to check with Acer to see if they have an un-distributed version of their App and if not, report the problem to them ... (even if you don't plan to use it).
Ricky
Posts: 483
Joined: Sat Aug 01, 2015 1:34 am
Hardware configuration: 1. 2 each E5-2630 V3 processors, 64 GB RAM, GTX980SC GPU, and GTX980 GPU running on windows 8.1 operating system.
2. I7-6950X V3 processor, 32 GB RAM, 1 GTX980tiFTW, and 2 each GTX1080FTW GPUs running on windows 8.1 operating system.
Location: New Mexico

Re: Project:9011 run:145 clone:2 gen:75

Post by Ricky »

runSW.exe was adding a handle every second. I stopped the RunSwUSB service and cleared over 140k handles that opened in two and a half days of up-time. Now I am at 23.3K handles.

Thanks David, Rel, Bruce and Joe!

Post by Ricky »

I had another cleanup problem. I paused, rebooted, and resumed folding. The number of handles was less than 25k. I was only playing a web-based solitaire game at the time.

11:05:42:WU01:FS00:0xa4:Completed 78400 out of 80000 steps  (98%)
11:06:17:WU01:FS00:0xa4:Completed 79200 out of 80000 steps  (99%)
11:06:17:WU03:FS00:Connecting to 171.67.108.200:8080
11:06:18:WU03:FS00:Assigned to work server 128.143.199.97
11:06:18:WU03:FS00:Requesting new work unit for slot 00: RUNNING cpu:30 from 128.143.199.97
11:06:18:WU03:FS00:Connecting to 128.143.199.97:8080
11:06:19:WU03:FS00:Downloading 2.14MiB
11:06:22:WU03:FS00:Download complete
11:06:22:WU03:FS00:Received Unit: id:03 state:DOWNLOAD error:NO_ERROR project:7520 run:116 clone:4 gen:366 core:0xa4 unit:0x000000a4fbcb017d54ebbafed546f8f4
11:06:52:WU01:FS00:0xa4:Completed 80000 out of 80000 steps  (100%)
11:06:54:WU01:FS00:0xa4:DynamicWrapper: Finished Work Unit: sleep=10000
11:07:04:WU01:FS00:0xa4:
11:07:04:WU01:FS00:0xa4:Finished Work Unit:
11:07:04:WU01:FS00:0xa4:- Reading up to 4117896 from "01/wudata_01.trr": Read 4117896
11:07:04:WU01:FS00:0xa4:trr file hash check passed.
11:07:04:WU01:FS00:0xa4:- Reading up to 3190160 from "01/wudata_01.xtc": Read 3190160
11:07:04:WU01:FS00:0xa4:xtc file hash check passed.
11:07:04:WU01:FS00:0xa4:edr file hash check passed.
11:07:04:WU01:FS00:0xa4:logfile size: 19877
11:07:04:WU01:FS00:0xa4:Leaving Run
11:07:07:WU01:FS00:0xa4:- Writing 7330325 bytes of core data to disk...
11:07:09:WU01:FS00:0xa4:Done: 7329813 -> 7059547 (compressed to 96.3 percent)
11:07:09:WU01:FS00:0xa4:  ... Done.
11:07:10:WU01:FS00:0xa4:- Shutting down core
11:07:10:WU01:FS00:0xa4:
11:07:10:WU01:FS00:0xa4:Folding@home Core Shutdown: FINISHED_UNIT
11:07:10:WU01:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
11:07:10:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:9752 run:616 clone:0 gen:78 core:0xa4 unit:0x0000006bab40416355417279cda33c16
11:07:10:WU01:FS00:Uploading 6.73MiB to 171.64.65.99
11:07:10:WU03:FS00:Starting
11:07:10:WU01:FS00:Connecting to 171.64.65.99:8080
11:07:10:WU03:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/win1/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 03 -suffix 01 -version 704 -lifeline 5656 -checkpoint 15 -np 30
11:07:10:WU03:FS00:Started FahCore on PID 7472
11:07:10:WU03:FS00:Core PID:13704
11:07:10:WU03:FS00:FahCore 0xa4 started
11:07:11:WU03:FS00:0xa4:
11:07:11:WU03:FS00:0xa4:*------------------------------*
11:07:11:WU03:FS00:0xa4:Folding@Home Gromacs GB Core
11:07:11:WU03:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
11:07:11:WU03:FS00:0xa4:
11:07:11:WU03:FS00:0xa4:Preparing to commence simulation
11:07:11:WU03:FS00:0xa4:- Looking at optimizations...
11:07:11:WU03:FS00:0xa4:- Created dyn
11:07:11:WU03:FS00:0xa4:- Files status OK
11:07:11:WU03:FS00:0xa4:- Expanded 2240139 -> 3200028 (decompressed 142.8 percent)
11:07:11:WU03:FS00:0xa4:Called DecompressByteArray: compressed_data_size=2240139 data_size=3200028, decompressed_data_size=3200028 diff=0
11:07:11:WU03:FS00:0xa4:- Digital signature verified
11:07:11:WU03:FS00:0xa4:
11:07:11:WU03:FS00:0xa4:Project: 7520 (Run 116, Clone 4, Gen 366)
11:07:11:WU03:FS00:0xa4:
11:07:11:WU03:FS00:0xa4:Assembly optimizations on if available.
11:07:11:WU03:FS00:0xa4:Entering M.D.
11:07:17:WU03:FS00:0xa4:Mapping NT from 30 to 30 
11:07:17:WU03:FS00:0xa4:Completed 0 out of 500000 steps  (0%)
11:07:18:WU01:FS00:Upload complete
11:07:18:WU01:FS00:Server responded WORK_ACK (400)
11:07:18:WU01:FS00:Final credit estimate, 4336.00 points
11:07:18:WU01:FS00:Cleaning up
11:07:18:ERROR:WU01:FS00:Exception: Failed to remove directory './work/01': boost::filesystem::remove: The process cannot access the file because it is being used by another process: ".\work\01\wudata_01.tpr"
11:07:18:WU01:FS00:Cleaning up
11:07:18:ERROR:WU01:FS00:Exception: Failed to remove directory './work/01': boost::filesystem::remove: The process cannot access the file because it is being used by another process: ".\work\01\wudata_01.tpr"
11:07:30:ERROR:FS00:
11:07:30:ERROR:FS00:-------------------------------------------------------
11:07:30:ERROR:FS00:Program Folding@home, VERSION 4.5.4
11:07:30:ERROR:FS00:Source code file: gromacs-4.5.4\src\gmxlib\smalloc.c, line: 171
11:07:30:ERROR:FS00:
11:07:30:ERROR:FS00:Fatal error:
11:07:30:ERROR:FS00:Not enough memory. Failed to calloc ld elements of size ld for 
11:07:30:ERROR:FS00:(called from file (null), line 24)
11:07:30:ERROR:FS00:For more information and tips for troubleshooting, please check the GROMACS
11:07:30:ERROR:FS00:website at http://www.gromacs.org/Documentation/Errors
11:07:30:ERROR:FS00:-------------------------------------------------------
11:07:30:ERROR:FS00:
11:07:30:ERROR:FS00:Thanx for Using GROMACS - Have a Nice Day
11:07:31:ERROR:FS00:
11:07:31:ERROR:FS00:-------------------------------------------------------
11:07:31:ERROR:FS00:Program Folding@home, VERSION 4.5.4
11:07:31:ERROR:FS00:Source code file: gromacs-4.5.4\src\gmxlib\smalloc.c, line: 171
11:07:31:ERROR:FS00:
11:07:31:ERROR:FS00:Fatal error:
11:07:31:ERROR:FS00:Not enough memory. Failed to calloc ld elements of size ld for 
11:07:31:ERROR:FS00:(called from file (null), line 24)
11:07:31:ERROR:FS00:For more information and tips for troubleshooting, please check the GROMACS
11:07:31:ERROR:FS00:website at http://www.gromacs.org/Documentation/Errors
11:07:31:ERROR:FS00:-------------------------------------------------------
11:07:31:ERROR:FS00:
11:07:31:ERROR:FS00:Thanx for Using GROMACS - Have a Nice Day
11:08:18:WU01:FS00:Cleaning up
11:08:18:ERROR:WU01:FS00:Exception: Failed to remove directory './work/01': boost::filesystem::remove: The process cannot access the file because it is being used by another process: ".\work\01\wudata_01.tpr"
11:09:09:WU03:FS00:0xa4:Completed 5000 out of 500000 steps  (1%)
11:09:55:WU01:FS00:Cleaning up
11:09:55:ERROR:WU01:FS00:Exception: Failed to remove directory './work/01': boost::filesystem::remove: The process cannot access the file because it is being used by another process: ".\work\01\wudata_01.tpr"
11:11:01:WU03:FS00:0xa4:Completed 10000 out of 500000 steps  (2%)
11:12:33:WU01:FS00:Cleaning up
11:12:33:ERROR:WU01:FS00:Exception: Failed to remove directory './work/01': boost::filesystem::remove: The process cannot access the file because it is being used by another process: ".\work\01\wudata_01.tpr"
11:12:52:WU03:FS00:0xa4:Completed 15000 out of 500000 steps  (3%)
davidcoton
Posts: 1102
Joined: Wed Nov 05, 2008 3:19 pm
Location: Cambridge, UK

Re: Project:9011 run:145 clone:2 gen:75

Post by davidcoton »

Strange. The cleanup failed to access one of the files for deletion. The same error occurred again two minutes later. In between there was a memory allocation error. Time to run chkdsk?
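
For anyone who wants to check the timing, here is a minimal Python sketch (not part of the FAH client) that pulls the failed-cleanup lines out of a pasted log and prints the gaps between retries. The regular expression is an assumption based only on the error lines quoted above, and the function names are made up for illustration.

```python
import re
from collections import defaultdict

# Assumed shape of the client's cleanup error lines, e.g.
# 11:07:18:ERROR:WU01:FS00:Exception: Failed to remove directory './work/01': ...
FAILED_CLEANUP = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2}):ERROR:(WU\d+):FS\d+:Exception: "
    r"Failed to remove directory '([^']+)'"
)


def cleanup_failures(lines):
    """Return {work unit: [seconds since midnight of each failed cleanup]}."""
    failures = defaultdict(list)
    for line in lines:
        match = FAILED_CLEANUP.match(line.strip())
        if match:
            hours, minutes, seconds, wu, _directory = match.groups()
            failures[wu].append(int(hours) * 3600 + int(minutes) * 60 + int(seconds))
    return dict(failures)


def retry_gaps(times):
    """Seconds between consecutive failed attempts (ignores midnight rollover)."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


# The WU01 error message from the log above, with its five timestamps.
MESSAGE = (
    "ERROR:WU01:FS00:Exception: Failed to remove directory './work/01': "
    "boost::filesystem::remove: The process cannot access the file because "
    "it is being used by another process: "
    r'".\work\01\wudata_01.tpr"'
)
sample = [
    f"{stamp}:{MESSAGE}"
    for stamp in ("11:07:18", "11:07:18", "11:08:18", "11:09:55", "11:12:33")
]
# An ordinary progress line, which the pattern should skip.
sample.append("11:09:09:WU03:FS00:0xa4:Completed 5000 out of 500000 steps  (1%)")

for wu, times in cleanup_failures(sample).items():
    print(wu, retry_gaps(times))
```

On the lines quoted above, the gaps for WU01 come out as 0, 60, 97 and 158 seconds, so the retries seem to be spaced further apart each time.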

Post by Ricky »

It is a one-month-old SSD. A reboot seems to restore access. The disk is 80% free. I checked the drive with Windows and it found no problems. I think it has something to do with Windows 8.1, but I'm not sure.

Typically, about 3.5GB of memory is used out of the 64GB available. I wish I had looked at what was being used before I rebooted. Very little software is presently installed on the machine.

Post by davidcoton »

I folded (CPU only) on a Win8.1 laptop for some time, no problems like yours. But I can't offer any other explanation at present. See if it recurs.

Post by Ricky »

David,

I will post each occurrence on this thread. Obviously, this is not a big deal. I just have to reboot when it happens. It is not much of a throughput issue for FAH. It seems to only happen after running over a week.

I tend to be curious, that's all.
toTOW
Site Moderator
Posts: 6309
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: Project:9011 run:145 clone:2 gen:75

Post by toTOW »

An antivirus program might be the culprit here... Which one do you use? Did you try excluding the FAH data folder from scanning?

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.

Post by Ricky »

toTOW,

I only have Windows Defender at this time. I have noticed that the mouse cursor changes to a busy symbol for about 5 seconds roughly every 20 minutes with only FAH running. I am not sure it happens all the time, but I did see it on the day the error occurred.

Post by Ricky »

Mysteriously, my cleanup problem has re-emerged. Before reboot:

00:01:28:WU10:FS00:Cleaning up
00:01:28:ERROR:WU10:FS00:Exception: Failed to remove directory './work/10': boost::filesystem::remove: The process cannot access the file because it is being used by another process: ".\work\10\wudata_01.tpr"
00:01:42:WU05:FS00:0xa4:Completed 1600 out of 80000 steps  (2%)
00:02:17:WU05:FS00:0xa4:Completed 2400 out of 80000 steps  (3%)
00:02:29:WU07:FS01:0x21:Completed 3250000 out of 5000000 steps (65%)
00:02:52:WU05:FS00:0xa4:Completed 3200 out of 80000 steps  (4%)
00:03:05:WU10:FS00:Cleaning up
00:03:05:ERROR:WU10:FS00:Exception: Failed to remove directory './work/10': boost::filesystem::remove: The process cannot access the file because it is being used by another process: ".\work\10\wudata_01.tpr"
00:03:28:WU05:FS00:0xa4:Completed 4000 out of 80000 steps  (5%)
00:03:43:WU00:FS00:Cleaning up
00:03:43:ERROR:WU00:FS00:Exception: Failed to remove directory './work/00': boost::filesystem::remove: The process cannot access the file because it is being used by another process: ".\work\00\wudata_01.tpr"
00:03:45:WU11:FS02:0x18:Completed 800000 out of 5000000 steps (16%)
00:04:03:WU05:FS00:0xa4:Completed 4800 out of 80000 steps  (6%)
00:04:38:WU05:FS00:0xa4:Completed 5600 out of 80000 steps  (7%)
00:05:13:WU05:FS00:0xa4:Completed 6400 out of 80000 steps  (8%)
00:05:42:WU10:FS00:Cleaning up
00:05:42:ERROR:WU10:FS00:Exception: Failed to remove directory './work/10': boost::filesystem::remove: The process cannot access the file because it is being used by another process: ".\work\10\wudata_01.tpr"
00:05:48:WU05:FS00:0xa4:Completed 7200 out of 80000 steps  (9%)
00:06:19:WU11:FS02:0x18:Completed 850000 out of 5000000 steps (17%)
00:06:23:WU05:FS00:0xa4:Completed 8000 out of 80000 steps  (10%)
00:06:58:WU05:FS00:0xa4:Completed 8800 out of 80000 steps  (11%)
00:07:09:WU07:FS01:0x21:Completed 3300000 out of 5000000 steps (66%)
00:07:33:WU05:FS00:0xa4:Completed 9600 out of 80000 steps  (12%)
00:08:08:WU05:FS00:0xa4:Completed 10400 out of 80000 steps  (13%)
00:08:43:WU05:FS00:0xa4:Completed 11200 out of 80000 steps  (14%)
00:09:05:WU11:FS02:0x18:Completed 900000 out of 5000000 steps (18%)
00:09:18:WU05:FS00:0xa4:Completed 12000 out of 80000 steps  (15%)
00:09:53:WU05:FS00:0xa4:Completed 12800 out of 80000 steps  (16%)
00:09:57:WU10:FS00:Cleaning up
00:09:57:ERROR:WU10:FS00:Exception: Failed to remove directory './work/10': boost::filesystem::remove: The process cannot access the file because it is being used by another process: ".\work\10\wudata_01.tpr"
00:10:28:WU05:FS00:0xa4:Completed 13600 out of 80000 steps  (17%)
00:11:03:WU05:FS00:0xa4:Completed 14400 out of 80000 steps  (18%)
00:11:38:WU05:FS00:0xa4:Completed 15200 out of 80000 steps  (19%)
00:11:40:WU11:FS02:0x18:Completed 950000 out of 5000000 steps (19%)
00:11:46:WU07:FS01:0x21:Completed 3350000 out of 5000000 steps (67%)
00:12:14:WU05:FS00:0xa4:Completed 16000 out of 80000 steps  (20%)
00:12:49:WU05:FS00:0xa4:Completed 16800 out of 80000 steps  (21%)
00:13:24:WU05:FS00:0xa4:Completed 17600 out of 80000 steps  (22%)
00:13:59:WU05:FS00:0xa4:Completed 18400 out of 80000 steps  (23%)
00:14:14:WU11:FS02:0x18:Completed 1000000 out of 5000000 steps (20%)
00:14:32:WU09:FS00:Cleaning up
00:14:32:ERROR:WU09:FS00:Exception: Failed to remove directory './work/09': boost::filesystem::remove: The process cannot access the file because it is being used by another process: ".\work\09\wudata_01.tpr"
00:14:34:WU05:FS00:0xa4:Completed 19200 out of 80000 steps  (24%)
00:15:09:WU05:FS00:0xa4:Completed 20000 out of 80000 steps  (25%)
00:15:45:WU05:FS00:0xa4:Completed 20800 out of 80000 steps  (26%)
00:16:20:WU05:FS00:0xa4:Completed 21600 out of 80000 steps  (27%)
00:16:21:WU07:FS01:0x21:Completed 3400000 out of 5000000 steps (68%)
00:16:48:WU10:FS00:Cleaning up
00:16:48:ERROR:WU10:FS00:Exception: Failed to remove directory './work/10': boost::filesystem::remove: The process cannot access the file because it is being used by another process: ".\work\10\wudata_01.tpr"
00:16:55:WU05:FS00:0xa4:Completed 22400 out of 80000 steps  (28%)

After reboot:

*********************** Log Started 2016-02-27T00:35:20Z ***********************
00:35:20:************************* Folding@home Client *************************
00:35:20:      Website: http://folding.stanford.edu/
00:35:20:    Copyright: (c) 2009-2014 Stanford University
00:35:20:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
00:35:20:         Args: 
00:35:20:       Config: C:/Users/win1/AppData/Roaming/FAHClient/config.xml
00:35:20:******************************** Build ********************************
00:35:20:      Version: 7.4.4
00:35:20:         Date: Mar 4 2014
00:35:20:         Time: 20:26:54
00:35:20:      SVN Rev: 4130
00:35:20:       Branch: fah/trunk/client
00:35:20:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
00:35:20:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
00:35:20:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
00:35:20:     Platform: win32 XP
00:35:20:         Bits: 32
00:35:20:         Mode: Release
00:35:20:******************************* System ********************************
00:35:20:          CPU: Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
00:35:20:       CPU ID: GenuineIntel Family 6 Model 63 Stepping 2
00:35:20:         CPUs: 32
00:35:20:       Memory: 63.90GiB
00:35:20:  Free Memory: 61.76GiB
00:35:20:      Threads: WINDOWS_THREADS
00:35:20:   OS Version: 6.2
00:35:20:  Has Battery: false
00:35:20:   On Battery: false
00:35:20:   UTC Offset: -7
00:35:20:          PID: 6532
00:35:20:          CWD: C:/Users/win1/AppData/Roaming/FAHClient
00:35:20:           OS: Windows 8.1 Pro
00:35:20:      OS Arch: AMD64
00:35:20:         GPUs: 2
00:35:20:        GPU 0: NVIDIA:5 GM204 [GeForce GTX 980]
00:35:20:        GPU 1: NVIDIA:5 GM204 [GeForce GTX 980]
00:35:20:         CUDA: 5.2
00:35:20:  CUDA Driver: 7050
00:35:20:Win32 Service: false
00:35:20:***********************************************************************
00:35:20:<config>
00:35:20:  <!-- Folding Slot Configuration -->
00:35:20:  <client-type v='beta'/>
00:35:20:  <max-packet-size v='big'/>
00:35:20:
00:35:20:  <!-- HTTP Server -->
00:35:20:  <allow v='127.0.0.1,192.168.1.0-192.168.1.255'/>
00:35:20:
00:35:20:  <!-- Network -->
00:35:20:  <proxy v=':8080'/>
00:35:20:
00:35:20:  <!-- Remote Command Server -->
00:35:20:  <command-allow-no-pass v='127.0.0.1,192.168.1.0-192.168.1.255'/>
00:35:20:  <password v='*************'/>
00:35:20:
00:35:20:  <!-- Slot Control -->
00:35:20:  <power v='FULL'/>
00:35:20:
00:35:20:  <!-- User Information -->
00:35:20:  <passkey v='********************************'/>
00:35:20:  <user v='Richard_Summers'/>
00:35:20:
00:35:20:  <!-- Work Unit Control -->
00:35:20:  <next-unit-percentage v='100'/>
00:35:20:
00:35:20:  <!-- Folding Slots -->
00:35:20:  <slot id='0' type='CPU'>
00:35:20:    <cpus v='28'/>
00:35:20:  </slot>
00:35:20:  <slot id='1' type='GPU'/>
00:35:20:  <slot id='2' type='GPU'/>
00:35:20:</config>
00:35:20:Trying to access database...
00:35:20:Successfully acquired database lock
00:35:20:Enabled folding slot 00: READY cpu:28
00:35:20:Enabled folding slot 01: READY gpu:0:GM204 [GeForce GTX 980]
00:35:20:Enabled folding slot 02: READY gpu:1:GM204 [GeForce GTX 980]
00:35:20:WU02:FS00:Cleaning up
00:35:20:WU03:FS00:Cleaning up
00:35:20:WU04:FS00:Cleaning up
00:35:20:WU01:FS00:Cleaning up
00:35:20:WU06:FS00:Cleaning up
00:35:20:WU00:FS00:Cleaning up
00:35:20:WU08:FS00:Cleaning up
00:35:20:WU09:FS00:Cleaning up
00:35:20:WU10:FS00:Cleaning up
00:35:20:WU11:FS02:Starting
00:35:20:WU11:FS02:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/win1/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/beta/Core_18.fah/FahCore_18.exe -dir 11 -suffix 01 -version 704 -lifeline 6532 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
00:35:20:WU11:FS02:Started FahCore on PID 6608
00:35:20:WU11:FS02:Core PID:6620
00:35:20:WU11:FS02:FahCore 0x18 started
00:35:20:WU07:FS01:Starting
00:35:20:WU07:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/win1/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/beta/Core_21.fah/FahCore_21.exe -dir 07 -suffix 01 -version 704 -lifeline 6532 -checkpoint 15 -gpu 0 -gpu-vendor nvidia
00:35:20:WU07:FS01:Started FahCore on PID 6628
00:35:20:WU07:FS01:Core PID:6640
00:35:20:WU07:FS01:FahCore 0x21 started
00:35:20:WU05:FS00:Starting
00:35:20:WU05:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/win1/AppData/Roaming/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/beta/Core_a4.fah/FahCore_a4.exe -dir 05 -suffix 01 -version 704 -lifeline 6532 -checkpoint 15 -np 28
00:35:20:WU05:FS00:Started FahCore on PID 6648
00:35:20:WU05:FS00:Core PID:6660
00:35:20:WU05:FS00:FahCore 0xa4 started
00:35:21:WU07:FS01:0x21:*********************** Log Started 2016-02-27T00:35:20Z ***********************
00:35:21:WU07:FS01:0x21:Project: 11703 (Run 0, Clone 95, Gen 46)
00:35:21:WU07:FS01:0x21:Unit: 0x000000308ca304f3568961c1c753779f
00:35:21:WU07:FS01:0x21:CPU: 0x00000000000000000000000000000000
00:35:21:WU07:FS01:0x21:Machine: 1
00:35:21:WU07:FS01:0x21:Digital signatures verified
00:35:21:WU07:FS01:0x21:Folding@home GPU Core21 Folding@home Core
00:35:21:WU07:FS01:0x21:Version 0.0.17
00:35:21:WU11:FS02:0x18:*********************** Log Started 2016-02-27T00:35:20Z ***********************
00:35:21:WU11:FS02:0x18:Project: 10490 (Run 387, Clone 0, Gen 145)
00:35:21:WU11:FS02:0x18:Unit: 0x000000a08ca304f45537e91873a661c9
00:35:21:WU11:FS02:0x18:CPU: 0x00000000000000000000000000000000
00:35:21:WU11:FS02:0x18:Machine: 2
00:35:21:WU11:FS02:0x18:Digital signatures verified
00:35:21:WU11:FS02:0x18:Folding@home GPU core18
00:35:21:WU11:FS02:0x18:Version 0.0.4
00:35:21:WU05:FS00:0xa4:
00:35:21:WU11:FS02:0x18:  Found a checkpoint file
00:35:21:WU05:FS00:0xa4:*------------------------------*
00:35:21:WU05:FS00:0xa4:Folding@Home Gromacs GB Core
00:35:21:WU05:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
00:35:21:WU05:FS00:0xa4:
00:35:21:WU05:FS00:0xa4:Preparing to commence simulation
00:35:21:WU05:FS00:0xa4:- Looking at optimizations...
00:35:21:WU05:FS00:0xa4:- Files status OK
00:35:21:WU07:FS01:0x21:  Found a checkpoint file
00:35:23:WU05:FS00:0xa4:- Expanded 6524002 -> 22366748 (decompressed 342.8 percent)
00:35:23:WU05:FS00:0xa4:Called DecompressByteArray: compressed_data_size=6524002 data_size=22366748, decompressed_data_size=22366748 diff=0
00:35:23:WU05:FS00:0xa4:- Digital signature verified
00:35:23:WU05:FS00:0xa4:
00:35:23:WU05:FS00:0xa4:Project: 9752 (Run 3491, Clone 0, Gen 430)
00:35:23:WU05:FS00:0xa4:
00:35:23:WU05:FS00:0xa4:Assembly optimizations on if available.
00:35:23:WU05:FS00:0xa4:Entering M.D.
00:35:29:WU05:FS00:0xa4:Using Gromacs checkpoints
00:35:29:WU05:FS00:0xa4:Mapping NT from 28 to 28 
00:35:30:WU07:FS01:0x21:Completed 3500000 out of 5000000 steps (70%)
00:35:30:WU07:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
00:35:32:WU05:FS00:0xa4:Resuming from checkpoint
00:35:32:WU05:FS00:0xa4:Verified 05/wudata_01.log
00:35:32:WU05:FS00:0xa4:Verified 05/wudata_01.trr
00:35:32:WU05:FS00:0xa4:Verified 05/wudata_01.xtc
00:35:32:WU05:FS00:0xa4:Verified 05/wudata_01.edr
00:35:32:WU05:FS00:0xa4:Completed 41000 out of 80000 steps  (51%)
00:35:43:WU11:FS02:0x18:Completed 1250000 out of 5000000 steps (25%)
00:35:43:WU11:FS02:0x18:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
00:36:00:WU05:FS00:0xa4:Completed 41600 out of 80000 steps  (52%)
00:36:37:WU05:FS00:0xa4:Completed 42400 out of 80000 steps  (53%)


I don't seem to have a handle leak problem this time. The machine had been up for about a week before the cleanup issue appeared.

Post by bruce »

If you open a file in the WORK directory (say with an editor) the cleanup WILL FAIL.

Use the log file that is displayed with FAHControl.

Post by Ricky »

Bruce,

I had guessed that particular access problem from the message in the log. I had not opened any FAH file between the previous boot and the last one. I only accessed the log file from the log directory for the previous post and I didn't leave it open long either. I just copied from it and then closed it.

I think you are right that something in Windows on this one machine is accessing the files somehow. It takes about a week for it to happen. This is just like last summer, but it went away when I fixed the handle leak, so I thought that was the problem. But now it's back without the handle leak process running.

Windows has about 69 processes running. I might try killing suspect processes next week (when it happens again) to see if I can identify the culprit that is accessing the work files.

Post by bruce »

The WU in ./work/10 has already been uploaded. I'd go to that directory and manually delete whatever is left. (You'll probably need elevated privileges to delete one of the files.) You may get a message that's more meaningful, too. It would be nice to fix the problem so it doesn't happen again.

If it keeps happening, the "/10" will probably be different next time. Don't mess with directories other than the one specified in the message, plus its contents.
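
If you would rather script the manual delete, here is a rough Python sketch of the idea, not anything the FAH client provides. It reads the directory name out of the error line and removes only that one directory under work. The regular expression and the function names are assumptions made up for illustration, and the demo runs against a temporary folder standing in for the real data folder (C:/Users/win1/AppData/Roaming/FAHClient in the log above).

```python
import re
import shutil
import tempfile
from pathlib import Path

# Assumed pattern, based only on the error text quoted in this thread.
DIR_IN_ERROR = re.compile(r"Failed to remove directory '([^']+)'")


def leftover_dir(error_line, data_dir):
    """Return the directory named in a cleanup error, relative to the data folder."""
    match = DIR_IN_ERROR.search(error_line)
    if match is None:
        return None
    return Path(data_dir) / match.group(1)


def remove_leftover(error_line, data_dir):
    """Delete only the directory named in the error line.

    Returns True if it was removed, False if it was already gone.
    Any OSError (for example a file that is still locked) is left to
    propagate so the operating system's own message is visible.
    """
    target = leftover_dir(error_line, data_dir)
    if target is None:
        raise ValueError("no directory found in the error line")
    if target.parent.name != "work":
        raise ValueError(f"refusing to touch {target}: not directly under work")
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True


# Start of the error line from the log (the rest of the message is not needed).
ERROR_LINE = (
    "00:01:28:ERROR:WU10:FS00:Exception: "
    "Failed to remove directory './work/10'"
)

# Demo against a throwaway folder instead of the real FAHClient data folder.
with tempfile.TemporaryDirectory() as data_dir:
    leftover = Path(data_dir) / "work" / "10"
    leftover.mkdir(parents=True)
    (leftover / "wudata_01.tpr").touch()
    print(remove_leftover(ERROR_LINE, data_dir))
```

Letting the OSError through is deliberate: if something still has the file open, the message from the operating system may say more than the client log does.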

Post by Ricky »

Bruce,

I just looked. At present, there are only 01, 02, and 03 directories in the work folder. I left them all alone.

Post by bruce »

At the time you got the cleanup failure, ".\work\10\wudata_01.tpr" could not be deleted -- probably because it was still open. A reboot would have closed the file and the cleanup would have worked the next time it was invoked.