Project: 8010 (Run 0, Clone 1943, Gen 42)

Moderators: Site Moderators, FAHC Science Team

v00d00
Posts: 396
Joined: Sun Dec 02, 2007 4:53 am
Hardware configuration: FX8320e (6 cores enabled) @ stock,
- 16GB DDR3,
- Zotac GTX 1050Ti @ Stock.
- Gigabyte GTX 970 @ Stock
Debian 9.

Running GPU since it came out, CPU since client version 3.
Folding since Folding began (~2000) and ran Genome@Home for a while too.
Ran Seti@Home prior to that.
Location: UK
Contact:

Project: 8010 (Run 0, Clone 1943, Gen 42)

Post by v00d00 »

It has failed twice at the same point with the exact same error. I've done around 100 units of this project before this one died, so I doubt it's hardware related. I've scrapped it and am now folding a different work unit from P8010.

The system runs the GPU client on Linux under Wine (yes, I know it's unsupported, but considering I've been doing 3-4 of these WUs per day for over a month without a single problem, I'd lean towards a WU problem), with the V6.41r2 client and a stock Nvidia GTX 460.

Code:

[01:55:02] Gpu type=3 species=20.
[01:55:02] Loaded queue successfully.
[01:55:02] 
[01:55:02] + Processing work unit
[01:55:02] Core required: FahCore_15.exe
[01:55:02] Core found.
[01:55:02] - Autosending finished units... [June 17 01:55:02 UTC]
[01:55:02] Trying to send all finished work units
[01:55:02] + No unsent completed units remaining.
[01:55:02] - Autosend completed
[01:55:02] Working on queue slot 06 [June 17 01:55:02 UTC]
[01:55:02] + Working ...
[01:55:02] - Calling '.\FahCore_15.exe -dir work/ -suffix 06 -nice 19 -cpu 80 -nocpulock -checkpoint 15 -forceasm -verbose -lifeline 8 -version 641'

[01:55:02] 
[01:55:02] *------------------------------*
[01:55:02] Folding@Home GPU Core
[01:55:02] Version                2.22 (Thu Dec 8 17:08:05 PST 2011)
[01:55:02] Build host             SimbiosNvdWin7
[01:55:02] Board Type             NVIDIA/CUDA
[01:55:02] Core                   15
[01:55:02] 
[01:55:02] Window's signal control handler registered.
[01:55:02] Preparing to commence simulation
[01:55:02] - Assembly optimizations manually forced on.
[01:55:02] - Not checking prior termination.
[01:55:02] sizeof(CORE_PACKET_HDR) = 512 file=<>
[01:55:02] - Expanded 66258 -> 285182 (decompressed 430.4 percent)
[01:55:02] Called DecompressByteArray: compressed_data_size=66258 data_size=285182, decompressed_data_size=285182 diff=0
[01:55:02] - Digital signature verified
[01:55:02] 
[01:55:02] Project: 8010 (Run 0, Clone 1943, Gen 42)
[01:55:02] 
[01:55:02] Assembly optimizations on if available.
[01:55:02] Entering M.D.
[01:55:04] Tpr hash work/wudata_06.tpr:  767648489 3157139368 2452372833 4100391171 2691342935
[01:55:04] GPU device info: vendor=0 device=0 name=<NA> match=0
[01:55:04] Working on Good ROcking Metal Altar for Chronical Sinners
[01:55:04] Client config found, loading data.
[01:55:04] Starting GUI Server
[01:56:06] Setting checkpoint frequency: 500000
[01:56:06] Completed         3 out of 50000000 steps (0%).
[01:59:22] Completed    500000 out of 50000000 steps (1%).
[02:02:38] Completed   1000000 out of 50000000 steps (2%).
[02:05:55] Completed   1500000 out of 50000000 steps (3%).
[02:09:11] Completed   2000000 out of 50000000 steps (4%).
[02:12:28] Completed   2500000 out of 50000000 steps (5%).
[02:15:46] Completed   3000000 out of 50000000 steps (6%).
[02:19:04] Completed   3500000 out of 50000000 steps (7%).
[02:22:22] Completed   4000000 out of 50000000 steps (8%).
[02:25:41] Completed   4500000 out of 50000000 steps (9%).
[02:28:59] Completed   5000000 out of 50000000 steps (10%).
[02:32:16] Completed   5500000 out of 50000000 steps (11%).
[02:35:32] Completed   6000000 out of 50000000 steps (12%).
[02:38:49] Completed   6500000 out of 50000000 steps (13%).
[02:42:05] Completed   7000000 out of 50000000 steps (14%).
[02:45:22] Completed   7500000 out of 50000000 steps (15%).
[02:48:39] Completed   8000000 out of 50000000 steps (16%).
[02:51:54] Completed   8500000 out of 50000000 steps (17%).
[02:55:11] Completed   9000000 out of 50000000 steps (18%).
[02:58:28] Completed   9500000 out of 50000000 steps (19%).
[03:01:45] Completed  10000000 out of 50000000 steps (20%).
[03:05:02] Completed  10500000 out of 50000000 steps (21%).
[03:08:19] Completed  11000000 out of 50000000 steps (22%).
[03:11:36] Completed  11500000 out of 50000000 steps (23%).
[03:14:53] Completed  12000000 out of 50000000 steps (24%).
[03:18:10] Completed  12500000 out of 50000000 steps (25%).
[03:21:26] Completed  13000000 out of 50000000 steps (26%).
[03:24:42] Completed  13500000 out of 50000000 steps (27%).
[03:27:59] Completed  14000000 out of 50000000 steps (28%).
[03:31:16] Completed  14500000 out of 50000000 steps (29%).
[03:34:32] Completed  15000000 out of 50000000 steps (30%).
[03:37:49] Completed  15500000 out of 50000000 steps (31%).
[03:41:06] Completed  16000000 out of 50000000 steps (32%).
[03:44:23] Completed  16500000 out of 50000000 steps (33%).
[03:47:40] Completed  17000000 out of 50000000 steps (34%).
[03:50:55] Completed  17500000 out of 50000000 steps (35%).
[03:54:11] Completed  18000000 out of 50000000 steps (36%).
[03:56:21] Completed  18500000 out of 50000000 steps (37%).
[03:56:21] mdrun_gpu returned 54
[03:56:21] Nonzero force sum on GPU
[03:56:21] 
[03:56:21] Folding@home Core Shutdown: UNSTABLE_MACHINE
[03:56:24] CoreStatus = 7A (122)
[03:56:24] Sending work to server
[03:56:24] Project: 8010 (Run 0, Clone 1943, Gen 42)
[03:56:24] - Error: Could not get length of results file work/wuresults_06.dat
[03:56:24] - Error: Could not read unit 06 file. Removing from queue.
[03:56:25] Trying to send all finished work units
[03:56:25] + No unsent completed units remaining.
[03:56:25] - Preparing to get new work unit...
[03:56:25] Cleaning up work directory
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project: 8010 (Run 0, Clone 1943, Gen 42)

Post by bruce »

No WUs have been returned to the server, but I can see from your log that V6 removes the WU from the queue without making a report. That's one of the issues that V7 is supposed to fix.

The nonzero force sum means that the GPU failed the self-test. Have you tried underclocking?
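
For anyone comparing failures across runs, here is a minimal Python sketch that pulls the last completed frame and the failure markers out of a V6 client log like the one quoted above. The function name `summarize_fah_log` and the regular expressions are assumptions based only on the line formats visible in this thread; this is not an official tool and other client versions may log differently.

```python
import re

# Patterns for three line types seen in the V6 client log quoted in this thread.
STEP_RE = re.compile(r"Completed\s+(\d+) out of (\d+) steps\s+\((\d+)%\)")
SHUTDOWN_RE = re.compile(r"Core Shutdown: (\w+)")
STATUS_RE = re.compile(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)")


def summarize_fah_log(text):
    """Return the last progress line and the failure markers found in a log."""
    summary = {"last_step": None, "total_steps": None, "percent": None,
               "shutdown": None, "core_status": None}
    for line in text.splitlines():
        step = STEP_RE.search(line)
        if step:
            # Later matches overwrite earlier ones, so the final value is the last frame.
            summary["last_step"] = int(step.group(1))
            summary["total_steps"] = int(step.group(2))
            summary["percent"] = int(step.group(3))
            continue
        shutdown = SHUTDOWN_RE.search(line)
        if shutdown:
            summary["shutdown"] = shutdown.group(1)
            continue
        status = STATUS_RE.search(line)
        if status:
            # The log shows the status as hex followed by decimal; parse the hex form.
            summary["core_status"] = int(status.group(1), 16)
    return summary


# Lines copied from the log in the first post.
sample = """\
[03:54:11] Completed  18000000 out of 50000000 steps (36%).
[03:56:21] Completed  18500000 out of 50000000 steps (37%).
[03:56:21] mdrun_gpu returned 54
[03:56:21] Nonzero force sum on GPU
[03:56:21] Folding@home Core Shutdown: UNSTABLE_MACHINE
[03:56:24] CoreStatus = 7A (122)
"""
print(summarize_fah_log(sample))
```

Running it over the logs of both failed attempts makes it easy to confirm whether they stopped after the same frame.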
v00d00

Re: Project: 8010 (Run 0, Clone 1943, Gen 42)

Post by v00d00 »

Can't change clocks on Linux; I can do it on Windows. The thing that gets me is that it's done so many of them and now I have an error.

The one I got afterwards is up to 49% so far; if it completes, I'll put it down to a quirk. It will complete.

I always wonder whether to report stuff that happens on this box, due to its configuration.

BTW, I have the work directory, should anyone have any interest in it. It was up to 10% on the third run when I killed it, tarred the directory, and then wiped it so I could get something different. If someone needs those files, I can upload them to my web server.
bruce

Re: Project: 8010 (Run 0, Clone 1943, Gen 42)

Post by bruce »

Please do.

I won't be able to do anything with it personally, but the folks who work on problems with the FahCore can try to reproduce it. If it processes past 38%, they will have to assume it's a hardware issue with your system (i.e., an actual "Unstable Machine"), but if it's reproducible on their machine (as you suspect it will be), they'll be able to learn something from the failure.
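
The rule of thumb above can be written down as a small Python sketch. The function `interpret_rerun` and its return strings are hypothetical, made up for illustration; the real triage is a judgment call by whoever reproduces the WU, so treat this as a restatement of the reasoning and not as a procedure anyone actually runs.

```python
def interpret_rerun(original_fail_percent, rerun_percent, rerun_failed):
    """Classify a reproduction attempt on a different machine.

    original_fail_percent: last frame logged before the original failure (37 here).
    rerun_percent: last frame the reproduction attempt reached.
    rerun_failed: True if the reproduction attempt also crashed.
    """
    if rerun_percent > original_fail_percent:
        # The other machine got past the failure point, so the WU itself looks fine.
        return "hardware suspected"
    if rerun_failed and rerun_percent == original_fail_percent:
        # Same crash at the same frame on different hardware points at the WU.
        return "work unit suspected"
    return "inconclusive"


print(interpret_rerun(37, 100, False))
print(interpret_rerun(37, 37, True))
```

A crash at some earlier, unrelated frame falls into the "inconclusive" branch, since it matches neither case described in the post.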
v00d00

Re: Project: 8010 (Run 0, Clone 1943, Gen 42)

Post by v00d00 »

Well, since that one failed I've done 2 more units from P8010 without any problem.

So here is the work/ dir for anyone that might find it useful.

P8010R0C1943G42.tgz
bollix47
Posts: 2941
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: Project: 8010 (Run 0, Clone 1943, Gen 42)

Post by bollix47 »

Another folder was able to complete this WU:

Hi xxxxxx (team xxxxxx),
Your WU (P8010 R0 C1943 G42) was added to the stats database on 2012-09-06 23:08:13 for 2510 points of credit.