P7520(120,6,0) - slow? [Resolved- Bad WU]

Posted: Thu Mar 27, 2014 8:46 pm
by billford
My Linux box has been happily folding P7520s for some time with a TPF of around 8 minutes. It then downloaded 7520(120,6,0) and is suddenly running at a TPF of ~15 minutes. Nothing has changed. The GPU is progressing normally.

I know the TPF for some projects can change quite markedly along the trajectory; is this one of them? I noticed that an earlier one, run 106, got up to about 9 minutes, but a jump to 15 seems a bit drastic!

Log:

Code:
19:12:04:WU00:FS00:Connecting to assign3.stanford.edu:8080
19:12:05:WU00:FS00:News: Welcome to Folding@Home
19:12:05:WU00:FS00:Assigned to work server 128.143.199.97
19:12:05:WU00:FS00:Requesting new work unit for slot 00: READY cpu:3 from 128.143.199.97
19:12:05:WU00:FS00:Connecting to 128.143.199.97:8080
19:12:06:WU00:FS00:Downloading 1.80MiB
19:12:12:WU00:FS00:Download 93.77%
19:12:12:WU00:FS00:Download complete
19:12:12:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:7520 run:120 clone:6 gen:0 core:0xa4 unit:0x00000000fbcb017d51229ad3ef03ca12
19:12:12:WU00:FS00:Starting
19:12:12:WU00:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/www.stanford.edu/~pande/Linux/AMD64/Core_a4.fah/FahCore_a4 -dir 00 -suffix 01 -version 703 -lifeline 1312 -checkpoint 15 -np 3
19:12:12:WU00:FS00:Started FahCore on PID 2104
19:12:12:WU00:FS00:Core PID:2108
19:12:12:WU00:FS00:FahCore 0xa4 started
19:12:13:WU00:FS00:0xa4:
19:12:13:WU00:FS00:0xa4:*------------------------------*
19:12:13:WU00:FS00:0xa4:Folding@Home Gromacs GB Core
19:12:13:WU00:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
19:12:13:WU00:FS00:0xa4:
19:12:13:WU00:FS00:0xa4:Preparing to commence simulation
19:12:13:WU00:FS00:0xa4:- Looking at optimizations...
19:12:13:WU00:FS00:0xa4:- Created dyn
19:12:13:WU00:FS00:0xa4:- Files status OK
19:12:13:WU00:FS00:0xa4:- Expanded 1886562 -> 3322796 (decompressed 176.1 percent)
19:12:13:WU00:FS00:0xa4:Called DecompressByteArray: compressed_data_size=1886562 data_size=3322796, decompressed_data_size=3322796 diff=0
19:12:13:WU00:FS00:0xa4:- Digital signature verified
19:12:13:WU00:FS00:0xa4:
19:12:13:WU00:FS00:0xa4:Project: 7520 (Run 120, Clone 6, Gen 0)
19:12:13:WU00:FS00:0xa4:
19:12:13:WU00:FS00:0xa4:Assembly optimizations on if available.
19:12:13:WU00:FS00:0xa4:Entering M.D.
19:12:19:WU00:FS00:0xa4:Completed 0 out of 1000000 steps  (0%)
19:27:54:WU00:FS00:0xa4:Completed 10000 out of 1000000 steps  (1%)
19:43:21:WU00:FS00:0xa4:Completed 20000 out of 1000000 steps  (2%)
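
The TPF quoted above can be read straight off the "Completed ... steps" lines. The following is a minimal Python sketch of that arithmetic, not part of the client: the function name `frame_times` and the regular expression are illustrative assumptions, and it assumes the lines all belong to one WU and that each percent is one frame.

```python
import re
from datetime import datetime, timedelta

# Matches progress lines such as
# "19:27:54:WU00:FS00:0xa4:Completed 10000 out of 1000000 steps  (1%)"
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):WU\d+:FS\d+:0x[0-9a-f]+:"
    r"Completed \d+ out of \d+ steps\s+\((\d+)%\)"
)

def frame_times(lines):
    """Return the time per frame (1%) between consecutive progress lines."""
    stamps = []
    for line in lines:
        m = PROGRESS.match(line.strip())
        if m:
            stamps.append((int(m.group(2)), datetime.strptime(m.group(1), "%H:%M:%S")))
    stamps.sort(key=lambda s: s[0])  # tolerate logs listed newest-first
    deltas = []
    for (p0, t0), (p1, t1) in zip(stamps, stamps[1:]):
        if p1 == p0:
            continue  # duplicate line for the same frame
        d = t1 - t0
        if d < timedelta(0):
            d += timedelta(days=1)  # log clock wrapped past midnight
        deltas.append(d / (p1 - p0))
    return deltas

log = [
    "19:12:19:WU00:FS00:0xa4:Completed 0 out of 1000000 steps  (0%)",
    "19:27:54:WU00:FS00:0xa4:Completed 10000 out of 1000000 steps  (1%)",
    "19:43:21:WU00:FS00:0xa4:Completed 20000 out of 1000000 steps  (2%)",
]
for d in frame_times(log):
    print(d)
```

For the three lines in the log this gives roughly 15.5 minutes per frame, in line with the ~15 minute TPF reported.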


And because I know someone will ask:


System and config:

Code:
*********************** Log Started 2014-03-27T18:50:44Z ***********************
18:50:44:************************* Folding@home Client *************************
18:50:44:    Website: http://folding.stanford.edu/
18:50:44:  Copyright: (c) 2009-2013 Stanford University
18:50:44:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
18:50:44:       Args: --child --lifeline 1095 /etc/fahclient/config.xml --run-as
18:50:44:             fahclient --pid-file=/var/run/fahclient.pid --daemon
18:50:44:     Config: /etc/fahclient/config.xml
18:50:44:******************************** Build ********************************
18:50:44:    Version: 7.3.6
18:50:44:       Date: Feb 18 2013
18:50:44:       Time: 07:24:08
18:50:44:    SVN Rev: 3923
18:50:44:     Branch: fah/trunk/client
18:50:44:   Compiler: GNU 4.4.7
18:50:44:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
18:50:44:             -fno-unsafe-math-optimizations -msse2
18:50:44:   Platform: linux2 3.2.0-1-amd64
18:50:44:       Bits: 64
18:50:44:       Mode: Release
18:50:44:******************************* System ********************************
18:50:44:        CPU: Intel(R) Core(TM) i5-4430 CPU @ 3.00GHz
18:50:44:     CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
18:50:44:       CPUs: 4
18:50:44:     Memory: 3.82GiB
18:50:44:Free Memory: 3.57GiB
18:50:44:    Threads: POSIX_THREADS
18:50:44:Has Battery: false
18:50:44: On Battery: false
18:50:44: UTC offset: 0
18:50:44:        PID: 1312
18:50:44:        CWD: /var/lib/fahclient
18:50:44:         OS: Linux 3.11.0-12-generic x86_64
18:50:44:    OS Arch: AMD64
18:50:44:       GPUs: 1
18:50:44:      GPU 0: NVIDIA:3 GK106 [GeForce GTX 650 Ti]
18:50:44:       CUDA: 3.0
18:50:44:CUDA Driver: 5050
18:50:44:***********************************************************************
18:50:44:<config>
18:50:44:  <!-- Client Control -->
18:50:44:  <fold-anon v='true'/>
18:50:44:
18:50:44:  <!-- Folding Slot Configuration -->
18:50:44:  <power v='full'/>
18:50:44:
18:50:44:  <!-- HTTP Server -->
18:50:44:  <allow v='127.0.0.1 192.168.1.0/24'/>
18:50:44:
18:50:44:  <!-- Network -->
18:50:44:  <proxy v=':8080'/>
18:50:44:
18:50:44:  <!-- Remote Command Server -->
18:50:44:  <command-allow-no-pass v='127.0.0.1 192.168.1.0/24'/>
18:50:44:
18:50:44:  <!-- Slot Control -->
18:50:44:  <pause-on-start v='true'/>
18:50:44:
18:50:44:  <!-- User Information -->
18:50:44:  <passkey v='********************************'/>
18:50:44:  <user v='<removed>'/>
18:50:44:
18:50:44:  <!-- Folding Slots -->
18:50:44:  <slot id='0' type='CPU'>
18:50:44:    <client-type v='advanced'/>
18:50:44:    <cpus v='3'/>
18:50:44:    <next-unit-percentage v='100'/>
18:50:44:  </slot>
18:50:44:  <slot id='1' type='GPU'>
18:50:44:    <client-type v='advanced'/>
18:50:44:    <next-unit-percentage v='100'/>
18:50:44:  </slot>
18:50:44:</config>

Re: P7520(120,6,0) - slow?

Posted: Thu Mar 27, 2014 10:10 pm
by billford
A further thought: I've got a (much lower-powered) Linux laptop that also picks up P7520s quite frequently. If it gets one like that, it might have a problem meeting the expiry deadline (it runs 24/7)… suggestions?

Re: P7520(120,6,0) - slow?

Posted: Thu Mar 27, 2014 10:20 pm
by Joe_H
Could you look up runs of other Project 7520 WUs in your logs and check the number of steps listed? This WU is listed for 1000000 steps; if the others have a different number, that would point to a bad WU. In the past, a few problem WUs have been created by a WS generating them with an incorrect number of steps.
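
The check described here can be done by hand, but as a rough sketch it could also be scripted. The Python below is only an illustration: `step_totals` and the three regular expressions are hypothetical names and patterns based on the log lines posted in this topic, not anything provided by the client.

```python
import re
from collections import defaultdict

# Patterns modelled on the log lines posted in this topic
WU = re.compile(r":(WU\d+):FS\d+:")
PROJECT = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")
STEPS = re.compile(r"Completed \d+ out of (\d+) steps")

def step_totals(lines):
    """Map project -> {total steps: set of (run, clone, gen)} seen in the log."""
    totals = defaultdict(lambda: defaultdict(set))
    current = {}  # work unit tag such as "WU00" -> (project, run, clone, gen)
    for line in lines:
        wu = WU.search(line)
        if not wu:
            continue
        tag = wu.group(1)
        p = PROJECT.search(line)
        if p:
            current[tag] = tuple(int(g) for g in p.groups())
            continue
        s = STEPS.search(line)
        if s and tag in current:
            project, run, clone, gen = current[tag]
            totals[project][int(s.group(1))].add((run, clone, gen))
    return totals

log = [
    "19:12:13:WU00:FS00:0xa4:Project: 7520 (Run 120, Clone 6, Gen 0)",
    "19:27:54:WU00:FS00:0xa4:Completed 10000 out of 1000000 steps  (1%)",
]
for project, by_total in step_totals(log).items():
    for total, units in sorted(by_total.items()):
        print(project, total, sorted(units))
```

Run on the excerpt from the first post, this reports a single total of 1000000 steps for project 7520. A log that also contains other WUs from the same project would show whether they use a different total, which is the mismatch to look for.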

Re: P7520(120,6,0) - slow?

Posted: Thu Mar 27, 2014 10:31 pm
by billford
Rummaging through all the logs might take a while, but I've got a P7520 running normally on the laptop mentioned above, and that's 500000 steps, as were the four P7520s it completed prior to the current one. So it looks like you have a point :)

What should I do, dump this one?

Re: P7520(120,6,0) - slow?

Posted: Thu Mar 27, 2014 10:36 pm
by Joe_H
Yes, dump this one and I will report it as a bad WU.

P.S. I have run these in the past and 1000000 did not look right, but I am at work and cannot look it up in my own logs.

Re: P7520(120,6,0) - slow?

Posted: Thu Mar 27, 2014 10:38 pm
by billford
Will do.

I'll keep an eye open for any more with the same fault and report them in this topic if that's OK?

Re: P7520(120,6,0) - slow?

Posted: Thu Mar 27, 2014 10:51 pm
by billford
It's picked up one with the right number of steps and a more sensible TPF, thanks for your help :)

Re: P7520(120,6,0) - slow? [Resolved- Bad WU]

Posted: Fri Mar 28, 2014 7:08 pm
by bruce
The person who can fix this problem has been notified.

Re: P7520(120,6,0) - slow? [Resolved- Bad WU]

Posted: Fri Mar 28, 2014 7:25 pm
by billford
Thanks Bruce.

P7520 (Run 5, Clone 7, Gen 0) Slow?

Posted: Fri Mar 28, 2014 7:41 pm
by parkut
Another bad one? My quad-core Linux box is reporting a very low PPD:

Code:
model name   : Intel(R) Core(TM)2 Quad CPU    Q8300  @ 2.50GHz
cpu MHz      : 2497.000
cache size   : 2048 KB
Memory: 1.95GiB
...
Client Version:   7.3.6
Core: FahCore_a4.exe
Core Version:  2.27 (Dec. 15, 2010)
Current Work Unit
-----------------
Name: p7520_ctx-mut
Tag: P7520R5C7G0
Download time: March 27 19:52:33
Due time: April 02 19:52:33
Progress: 69%  [||||||____]
...
Project: 7520 (Run 5, Clone 7, Gen 0)
basecredit: 850
ppd: 2550
creditestimate: 3435
...
18:28:05:WU01:FS00:0xa4:Completed 690000 out of 1000000 steps  (69%)
18:08:42:WU01:FS00:0xa4:Completed 680000 out of 1000000 steps  (68%)
17:49:19:WU01:FS00:0xa4:Completed 670000 out of 1000000 steps  (67%)
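
Whether a doubled-length WU like this still meets its deadline can be estimated from the figures in the status output. The sketch below is an illustration under stated assumptions: `projected_finish` is a hypothetical helper, the TPF is taken to stay constant, the log line is assumed to be from March 28, 2014 (the date of this post), and the log and due times are assumed to be in the same time zone.

```python
from datetime import datetime, timedelta

def projected_finish(last_frame_time, percent_done, tpf):
    """Estimate completion time, assuming the TPF stays constant."""
    return last_frame_time + (100 - percent_done) * tpf

# 18:08:42 -> 18:28:05 in the log above is 19 min 23 s per frame
tpf = timedelta(minutes=19, seconds=23)
last_frame = datetime(2014, 3, 28, 18, 28, 5)  # date assumed from the post date
due = datetime(2014, 4, 2, 19, 52, 33)         # "Due time" from the status output

finish = projected_finish(last_frame, 69, tpf)
print("projected finish:", finish)
print("margin before due time:", due - finish)
```

With these numbers the remaining 31 frames take about 10 hours, so this WU would finish several days before its due time. A slower machine can plug in its own TPF to see how close it would get.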

Re: P7520(120,6,0) - slow? [Resolved- Bad WU]

Posted: Fri Mar 28, 2014 7:46 pm
by Joe_H
Yes, that does appear to be a "bad" WU. The normal number of steps for a WU from Project 7520 is 500000. It might not be bad in the sense of an inaccurate simulation, in that it might calculate all the way to finishing, but the points, etc. will be off.

Re: P7520(120,6,0) - slow? [Resolved- Bad WU]

Posted: Fri Mar 28, 2014 7:55 pm
by bruce
bruce wrote: The person who can fix this problem has been notified.


As you can probably tell from comments in this topic, the problem has occurred before. The project was shut down temporarily. The bad WUs were identified and corrected and that project resumed. My guess is that the same process will be followed this time. I don't know enough facts, so this prediction may or may not apply this time.

Re: P7520(120,6,0) - slow? [Resolved- Bad WU]

Posted: Fri Mar 28, 2014 8:04 pm
by billford
bruce wrote: I don't know enough facts, so this prediction may or may not apply this time.

Not to worry, it's a (very) little milestone for me- my first bad WU :wink:

(Even if I didn't realise what it was at the time, and thought it was a problem at my end!)

Re: P7520(120,6,0) - slow? [Resolved- Bad WU]

Posted: Fri Oct 03, 2014 8:45 pm
by orion456
My quad-core Q6600 is folding P7520 (R33,C1,G418) and (R63,C2,G414), and both are showing 1100 PPD, where SMP normally gets 7,000 to 10,000 on my system. Something must still be wrong with those WUs.

Re: P7520(120,6,0) - slow? [Resolved- Bad WU]

Posted: Tue Oct 07, 2014 9:41 pm
by orion456
I continue to get the P7520 WUs, and at 1100 PPD they aren't worth the power necessary to run them. If I continue to get these, I'm going to shut down those folders until they are fixed.