[Solved] ETA and Timeout are the same time

Tununias
Posts: 8
Joined: Thu Apr 02, 2020 4:34 am

[Solved] ETA and Timeout are the same time

Post by Tununias »

Sorry if this is in the wrong section. I only started folding last week so I'm a new donor. I also tried searching for this first.
Anyway, I just got a new CPU WU, which uses 6 of the 8 threads on my Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz. I also have the GPU set up for WUs, and I read that 7 cores is problematic, so I manually set the CPU cores available to 6. The new CPU WU says the following:

Progress: 0.97%
ETA: 5.95 days
Base Credit: 900
Estimated Credit: 900
Estimated PPD: 180
Estimated TPF: 1 hours 12 mins
Project: 13821
FahCore: 0xa7
Assigned: 2020-04-02T03:31:15Z
Timeout: 2020-04-07T03:31:15Z
Expiration: 2020-04-09T03:31:15Z
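
As a quick check of the timestamps and estimates listed above, here is a minimal Python sketch. It assumes a WU is divided into 100 frames (so total time is TPF times 100); that frame count and the variable names are illustrative assumptions, not something the client output states.

```python
from datetime import datetime, timedelta


def parse_utc(stamp: str) -> datetime:
    """Parse a timestamp in the form shown by the client, e.g. 2020-04-02T03:31:15Z."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")


assigned = parse_utc("2020-04-02T03:31:15Z")
timeout = parse_utc("2020-04-07T03:31:15Z")
expiration = parse_utc("2020-04-09T03:31:15Z")

# Time allowed before the WU may be reissued, and before it expires outright.
timeout_window = timeout - assigned
expiration_window = expiration - assigned

# Reported TPF of 1 hour 12 minutes, scaled to an ASSUMED 100 frames per WU.
tpf = timedelta(hours=1, minutes=12)
total_from_tpf = tpf * 100

# Base credit spread over the timeout window, in points per day.
base_credit = 900
ppd_at_timeout_pace = base_credit / timeout_window.days

print("Timeout window:   ", timeout_window)
print("Expiration window:", expiration_window)
print("TPF x 100 frames: ", total_from_tpf)
print("PPD at that pace: ", ppd_at_timeout_pace)
```

Under those assumptions, the reported TPF and PPD both correspond to a total run time equal to the timeout window, and the expiration falls two days after the timeout.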

When I first got the WU, the ETA was exactly 5 days. The timeout is also exactly 5 days, which means that the WU will reach the timeout while I'm uploading the results, and the same WU will simultaneously be downloaded by somebody else. I removed the GPU folding slot and set the CPU to use all 8 threads, but I get this message:

04:01:16:WARNING:WU00:FS00:Changed SMP threads from 6 to 8 this can cause some work units to fail
04:01:16:WARNING:WU00:FS00:AS lowered CPUs from 8 to 6

I don't care so much about the points, but I'm wondering if I'm wasting the project's time by not canceling early to let someone with a better cpu do this job. This is the first time I got one like this. Most WUs will finish in like 3 hours and I didn't set the option to allow those really big WUs. Anyway, I'll keep running this unless I hear otherwise. Thanks.
Last edited by Tununias on Fri Apr 03, 2020 2:41 am, edited 1 time in total.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: ETA and Timeout are the same time

Post by bruce »

Question: When you first got the WU, how would it have been possible to estimate when your computer would be able to finish it? (Personally, I wish somebody would inhibit that message until 5 or 10% of the WU has been completed and a realistic estimate can be provided.)

The estimated completion is a complete guess which will be refined as soon as your computer completes enough of the WU to get some sense of the speed of your computer. It will continue to be refined as it gathers more data.

The message about changing the SMP thread count is a generic message. It exists because of limitations in the GROMACS software, which may not be able to process the assignment with the number of threads you've selected. Actually, both 6 and 8 are safe, but the message is triggered by any change.
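
To illustrate why some thread counts are treated as risky while 6 and 8 are fine, here is a rough Python sketch. It assumes a rule of thumb that a thread count is problematic when it has a prime factor larger than 5; that threshold and the helper names are assumptions made for illustration, not the client's or GROMACS's actual logic.

```python
# ASSUMPTION: prime factors above this value are treated as problematic.
ASSUMED_MAX_PRIME = 5


def largest_prime_factor(n: int) -> int:
    """Return the largest prime factor of n (n must be at least 2)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        # Whatever remains after trial division is itself prime.
        largest = n
    return largest


def looks_safe(threads: int) -> bool:
    """Hypothetical check: accept counts built only from small primes."""
    if threads < 1:
        raise ValueError("threads must be positive")
    return threads == 1 or largest_prime_factor(threads) <= ASSUMED_MAX_PRIME


for count in (6, 7, 8):
    print(count, looks_safe(count))
```

Under this assumed rule, 6 (2 x 3) and 8 (2 x 2 x 2) pass, while 7 does not, which matches the advice in this thread.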
Neil-B
Posts: 2027
Joined: Sun Mar 22, 2020 5:52 pm
Hardware configuration: 1: 2x Xeon E5-2697v3@2.60GHz, 512GB DDR4 LRDIMM, SSD Raid, Win10 Ent 20H2, Quadro K420 1GB, FAH 7.6.21
2: Xeon E3-1505Mv5@2.80GHz, 32GB DDR4, NVME, Win10 Pro 20H2, Quadro M1000M 2GB, FAH 7.6.21 (actually have two of these)
3: i7-960@3.20GHz, 12GB DDR3, SSD, Win10 Pro 20H2, GTX 750Ti 2GB, GTX 1080Ti 11GB, FAH 7.6.21
Location: UK

Re: ETA and Timeout are the same time

Post by Neil-B »

You are still within the expiration date … the WU may be reissued at Timeout to someone else who might themselves take a few days, so by continuing to process the WU you are not wasting the project's time … If a WU looks as if it will exceed the Expiration, then posting here and getting it checked would be a good thing (as there may be some issue with the WU that needs to be known about), at which point someone from the team would indicate that work on it should be discontinued and help you with the best/proper way to do this.
2x Xeon E5-2697v3, 512GB DDR4 LRDIMM, SSD Raid, W10-Ent, Quadro K420
Xeon E3-1505Mv5, 32GB DDR4, NVME, W10-Pro, Quadro M1000M
i7-960, 12GB DDR3, SSD, W10-Pro, GTX1080Ti
i9-10850K, 64GB DDR4, NVME, W11-Pro, RTX3070

Tununias
Posts: 8
Joined: Thu Apr 02, 2020 4:34 am

Re: ETA and Timeout are the same time

Post by Tununias »

I'm confused now. I'm on Arch Linux and I did the system update that I do weekly. I paused the job, manually closed the client and restarted the computer. When I started everything up again, I had to reconfigure everything and the WU was missing. I went to the log file in /opt/fah/log.txt and the last thing in it is me shutting it down.
So in the middle of me writing this and looking around, I see that after restarting, the files that are in /opt/fah have been recreated in my home directory. The config.xml file, configs directory, cores directory, work directory, GPUs.txt file and log.txt file are all in my home directory, and everything in /opt/fah is being ignored.
I don't know how to make it go back to its old files. The WU is probably still there but I don't see how to tell it to finish that one. Here's what it says about the WU in the old log file:
Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13821 run:707 clone:1 gen:83 core:0xa7 unit:0x0000006480fccb095c88386e61558b93
I'm not sure where to report this as abandoned...
Tununias
Posts: 8
Joined: Thu Apr 02, 2020 4:34 am

Re: ETA and Timeout are the same time

Post by Tununias »

Ok, never mind, I'm dumb. After I restarted, I ran the client from my home folder instead of from /opt/fah/. I have one WU that will be completed in 8 hours, and then I'll restart the client and finish up that other one if I don't hear otherwise.
Tununias
Posts: 8
Joined: Thu Apr 02, 2020 4:34 am

Re: ETA and Timeout are the same time

Post by Tununias »

So now the ETA is 12.09 days and progress is at 1.28%. This WU is absurd.
anandhanju
Posts: 526
Joined: Mon Dec 03, 2007 4:33 am
Location: Australia

Re: ETA and Timeout are the same time

Post by anandhanju »

Can you post the contents of your FAH log? There have been reports of a few bad WUs that have too many steps, and looking at the log would help determine whether this is the case with your WU.
Tununias
Posts: 8
Joined: Thu Apr 02, 2020 4:34 am

Re: ETA and Timeout are the same time

Post by Tununias »

anandhanju wrote:Can you post the contents of your FAH log? There have been reports of a few bad WUs that have too many steps, and looking at the log would help determine whether this is the case with your WU.
Ok, this is from when I started up again:

23:25:05:INFO(1):Read GPUs.txt
23:25:05:************************* Folding@home Client *************************
23:25:05:        Website: https://foldingathome.org/
23:25:05:      Copyright: (c) 2009-2018 foldingathome.org
23:25:05:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
23:25:05:           Args: 
23:25:05:         Config: /opt/fah/config.xml
23:25:05:******************************** Build ********************************
23:25:05:        Version: 7.5.1
23:25:05:           Date: May 11 2018
23:25:05:           Time: 19:59:04
23:25:05:     Repository: Git
23:25:05:       Revision: 4705bf53c635f88b8fe85af7675557e15d491ff0
23:25:05:         Branch: master
23:25:05:       Compiler: GNU 6.3.0 20170516
23:25:05:        Options: -std=gnu++98 -O3 -funroll-loops
23:25:05:       Platform: linux2 4.14.0-3-amd64
23:25:05:           Bits: 64
23:25:05:           Mode: Release
23:25:05:******************************* System ********************************
23:25:05:            CPU: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
23:25:05:         CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
23:25:05:           CPUs: 8
23:25:05:         Memory: 31.32GiB
23:25:05:    Free Memory: 17.34GiB
23:25:05:        Threads: POSIX_THREADS
23:25:05:     OS Version: 5.5
23:25:05:    Has Battery: false
23:25:05:     On Battery: false
23:25:05:     UTC Offset: -4
23:25:05:            PID: 36798
23:25:05:            CWD: /opt/fah
23:25:05:             OS: Linux 5.5.13-arch2-1 x86_64
23:25:05:        OS Arch: AMD64
23:25:05:           GPUs: 1
23:25:05:          GPU 0: Bus:1 Slot:0 Func:0 NVIDIA:7 TU117 [GeForce GTX 1650]
23:25:05:  CUDA Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:7.5 Driver:10.2
23:25:05:OpenCL Device 0: Platform:0 Device:0 Bus:1 Slot:0 Compute:1.2 Driver:440.64
23:25:05:***********************************************************************
23:25:05:<config>
23:25:05:  <!-- Network -->
23:25:05:  <proxy v=':8080'/>
23:25:05:
23:25:05:  <!-- Slot Control -->
23:25:05:  <power v='full'/>
23:25:05:
23:25:05:  <!-- User Information -->
23:25:05:  <passkey v='********************************'/>
23:25:05:  <team v='45032'/>
23:25:05:  <user v='Tununias'/>
23:25:05:
23:25:05:  <!-- Folding Slots -->
23:25:05:  <slot id='0' type='CPU'>
23:25:05:    <cpus v='8'/>
23:25:05:    <paused v='true'/>
23:25:05:  </slot>
23:25:05:</config>
23:25:05:Trying to access database...
23:25:05:Successfully acquired database lock
23:25:05:Enabled folding slot 00: PAUSED cpu:8 (by user)
23:25:18:FS00:Unpaused
23:25:18:WU00:FS00:Starting
23:25:18:WARNING:WU00:FS00:AS lowered CPUs from 8 to 6
23:25:18:WU00:FS00:Running FahCore: /opt/fah/FAHCoreWrapper /opt/fah/cores/cores.foldingathome.org/v7/lin/64bit/avx/Core_a7.fah/FahCore_a7 -dir 00 -suffix 01 -version 705 -lifeline 36798 -checkpoint 15 -np 6
23:25:18:WU00:FS00:Started FahCore on PID 36817
23:25:18:WU00:FS00:Core PID:36821
23:25:18:WU00:FS00:FahCore 0xa7 started
23:25:19:WU00:FS00:0xa7:*********************** Log Started 2020-04-02T23:25:19Z ***********************
23:25:19:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
23:25:19:WU00:FS00:0xa7:       Type: 0xa7
23:25:19:WU00:FS00:0xa7:       Core: Gromacs
23:25:19:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 705 -lifeline 36817 -checkpoint 15 -np
23:25:19:WU00:FS00:0xa7:             6
23:25:19:WU00:FS00:0xa7:************************************ CBang *************************************
23:25:19:WU00:FS00:0xa7:       Date: Nov 5 2019
23:25:19:WU00:FS00:0xa7:       Time: 06:06:57
23:25:19:WU00:FS00:0xa7:   Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
23:25:19:WU00:FS00:0xa7:     Branch: master
23:25:19:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
23:25:19:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
23:25:19:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
23:25:19:WU00:FS00:0xa7:       Bits: 64
23:25:19:WU00:FS00:0xa7:       Mode: Release
23:25:19:WU00:FS00:0xa7:************************************ System ************************************
23:25:19:WU00:FS00:0xa7:        CPU: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
23:25:19:WU00:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 60 Stepping 3
23:25:19:WU00:FS00:0xa7:       CPUs: 8
23:25:19:WU00:FS00:0xa7:     Memory: 31.32GiB
23:25:19:WU00:FS00:0xa7:Free Memory: 17.28GiB
23:25:19:WU00:FS00:0xa7:    Threads: POSIX_THREADS
23:25:19:WU00:FS00:0xa7: OS Version: 5.5
23:25:19:WU00:FS00:0xa7:Has Battery: false
23:25:19:WU00:FS00:0xa7: On Battery: false
23:25:19:WU00:FS00:0xa7: UTC Offset: -4
23:25:19:WU00:FS00:0xa7:        PID: 36821
23:25:19:WU00:FS00:0xa7:        CWD: /opt/fah/work
23:25:19:WU00:FS00:0xa7:******************************** Build - libFAH ********************************
23:25:19:WU00:FS00:0xa7:    Version: 0.0.18
23:25:19:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
23:25:19:WU00:FS00:0xa7:  Copyright: 2019 foldingathome.org
23:25:19:WU00:FS00:0xa7:   Homepage: https://foldingathome.org/
23:25:19:WU00:FS00:0xa7:       Date: Nov 5 2019
23:25:19:WU00:FS00:0xa7:       Time: 06:13:26
23:25:19:WU00:FS00:0xa7:   Revision: 490c9aa2957b725af319379424d5c5cb36efb656
23:25:19:WU00:FS00:0xa7:     Branch: master
23:25:19:WU00:FS00:0xa7:   Compiler: GNU 8.3.0
23:25:19:WU00:FS00:0xa7:    Options: -std=c++11 -O3 -funroll-loops -fno-pie
23:25:19:WU00:FS00:0xa7:   Platform: linux2 4.19.0-5-amd64
23:25:19:WU00:FS00:0xa7:       Bits: 64
23:25:19:WU00:FS00:0xa7:       Mode: Release
23:25:19:WU00:FS00:0xa7:************************************ Build *************************************
23:25:19:WU00:FS00:0xa7:       SIMD: avx_256
23:25:19:WU00:FS00:0xa7:********************************************************************************
23:25:19:WU00:FS00:0xa7:Project: 13821 (Run 707, Clone 1, Gen 83)
23:25:19:WU00:FS00:0xa7:Unit: 0x0000006480fccb095c88386e61558b93
23:25:19:WU00:FS00:0xa7:Digital signatures verified
23:25:19:WU00:FS00:0xa7:Calling: mdrun -s frame83.tpr -o frame83.trr -x frame83.xtc -cpi state.cpt -cpt 15 -nt 6
23:25:19:WU00:FS00:0xa7:Steps: first=10375000 total=10375000
23:25:21:WU00:FS00:0xa7:Completed 70002 out of 10375000 steps (0%)
23:26:06:Removing old file 'configs/config-20200329-072135.xml'
23:26:06:Saving configuration to config.xml
23:26:06:<config>
23:26:06:  <!-- Network -->
23:26:06:  <proxy v=':8080'/>
23:26:06:
23:26:06:  <!-- Slot Control -->
23:26:06:  <power v='full'/>
23:26:06:
23:26:06:  <!-- User Information -->
23:26:06:  <passkey v='********************************'/>
23:26:06:  <team v='45032'/>
23:26:06:  <user v='Tununias'/>
23:26:06:
23:26:06:  <!-- Folding Slots -->
23:26:06:  <slot id='0' type='CPU'>
23:26:06:    <cpus v='8'/>
23:26:06:  </slot>
23:26:06:</config>
00:23:18:WU00:FS00:0xa7:Completed 103750 out of 10375000 steps (1%)
The WU on my other computer only has 500,000 steps, and this one has 10,375,000 steps...
It's currently 01:42:00 at 1.45% progress.
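
For reference, the two "Completed ... steps" lines in the log above are enough for a rough remaining-time estimate. The following Python sketch assumes a constant step rate between the two lines and a log clock that wraps at most once past midnight; the function names are hypothetical, and the first line was logged right after a restart, so the figure is only approximate.

```python
import re
from datetime import datetime, timedelta

# Matches the clock time, completed steps and total steps of a progress line.
PROGRESS = re.compile(r"^(\d{2}:\d{2}:\d{2}):.*Completed (\d+) out of (\d+) steps")


def parse_progress(line: str):
    """Return (clock time, steps done, total steps) from one log line."""
    match = PROGRESS.match(line)
    if match is None:
        raise ValueError("not a progress line: " + line)
    clock = datetime.strptime(match.group(1), "%H:%M:%S")
    return clock, int(match.group(2)), int(match.group(3))


def remaining_time(first_line: str, second_line: str) -> timedelta:
    """Extrapolate the time left, assuming the step rate stays constant."""
    t0, done0, total = parse_progress(first_line)
    t1, done1, _ = parse_progress(second_line)
    elapsed = t1 - t0
    if elapsed < timedelta(0):
        # The log only records time of day, so add a day when it wraps.
        elapsed += timedelta(days=1)
    steps_per_second = (done1 - done0) / elapsed.total_seconds()
    return timedelta(seconds=(total - done1) / steps_per_second)


first = "23:25:21:WU00:FS00:0xa7:Completed 70002 out of 10375000 steps (0%)"
second = "00:23:18:WU00:FS00:0xa7:Completed 103750 out of 10375000 steps (1%)"

eta = remaining_time(first, second)
print("Estimated days remaining:", round(eta.total_seconds() / 86400, 2))
```

With these two lines the estimate comes out a little over 12 days, in the same range as the ETA the client reports.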
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: ETA and Timeout are the same time

Post by bruce »

Did anybody read my earlier post? Let's try again.

See my more complete explanation here
Tununias
Posts: 8
Joined: Thu Apr 02, 2020 4:34 am

Re: ETA and Timeout are the same time

Post by Tununias »

bruce wrote:Did anybody read my earlier post? Let's try again.

See my more complete explanation here
It's kind of hard to when I made my last post before you made that one. Also, it's been going for 2 hours and 48 minutes, plus 30-60 minutes the day before, and it's only at 1.63% with an ETA of 12.05 days. This is no wild guess.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: ETA and Timeout are the same time

Post by bruce »

Assignment for CPU or for GPU?
Running continuously, or starting and stopping?

Are the ETA and Timeout still the same time?
Tohya
Posts: 49
Joined: Thu Feb 07, 2008 12:41 am

Re: ETA and Timeout are the same time

Post by Tohya »

10M steps is a misconfigured project.

Project 13821 was already reported. viewtopic.php?f=19&t=33844
Tununias
Posts: 8
Joined: Thu Apr 02, 2020 4:34 am

Re: ETA and Timeout are the same time

Post by Tununias »

bruce wrote:Assignment for CPU or for GPU?
Running continuously, or starting and stopping?
When I first posted, I had run it for between half an hour and an hour before I did a reboot. Now it's been going for 3 hours and 8 minutes.
This is for CPU using 6 threads. It gains 0.01% of progress every 100 seconds. I have another computer of the same generation with 4 threads, and it moves 0.01% every 3 seconds.
So the regular WU on my other computer is 33.33 times faster and is estimated at 8 hours. And since I've yet to see a WU suddenly speed up its progress, I'm calculating 266.64 hours, or 11.11 days, although the client is saying 12.03 days.
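
The same comparison can be worked directly from the observed progress rates. This is a small Python sketch that assumes progress advances in steady 0.01% ticks, so a full WU is 10,000 ticks; the names are illustrative.

```python
# 100% divided into 0.01% increments gives 10,000 ticks per WU.
TICKS_PER_WU = 10_000


def full_wu_seconds(seconds_per_tick: float) -> float:
    """Total run time if every 0.01% tick takes the same number of seconds."""
    return seconds_per_tick * TICKS_PER_WU


slow_seconds = full_wu_seconds(100)  # this WU: 0.01% every 100 seconds
fast_seconds = full_wu_seconds(3)    # other computer: 0.01% every 3 seconds

slow_days = slow_seconds / 86400
fast_hours = fast_seconds / 3600
ratio = slow_seconds / fast_seconds

print("This WU, days:          ", round(slow_days, 2))
print("Other computer, hours:  ", round(fast_hours, 2))
print("Ratio of the two rates: ", round(ratio, 2))
```

Measured this way, the slow WU needs roughly 11.6 days against a bit over 8 hours for the other machine, a ratio of about 33 to 1, which sits between the hand calculation and the client's ETA.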
Last edited by Tununias on Fri Apr 03, 2020 2:42 am, edited 1 time in total.
Tununias
Posts: 8
Joined: Thu Apr 02, 2020 4:34 am

Re: ETA and Timeout are the same time

Post by Tununias »

Tohya wrote:10M steps is a misconfigured project.

Project 13821 was already reported. viewtopic.php?f=19&t=33844
Ok. I guess I'll delete this WU then. Thank you. :D
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: ETA and Timeout are the same time

Post by bruce »

Yes, follow the link given above and dump the WU with 10375000 steps.