CPU not folding

Postby rickoic » Thu Jul 06, 2017 2:37 pm

My 12 Core (downgraded to 10 by client) was folding just fine and then all of a sudden it stopped.

The log shows the client being sent to server 171.67.108.45, which fails to respond properly.
It then gets sent to 171.64.65.35 and gets the response "Empty Work Server".

Went to Server Status page and couldn't find either one of those servers listed??
Dual 2.8 3 250's Quad 2.4 285. 260, Quad 2.4 3 250 , i7 2.27 2 250 GPU's, i7 2.24 2 250 GPU's, i7 3.06 bigadv, dual Xeon 2.27 bigadv, AMD Phenom ][ 3 250 GPU's, Laptop GT 130M.
I'm folding because Dec 2005 I had radical prostate surgery.
rickoic
 
Posts: 304
Joined: Sat May 23, 2009 4:49 pm
Location: Mississippi near Memphis, Tn

Re: CPU not folding

Postby JimboPalmer » Thu Jul 06, 2017 2:41 pm

You do know that the servers are all down today and tomorrow, right? viewtopic.php?f=2&t=30126

If you could post the first 100 lines of the log with the configuration, plus the part of the log that fails, we could make even better guesses as to why folding has stopped.
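For anyone unsure how to pull that excerpt out of a long log, here is a minimal Python sketch. It keeps the first 100 lines and any later lines tagged WARNING or ERROR. The function name `log_excerpt` and the regular expression are my own assumptions based on the log format seen in this forum, not part of FAHClient; the sample lines are abbreviated from a log posted in this thread.

```python
import re

# Matches client log lines flagged as warnings or errors, e.g.
# "17:21:36:WARNING:WU00:FS00:Failed to get assignment ..."
PROBLEM = re.compile(r"^\d{2}:\d{2}:\d{2}:(WARNING|ERROR):")

def log_excerpt(lines, head=100):
    """Return the first `head` lines plus any later WARNING/ERROR lines."""
    lines = [line.rstrip("\n") for line in lines]
    excerpt = lines[:head]
    excerpt += [line for line in lines[head:] if PROBLEM.match(line)]
    return excerpt

# Abbreviated sample lines (hypothetical input for illustration).
sample = [
    "17:21:13:Enabled folding slot 00: READY cpu:4",
    "17:21:14:WU00:FS00:Connecting to 171.67.108.45:8080",
    "17:21:36:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.45:8080'",
    "17:21:36:WU00:FS00:Connecting to 171.64.65.35:80",
]
print("\n".join(log_excerpt(sample, head=2)))
```

With a real log you would pass an open file instead of `sample`, for example `log_excerpt(open("log.txt"))`, where the path is whatever your installation uses.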
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
JimboPalmer
 
Posts: 491
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: CPU not folding

Postby Joe_H » Thu Jul 06, 2017 3:47 pm

rickoic wrote:Went to Server Status page and couldn't find either one of those servers listed??


Those servers are Assignment Servers; they are no longer tracked on the Server Status page. Currently just Work Servers and Collection Servers are posted there.

As already mentioned, many of the servers at Stanford are down due to the announced electrical work. The fact that your client can connect to either one of the AS just means that server is in a different location on the campus. The AS may not have any WS's up with assignments for a thread count that is a multiple of 5 to connect your system to. Try pausing the slot and setting it to 8 threads; possibly the AS can then connect your system to one of the WS's not at Stanford that may have suitable work.
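If you want a list of values to try, this small Python sketch enumerates thread counts at or below your core count that are not multiples of 5. The function name and the `avoid` parameter are hypothetical; which counts a Work Server actually has assignments for is decided by the servers, so this only suggests candidates.

```python
def candidate_thread_counts(cores, avoid=5):
    """List thread counts from `cores` down to 1, skipping multiples of `avoid`."""
    return [n for n in range(cores, 0, -1) if n % avoid != 0]

# For a slot currently set to 10 threads:
print(candidate_thread_counts(10))  # [9, 8, 7, 6, 4, 3, 2, 1]
```

8 is in that list, which is why it is a reasonable first value to try.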

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Site Admin
 
Posts: 3814
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: CPU not folding

Postby bruce » Thu Jul 06, 2017 10:16 pm

All of the Assignment Servers have been down this morning and are likely to be down until some time tomorrow. As a result, NOBODY is getting any new WUs.

Several of my hosts are still folding WUs that were distributed yesterday. As they finish, they attempt to upload, and the upload will probably succeed ONLY if the Work Server or Collection Server is at one of the non-Stanford sites. Either way, they still won't get any new work. The client will keep retrying until it gets responses some time tomorrow.
bruce
 
Posts: 21278
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: CPU not folding

Postby davidcoton » Thu Jul 06, 2017 10:25 pm

Just to be a bit contrary:

I set all units to "Finish" last night (UK time), so as not to get any "stuck" WUs since I will be unable to resolve issues when the servers are due back. I shut down my dedicated folding rig, but left the office PC (CPU folding only) running with FAH in the paused state after it uploaded. Then it crashed (clearly it did not like not folding :(), I rebooted, FAH restarted and immediately got an assignment from the 2nd AS. Hope I can finish it before the server power goes (again?).

Code:
*********************** Log Started 2017-07-06T17:21:13Z ***********************
17:21:13:************************* Folding@home Client *************************
17:21:13:        Website: http://folding.stanford.edu/
17:21:13:      Copyright: (c) 2009-2016 Stanford University
17:21:13:         Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:21:13:           Args:
17:21:13:         Config: C:\Users\david\AppData\Roaming\FAHClient\config.xml
17:21:13:******************************** Build ********************************
17:21:13:        Version: 7.4.16
17:21:13:           Date: Jan 6 2017
17:21:13:           Time: 00:25:14
17:21:13:     Repository: Git
17:21:13:       Revision: a9e9e27dc2ee6ff01398c439677bc27f6cb74032
17:21:13:         Branch: master
17:21:13:       Compiler: Visual C++ 2008
17:21:13:        Options: /TP /nologo /EHa /wd4297 /wd4103 /Ox -arch:SSE /MT
17:21:13:       Platform: win32 10
17:21:13:           Bits: 32
17:21:13:           Mode: Release
17:21:13:******************************* System ********************************
17:21:13:            CPU: Intel(R) Core(TM) i5-6400 CPU @ 2.70GHz
17:21:13:         CPU ID: GenuineIntel Family 6 Model 94 Stepping 3
17:21:13:           CPUs: 4
17:21:13:         Memory: 15.91GiB
17:21:13:    Free Memory: 14.04GiB
17:21:13:        Threads: WINDOWS_THREADS
17:21:13:     OS Version: 6.2
17:21:13:    Has Battery: false
17:21:13:     On Battery: false
17:21:13:     UTC Offset: 1
17:21:13:            PID: 6176
17:21:13:            CWD: C:\Users\david\AppData\Roaming\FAHClient
17:21:13:             OS: Windows 10 Home
17:21:13:        OS Arch: AMD64
17:21:13:           GPUs: 0
17:21:13:           CUDA: Not detected: cuInit() returned 999
17:21:13:OpenCL Device 0: Platform:0 Device:0 Bus:NA Slot:NA Compute:2.0 Driver:21.20
17:21:13:  Win32 Service: false
17:21:13:***********************************************************************
17:21:13:<config>
17:21:13:  <!-- Folding Core -->
17:21:13:  <checkpoint v='5'/>
17:21:13:
17:21:13:  <!-- HTTP Server -->
17:21:13:  <allow v='127.0.0.1 192.168.1.0/24'/>
17:21:13:  <deny v='0.0.0.0/0'/>
17:21:13:  <http-addresses v='127.0.0.1:7396 david-ubuntu:7396'/>
17:21:13:
17:21:13:  <!-- Network -->
17:21:13:  <proxy v=':8080'/>
17:21:13:
17:21:13:  <!-- Remote Command Server -->
17:21:13:  <password v='*******'/>
17:21:13:
17:21:13:  <!-- Slot Control -->
17:21:13:  <power v='full'/>
17:21:13:
17:21:13:  <!-- User Information -->
17:21:13:  <passkey v='********************************'/>
17:21:13:  <user v='davidcoton'/>
17:21:13:
17:21:13:  <!-- Web Server -->
17:21:13:  <web-allow v='127.0.0.1 168.192.1.0/24'/>
17:21:13:
17:21:13:  <!-- Folding Slots -->
17:21:13:  <slot id='0' type='CPU'>
17:21:13:    <client-type v='advanced'/>
17:21:13:    <cpus v='4'/>
17:21:13:  </slot>
17:21:13:</config>
17:21:13:Trying to access database...
17:21:13:Successfully acquired database lock
17:21:13:Enabled folding slot 00: READY cpu:4
17:21:13:ERROR:Exception: Could not bind socket to david-ubuntu:7396: The requested address is not valid in its context.
17:21:14:WU00:FS00:Connecting to 171.67.108.45:8080
17:21:36:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.45:8080': Failed to connect to 171.67.108.45:8080: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
17:21:36:WU00:FS00:Connecting to 171.64.65.35:80
17:21:38:WU00:FS00:Assigned to work server 128.252.203.4
17:21:38:WU00:FS00:Requesting new work unit for slot 00: READY cpu:4 from 128.252.203.4
17:21:39:WU00:FS00:Connecting to 128.252.203.4:8080
17:21:39:WU00:FS00:Downloading 7.48MiB
17:21:45:WU00:FS00:Download 96.92%
17:21:45:WU00:FS00:Download complete
17:21:45:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:13801 run:0 clone:1318 gen:42 core:0xa7 unit:0x0000003880fccb0458a5fc9a6c87c682
17:21:46:WU00:FS00:Starting
17:21:46:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:\Users\david\AppData\Roaming\FAHClient\cores/fahwebx.stanford.edu/cores/Win32/AMD64/AVX/Core_a7.fah/FahCore_a7.exe -dir 00 -suffix 01 -version 704 -lifeline 6176 -checkpoint 5 -np 4
17:21:46:WU00:FS00:Started FahCore on PID 7104
17:21:47:WU00:FS00:Core PID:7152
17:21:47:WU00:FS00:FahCore 0xa7 started
17:21:49:WU00:FS00:0xa7:*********************** Log Started 2017-07-06T17:21:48Z ***********************
17:21:49:WU00:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
17:21:49:WU00:FS00:0xa7:       Type: 0xa7
17:21:49:WU00:FS00:0xa7:       Core: Gromacs
17:21:49:WU00:FS00:0xa7:    Website: http://folding.stanford.edu/
17:21:49:WU00:FS00:0xa7:  Copyright: (c) 2009-2016 Stanford University
17:21:49:WU00:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:21:49:WU00:FS00:0xa7:       Args: -dir 00 -suffix 01 -version 704 -lifeline 7104 -checkpoint 5 -np 4
17:21:49:WU00:FS00:0xa7:     Config: <none>
17:21:49:WU00:FS00:0xa7:************************************ Build *************************************
17:21:49:WU00:FS00:0xa7:    Version: 0.0.11
17:21:49:WU00:FS00:0xa7:       Date: Sep 21 2016
17:21:49:WU00:FS00:0xa7:       Time: 01:43:48
17:21:49:WU00:FS00:0xa7: Repository: Git
17:21:49:WU00:FS00:0xa7:   Revision: 957bd90e68d95ddcf1594dc15ff6c64cc4555146
17:21:49:WU00:FS00:0xa7:     Branch: master
17:21:49:WU00:FS00:0xa7:   Compiler: GNU 4.2.1 Compatible Clang 3.9.0 (trunk 274080)
17:21:49:WU00:FS00:0xa7:    Options: -std=gnu++98 -O3 -funroll-loops -ffast-math -mfpmath=sse
17:21:49:WU00:FS00:0xa7:             -fno-unsafe-math-optimizations -msse2 -I/mingw64/include
17:21:49:WU00:FS00:0xa7:             -Wno-inconsistent-dllimport -Wno-parentheses-equality
17:21:49:WU00:FS00:0xa7:             -Wno-deprecated-register -Wno-unused-local-typedef
17:21:49:WU00:FS00:0xa7:   Platform: linux2 4.6.0-1-amd64
17:21:49:WU00:FS00:0xa7:       Bits: 64
17:21:49:WU00:FS00:0xa7:       Mode: Release
17:21:49:WU00:FS00:0xa7:       SIMD: avx_256
17:21:49:WU00:FS00:0xa7:************************************ System ************************************
17:21:49:WU00:FS00:0xa7:        CPU: Intel(R) Core(TM) i5-6400 CPU @ 2.70GHz
17:21:49:WU00:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 94 Stepping 3
17:21:49:WU00:FS00:0xa7:       CPUs: 4
17:21:49:WU00:FS00:0xa7:     Memory: 15.91GiB
17:21:49:WU00:FS00:0xa7:Free Memory: 14.04GiB
17:21:49:WU00:FS00:0xa7:    Threads: WINDOWS_THREADS
17:21:49:WU00:FS00:0xa7: OS Version: 6.2
17:21:49:WU00:FS00:0xa7:Has Battery: false
17:21:49:WU00:FS00:0xa7: On Battery: false
17:21:49:WU00:FS00:0xa7: UTC Offset: 1
17:21:49:WU00:FS00:0xa7:        PID: 7152
17:21:49:WU00:FS00:0xa7:        CWD: C:\Users\david\AppData\Roaming\FAHClient\work
17:21:49:WU00:FS00:0xa7:         OS: Windows 10 Home
17:21:49:WU00:FS00:0xa7:    OS Arch: AMD64
17:21:49:WU00:FS00:0xa7:********************************************************************************
17:21:49:WU00:FS00:0xa7:Project: 13801 (Run 0, Clone 1318, Gen 42)
17:21:49:WU00:FS00:0xa7:Unit: 0x0000003880fccb0458a5fc9a6c87c682
17:21:49:WU00:FS00:0xa7:Reading tar file core.xml
17:21:49:WU00:FS00:0xa7:Reading tar file frame42.tpr
17:21:49:WU00:FS00:0xa7:Digital signatures verified
17:21:49:WU00:FS00:0xa7:Calling: mdrun -s frame42.tpr -o frame42.trr -x frame42.xtc -cpt 5 -nt 4
17:21:49:WU00:FS00:0xa7:Steps: first=10500000 total=250000
17:21:51:WU00:FS00:0xa7:Completed 1 out of 250000 steps (0%)
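
The connection sequence in the log above (first AS times out, second AS assigns a work server) can be picked out with a short Python sketch like the one below. The regular expressions are assumptions based only on the lines shown in this log, and `summarize` is a hypothetical helper, not a FAHClient tool. The sample lines are copied from the log, with the long warning abbreviated.

```python
import re

CONNECT = re.compile(r"Connecting to (\d+\.\d+\.\d+\.\d+):(\d+)")
FAILED = re.compile(r"WARNING:.*Failed to get assignment from '([\d.]+):(\d+)'")
ASSIGNED = re.compile(r"Assigned to work server ([\d.]+)")

def summarize(lines):
    """Collect servers contacted, assignment failures, and the assigned work server."""
    summary = {"contacted": [], "failed": [], "work_server": None}
    for line in lines:
        # Check FAILED first so a warning line is never counted as a connection.
        if m := FAILED.search(line):
            summary["failed"].append(m.group(1))
        elif m := CONNECT.search(line):
            summary["contacted"].append(m.group(1))
        elif m := ASSIGNED.search(line):
            summary["work_server"] = m.group(1)
    return summary

sample = [
    "17:21:14:WU00:FS00:Connecting to 171.67.108.45:8080",
    "17:21:36:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.45:8080': Failed to connect to 171.67.108.45:8080",
    "17:21:36:WU00:FS00:Connecting to 171.64.65.35:80",
    "17:21:38:WU00:FS00:Assigned to work server 128.252.203.4",
    "17:21:39:WU00:FS00:Connecting to 128.252.203.4:8080",
]
print(summarize(sample))
```

Note that `contacted` includes the work server as well as the two Assignment Servers, since the client logs "Connecting to" for both kinds.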
davidcoton
 
Posts: 952
Joined: Wed Nov 05, 2008 3:19 pm
Location: Cambridge, UK

Re: CPU not folding

Postby bruce » Thu Jul 06, 2017 11:39 pm

Facts are not "contrary" when they're observed, they're facts. Electrical repairs on a big campus probably progress in phases, depending on the building the server is in and other factors, so the exact timing cannot be predicted except by the people doing the work. There's nothing wrong with letting your clients run... even though they'll probably not get a regular supply of new work.

That project is hosted on 128.252.203.4 which is at wustl.edu. Their servers are running fine so I expect you'll be able to return it promptly when it finishes. I can't promise whether you'll get a new assignment until tomorrow, though.
bruce
 
Posts: 21278
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: CPU not folding

Postby rickoic » Sat Jul 08, 2017 7:11 pm

Dropped core count use from 10 to 8 and immediately got work unit.
rickoic
 
Posts: 304
Joined: Sat May 23, 2009 4:49 pm
Location: Mississippi near Memphis, Tn

Re: CPU not folding

Postby bruce » Sat Jul 08, 2017 9:11 pm

rickoic wrote:Dropped core count use from 10 to 8 and immediately got work unit.

The other alternative is to install FAHClient 7.4.16 (beta), which will negotiate with the server to find a WU that uses as many of your CPUs as it can and then run with the negotiated number. When that WU finishes, if it had negotiated 8, it will search again and may end up using 9 or 10 or 12 for the next WU, but it will get work.
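As a rough illustration of that idea only, the Python sketch below picks the largest thread count a server offers that still fits within the CPUs you have. This is a toy model under my own assumptions: the function `negotiate_cpus` and the sets of offered counts are hypothetical, and the real client/server exchange is not shown here.

```python
def negotiate_cpus(available, server_counts):
    """Pick the largest thread count the server offers that fits in `available`.

    Returns None when nothing fits.
    """
    usable = [n for n in server_counts if n <= available]
    return max(usable) if usable else None

# A 12-CPU machine against two hypothetical sets of offered counts:
print(negotiate_cpus(12, {4, 8, 16}))  # 8
print(negotiate_cpus(12, {8, 9, 12}))  # 12
```

The point is that the count is chosen per WU, so one WU may run on 8 threads and the next on 12 without you changing the slot by hand.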
bruce
 
Posts: 21278
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: CPU not folding

Postby davidcoton » Tue Jul 25, 2017 8:56 pm

bruce wrote:Facts are not "contrary" when they're observed, they're facts. Electrical repairs on a big campus probably progress in phases, depending on the building the server is in and other factors, so the exact timing cannot be predicted except by the people doing the work. There's nothing wrong with letting your clients run... even though they'll probably not get a regular supply of new work.

Thanks Bruce.
  1. It's me that was being contrary, not the facts :)
  2. I'm an electrician, I do understand something about how work is planned :D
  3. There was something wrong with letting my clients run. I was away for ten days :lol:
davidcoton
 
Posts: 952
Joined: Wed Nov 05, 2008 3:19 pm
Location: Cambridge, UK

