Rejecting from WS 155.247.166.219 [also CS .166.220]

Moderators: Site Moderators, FAHC Science Team

7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengeance (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicone fan gaskets and silicone mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: Rejecting from WS 155.247.166.219 [also CS .166.220]

Post by 7im »

The folding admins are all doing this without getting any points for helping, so it's all from the goodness of their hearts. ;)
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
pdbuzz
Posts: 8
Joined: Tue Feb 11, 2014 4:01 am

Re: Rejecting from WS 155.247.166.219 [also CS .166.220]

Post by pdbuzz »

Which is why I stated 'Folding' admins, not forum admins. This is the only place where we, as contributors, can make any kind of statement that MIGHT find its way to Dr. Pande's group.
7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: Rejecting from WS 155.247.166.219 [also CS .166.220]

Post by 7im »

Got it. We call them Researchers, or PIs (principal investigators). Ultimately, each is responsible for his/her own projects and servers.
RMouse
Posts: 146
Joined: Wed Jun 13, 2012 6:15 am

Uploading WU failing

Post by RMouse »

I have finished a WU but find in the log continual failure to upload the results. Anything I can do to get this fixed? I have tried uploading the results both at home and at my work location, both with no success.

Code:

*********************** Log Started 2014-04-17T00:23:16Z ***********************
00:23:16:************************* Folding@home Client *************************
00:23:16:      Website: http://folding.stanford.edu/
00:23:16:    Copyright: (c) 2009-2013 Stanford University
00:23:16:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
00:23:16:         Args: --open-web-control
00:23:16:       Config: C:/Users/*/AppData/Roaming/FAHClient/config.xml
00:23:16:******************************** Build ********************************
00:23:16:      Version: 7.3.6
00:23:16:         Date: Feb 18 2013
00:23:16:         Time: 15:25:17
00:23:16:      SVN Rev: 3923
00:23:16:       Branch: fah/trunk/client
00:23:16:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
00:23:16:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
00:23:16:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
00:23:16:     Platform: win32 XP
00:23:16:         Bits: 32
00:23:16:         Mode: Release
00:23:16:******************************* System ********************************
00:23:16:          CPU: Intel(R) Core(TM) i3-3217U CPU @ 1.80GHz
00:23:16:       CPU ID: GenuineIntel Family 6 Model 58 Stepping 9
00:23:16:         CPUs: 4
00:23:16:       Memory: 7.88GiB
00:23:16:  Free Memory: 5.05GiB
00:23:16:      Threads: WINDOWS_THREADS
00:23:16:  Has Battery: true
00:23:16:   On Battery: false
00:23:16:   UTC offset: 8
00:23:16:          PID: 1388
00:23:16:          CWD: C:/Users/*/AppData/Roaming/FAHClient
00:23:16:           OS: Windows 7 Professional
00:23:16:      OS Arch: AMD64
00:23:16:         GPUs: 0
00:23:16:         CUDA: Not detected
00:23:16:Win32 Service: false
00:23:16:***********************************************************************
00:23:16:<config>
00:23:16:  <!-- Folding Slot Configuration -->
00:23:16:  <power v='full'/>
00:23:16:
00:23:16:  <!-- Network -->
00:23:16:  <proxy v=':8080'/>
00:23:16:
00:23:16:  <!-- Slot Control -->
00:23:16:  <pause-on-battery v='false'/>
00:23:16:
00:23:16:  <!-- User Information -->
00:23:16:  <passkey v='********************************'/>
00:23:16:  <user v='rattym'/>
00:23:16:
00:23:16:  <!-- Folding Slots -->
00:23:16:  <slot id='0' type='CPU'/>
00:23:16:</config>
00:23:16:Trying to access database...
00:23:16:Successfully acquired database lock
00:23:16:Enabled folding slot 00: READY cpu:4
00:23:17:WU01:FS00:Starting
00:23:17:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/Users/*/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a4.fah/FahCore_a4.exe -dir 01 -suffix 01 -version 703 -lifeline 1388 -checkpoint 15 -np 4
00:23:17:WU01:FS00:Started FahCore on PID 5472
00:23:18:WU01:FS00:Core PID:3540
00:23:18:WU01:FS00:FahCore 0xa4 started
00:23:18:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:6367 run:19 clone:10 gen:23 core:0xa4 unit:0x000000180002894b5323457653fc7527
00:23:18:WU00:FS00:Uploading 1.23MiB to 155.247.166.219
00:23:18:WU00:FS00:Connecting to 155.247.166.219:8080
00:23:18:WU01:FS00:0xa4:
00:23:18:WU01:FS00:0xa4:*------------------------------*
00:23:18:WU01:FS00:0xa4:Folding@Home Gromacs GB Core
00:23:18:WU01:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
00:23:18:WU01:FS00:0xa4:
00:23:18:WU01:FS00:0xa4:Preparing to commence simulation
00:23:18:WU01:FS00:0xa4:- Looking at optimizations...
00:23:18:WU01:FS00:0xa4:- Files status OK
00:23:18:WU01:FS00:0xa4:- Expanded 875149 -> 1434864 (decompressed 163.9 percent)
00:23:18:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=875149 data_size=1434864, decompressed_data_size=1434864 diff=0
00:23:18:WU01:FS00:0xa4:- Digital signature verified
00:23:18:WU01:FS00:0xa4:
00:23:18:WU01:FS00:0xa4:Project: 9006 (Run 1109, Clone 3, Gen 18)
00:23:18:WU01:FS00:0xa4:
00:23:18:WU01:FS00:0xa4:Assembly optimizations on if available.
00:23:18:WU01:FS00:0xa4:Entering M.D.
00:23:24:WU01:FS00:0xa4:Using Gromacs checkpoints
00:23:24:WU01:FS00:0xa4:Mapping NT from 4 to 4 
00:23:25:WU01:FS00:0xa4:Resuming from checkpoint
00:23:25:WU01:FS00:0xa4:Verified 01/wudata_01.log
00:23:25:WU01:FS00:0xa4:Verified 01/wudata_01.trr
00:23:25:WU01:FS00:0xa4:Verified 01/wudata_01.xtc
00:23:25:WU01:FS00:0xa4:Verified 01/wudata_01.edr
00:23:25:WU01:FS00:0xa4:Completed 152275 out of 250000 steps  (60%)
00:23:54:WU01:FS00:0xa4:Completed 152500 out of 250000 steps  (61%)
00:24:18:WU00:FS00:Upload 10.17%
00:24:18:WARNING:WU00:FS00:Exception: Failed to send results to work server: Transfer failed
00:24:18:WU00:FS00:Trying to send results to collection server
00:24:18:WU00:FS00:Uploading 1.23MiB to 155.247.166.220
00:24:18:WU00:FS00:Connecting to 155.247.166.220:8080
00:25:18:WU00:FS00:Upload 10.17%
00:25:18:ERROR:WU00:FS00:Exception: Transfer failed
00:25:21:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:6367 run:19 clone:10 gen:23 core:0xa4 unit:0x000000180002894b5323457653fc7527
00:25:21:WU00:FS00:Uploading 1.23MiB to 155.247.166.219
00:25:21:WU00:FS00:Connecting to 155.247.166.219:8080
00:25:42:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
00:25:42:WU00:FS00:Connecting to 155.247.166.219:80
00:25:42:WU00:FS00:Upload 5.09%
00:25:48:WU00:FS00:Upload 15.26%
00:25:48:WARNING:WU00:FS00:Exception: Failed to send results to work server: 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
00:25:48:WU00:FS00:Trying to send results to collection server
00:25:48:WU00:FS00:Uploading 1.23MiB to 155.247.166.220
00:25:48:WU00:FS00:Connecting to 155.247.166.220:8080
00:26:46:WU00:FS00:Upload 10.17%
00:26:46:ERROR:WU00:FS00:Exception: Transfer failed
00:26:46:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:6367 run:19 clone:10 gen:23 core:0xa4 unit:0x000000180002894b5323457653fc7527
00:26:46:WU00:FS00:Uploading 1.23MiB to 155.247.166.219
00:26:46:WU00:FS00:Connecting to 155.247.166.219:8080
00:27:46:WU00:FS00:Upload 10.17%
00:27:46:WARNING:WU00:FS00:Exception: Failed to send results to work server: Transfer failed
00:27:46:WU00:FS00:Trying to send results to collection server
00:27:46:WU00:FS00:Uploading 1.23MiB to 155.247.166.220
00:27:46:WU00:FS00:Connecting to 155.247.166.220:8080
00:28:45:WU00:FS00:Upload 10.17%
00:28:45:ERROR:WU00:FS00:Exception: Transfer failed
00:28:45:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:6367 run:19 clone:10 gen:23 core:0xa4 unit:0x000000180002894b5323457653fc7527
00:28:45:WU00:FS00:Uploading 1.23MiB to 155.247.166.219
00:28:45:WU00:FS00:Connecting to 155.247.166.219:8080
00:29:06:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
00:29:06:WU00:FS00:Connecting to 155.247.166.219:80
00:29:06:WU00:FS00:Upload 5.09%
00:29:12:WARNING:WU00:FS00:Exception: Failed to send results to work server: 10001: Server responded: HTTP_INTERNAL_SERVER_ERROR
00:29:12:WU00:FS00:Trying to send results to collection server
00:29:12:WU00:FS00:Uploading 1.23MiB to 155.247.166.220
00:29:12:WU00:FS00:Connecting to 155.247.166.220:8080
Joe_H
Site Admin
Posts: 7870
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Uploading WU failing

Post by Joe_H »

RMouse wrote: I have finished a WU but find in the log continual failure to upload the results. Anything I can do to get this fixed? I have tried uploading the results both at home and at my work location, both with no success.
There is not much you can do until the server is fixed. From the other posts in this topic, which I merged your post with, you can see that others are also waiting to upload to this WS and its CS.
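
For anyone who wants to confirm that the fault is on the server side rather than in a local firewall or proxy, a quick TCP check of the two addresses can help. The following is a minimal Python sketch, not part of FAHClient: `probe` and `SERVERS` are names made up for this illustration, and the addresses and ports are simply the ones that appear in the logs in this topic.

```python
import socket
from typing import Callable

# Addresses and ports copied from the logs in this topic (assumed still current).
SERVERS = [
    ("155.247.166.219", 8080),  # work server (WS)
    ("155.247.166.219", 80),
    ("155.247.166.220", 8080),  # collection server (CS)
    ("155.247.166.220", 80),
]


def probe(host: str, port: int, timeout: float = 5.0,
          connect: Callable = socket.create_connection) -> str:
    """Try one TCP connection and classify the outcome.

    Returns 'open', 'refused', 'timeout', or 'error: <message>'.
    The connect argument exists only so the function can be tested
    without touching the network.
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
    conn.close()
    return "open"
```

Looping over `SERVERS` and printing `probe(host, port)` for each entry gives a one-line status per port. Keep in mind that "open" only means the TCP handshake succeeded. The logs in this topic show uploads that connected and then stalled at about 10%, so an open port alone does not prove the server is healthy.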

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
pdbuzz
Posts: 8
Joined: Tue Feb 11, 2014 4:01 am

Re: Rejecting from WS 155.247.166.219 [also CS .166.220]

Post by pdbuzz »

I was only able to include one machine's logs. The four machines with the issues are from projects (1 box) 6367, (2 boxes) 6369, and (1 box) 6370. They all have expiration dates from the 25th to the 26th. As I mentioned before, I'd sure hate to lose that work. It's too bad there isn't another way that we can find out more about this and why the server isn't accepting the work.

It seems like there have been a lot of unusual issues over the past month, between GPU work issues, points, and wildly variable projects. I took a break for a while before coming back to the folding game. February and some of March seemed to be OK, but now...

Oh well, I suppose it's part of the (unwritten) rules of the road.
billford
Posts: 1005
Joined: Thu May 02, 2013 8:46 pm
Hardware configuration: Full Time:

2x NVidia GTX 980
1x NVidia GTX 780 Ti
2x 3GHz Core i5 PC (Linux)

Retired:

3.2GHz Core i5 PC (Linux)
3.2GHz Core i5 iMac
2.8GHz Core i5 iMac
2.16GHz Core 2 Duo iMac
2GHz Core 2 Duo MacBook
1.6GHz Core 2 Duo Acer laptop
Location: Near Oxford, United Kingdom

Re: Rejecting from WS 155.247.166.219 [also CS .166.220]

Post by billford »

I see from my logs that after a client has successfully uploaded a WU for a different project, at least one of the ASs is still sending clients to 155.247.166.219 to get a new one.

It's not a major item, but it wastes time until the client is re-assigned to a working WS (sometimes it seems to need several tries; see edit). Could this option not be disabled until the problem is fixed?


edit:

Code:

09:58:16:WU00:FS00:Connecting to assign3.stanford.edu:8080
09:58:16:WU00:FS00:News: Welcome to Folding@Home
09:58:16:WU00:FS00:Assigned to work server 155.247.166.219
09:58:16:WU00:FS00:Requesting new work unit for slot 00: RUNNING cpu:4 from 155.247.166.219
09:58:16:WU00:FS00:Connecting to 155.247.166.219:8080
09:58:25:WU02:FS00:0xa4:
09:58:25:WU02:FS00:0xa4:Finished Work Unit:
09:58:25:WU02:FS00:0xa4:- Reading up to 870480 from "02/wudata_01.trr": Read 870480
09:58:25:WU02:FS00:0xa4:trr file hash check passed.
09:58:25:WU02:FS00:0xa4:- Reading up to 798012 from "02/wudata_01.xtc": Read 798012
09:58:25:WU02:FS00:0xa4:xtc file hash check passed.
09:58:25:WU02:FS00:0xa4:edr file hash check passed.
09:58:25:WU02:FS00:0xa4:logfile size: 22720
09:58:25:WU02:FS00:0xa4:Leaving Run
09:58:30:WU02:FS00:0xa4:- Writing 1693652 bytes of core data to disk...
09:58:30:WU02:FS00:0xa4:Done: 1693140 -> 1642215 (compressed to 96.9 percent)
09:58:30:WU02:FS00:0xa4:  ... Done.
09:58:31:WU02:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
09:58:31:WU02:FS00:Sending unit results: id:02 state:SEND error:NO_ERROR project:9006 run:1056 clone:2 gen:37 core:0xa4 unit:0x00000027664f2de4533b2e24929b8a5d
09:58:31:WU02:FS00:Uploading 1.57MiB to 171.64.65.124
09:58:31:WU02:FS00:Connecting to 171.64.65.124:8080
09:58:35:WU02:FS00:Upload complete
09:58:35:WU02:FS00:Server responded WORK_ACK (400)
09:58:35:WU02:FS00:Final credit estimate, 1621.00 points
09:58:35:WU02:FS00:Cleaning up
10:00:13:ERROR:WU00:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
10:00:13:WU00:FS00:Connecting to assign3.stanford.edu:8080
10:00:13:WU00:FS00:News: Welcome to Folding@Home
10:00:13:WU00:FS00:Assigned to work server 155.247.166.219
10:00:13:WU00:FS00:Requesting new work unit for slot 00: READY cpu:4 from 155.247.166.219
10:00:13:WU00:FS00:Connecting to 155.247.166.219:8080
10:02:12:ERROR:WU00:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
10:02:12:WU00:FS00:Connecting to assign3.stanford.edu:8080
10:02:12:WU00:FS00:News: Welcome to Folding@Home
10:02:12:WU00:FS00:Assigned to work server 155.247.166.219
10:02:12:WU00:FS00:Requesting new work unit for slot 00: READY cpu:4 from 155.247.166.219
10:02:12:WU00:FS00:Connecting to 155.247.166.219:8080
10:06:03:ERROR:WU00:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
10:06:03:WU00:FS00:Connecting to assign3.stanford.edu:8080
10:06:03:WU00:FS00:News: Welcome to Folding@Home
10:06:03:WU00:FS00:Assigned to work server 155.247.166.219
10:06:03:WU00:FS00:Requesting new work unit for slot 00: READY cpu:4 from 155.247.166.219
10:06:03:WU00:FS00:Connecting to 155.247.166.219:8080
10:07:59:ERROR:WU00:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
10:08:40:WU00:FS00:Connecting to assign3.stanford.edu:8080
10:08:41:WU00:FS00:News: Welcome to Folding@Home
10:08:41:WU00:FS00:Assigned to work server 155.247.166.219
10:08:41:WU00:FS00:Requesting new work unit for slot 00: READY cpu:4 from 155.247.166.219
10:08:41:WU00:FS00:Connecting to 155.247.166.219:8080
10:10:38:ERROR:WU00:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
10:12:54:WU00:FS00:Connecting to assign3.stanford.edu:8080
10:12:55:WU00:FS00:News: Welcome to Folding@Home
10:12:55:WU00:FS00:Assigned to work server 155.247.166.219
10:12:55:WU00:FS00:Requesting new work unit for slot 00: READY cpu:4 from 155.247.166.219
10:12:55:WU00:FS00:Connecting to 155.247.166.219:8080
10:14:49:ERROR:WU00:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
10:19:46:WU00:FS00:Connecting to assign3.stanford.edu:8080
10:19:46:WU00:FS00:News: Welcome to Folding@Home
10:19:46:WU00:FS00:Assigned to work server 155.247.166.219
10:19:46:WU00:FS00:Requesting new work unit for slot 00: READY cpu:4 from 155.247.166.219
10:19:46:WU00:FS00:Connecting to 155.247.166.219:8080
10:21:02:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
10:21:02:WU00:FS00:Connecting to 155.247.166.219:80
10:21:02:ERROR:WU00:FS00:Exception: Failed to connect to 155.247.166.219:80: Connection refused
10:30:51:WU00:FS00:Connecting to assign3.stanford.edu:8080
10:30:52:WU00:FS00:News: Welcome to Folding@Home
10:30:52:WU00:FS00:Assigned to work server 155.247.166.219
10:30:52:WU00:FS00:Requesting new work unit for slot 00: READY cpu:4 from 155.247.166.219
10:30:52:WU00:FS00:Connecting to 155.247.166.219:8080
10:34:25:ERROR:WU00:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
10:48:48:WU00:FS00:Connecting to assign3.stanford.edu:8080
10:48:48:WU00:FS00:News: Welcome to Folding@Home
10:48:48:WU00:FS00:Assigned to work server 171.64.65.124
10:48:49:WU00:FS00:Requesting new work unit for slot 00: READY cpu:4 from 171.64.65.124
10:48:49:WU00:FS00:Connecting to 171.64.65.124:8080
10:48:50:WU00:FS00:Downloading 855.31KiB
10:48:52:WU00:FS00:Download complete
10:48:52:WU00:FS00:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9006 run:584 clone:2 gen:27 core:0xa4 unit:0x0000001e664f2de4533b2c11e0fa6ef7
Nearly an hour's folding time wasted!
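
The lost time can be read straight off a log like the one above. Below is a small Python sketch that measures the gap between the first "Assigned to work server" line and the first "Download complete" line, and counts how often each work server was handed out along the way. The function name `assignment_delay` is invented for this example, and the parsing assumes the `HH:MM:SS:` prefix seen in these excerpts.

```python
import re
from datetime import datetime, timedelta

# Every client log line in this topic starts with an HH:MM:SS: prefix.
LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}):(.*)$")
ASSIGNED = re.compile(r"Assigned to work server (\S+)")


def assignment_delay(log_text: str):
    """Return (delay, attempts).

    delay is the time from the first assignment to the first
    'Download complete' line, or None if no download completed.
    attempts maps each work server address to its assignment count.
    """
    first = None
    attempts = {}
    for raw in log_text.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        rest = match.group(2)
        assigned = ASSIGNED.search(rest)
        if assigned:
            if first is None:
                first = stamp
            server = assigned.group(1)
            attempts[server] = attempts.get(server, 0) + 1
        elif "Download complete" in rest and first is not None:
            delay = stamp - first
            if delay < timedelta(0):
                delay += timedelta(days=1)  # the log crossed midnight
            return delay, attempts
    return None, attempts
```

On the excerpt above, the first assignment is logged at 09:58:16 and the download completes at 10:48:52, a gap of 50 minutes 36 seconds, with eight assignments to 155.247.166.219 before the client was sent to 171.64.65.124.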
vvoelz
Pande Group Member
Posts: 539
Joined: Sun Dec 02, 2007 8:07 pm
Location: Temple University, Philadelphia PA

Re: Rejecting from WS 155.247.166.219 [also CS .166.220]

Post by vvoelz »

Hi all -- I would like everyone to know that we are aware of the problem and working hard to fix it. Both the work server 155.247.166.219 and collection server 155.247.166.220 are having network connectivity issues. I apologize for the inconvenience; hopefully we'll get everything back up and running soon. --Vince
vvoelz
Pande Group Member
Posts: 539
Joined: Sun Dec 02, 2007 8:07 pm
Location: Temple University, Philadelphia PA

Re: Rejecting from WS 155.247.166.219 [also CS .166.220]

Post by vvoelz »

Update -- the collection server 155.247.166.220 is back up (and not surprisingly getting a lot of traffic), so hopefully... Still working on 155.247.166.219. I'll keep you posted.
billford
Posts: 1005
Joined: Thu May 02, 2013 8:46 pm
Location: Near Oxford, United Kingdom

Re: Rejecting from WS 155.247.166.219 [also CS .166.220]

Post by billford »

You beat me to it- I was just about to post that one of my clients had got rid of its backlog :)

Thanks for posting.
7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: Rejecting from WS 155.247.166.219 [also CS .166.220]

Post by 7im »

billford wrote:...snip...

It's not a major item, but it wastes time until the client is re-assigned to a working WS (sometimes it seems to need several tries; see edit). Could this option not be disabled until the problem is fixed?
This is by design. There is nothing to fix, except the server that is offline.

You see one symptom under one circumstance, and assume it was broken. However the assignment logic is programmed to handle multiple circumstances. It may not do each one well but it needs to do them all well enough.
billford
Posts: 1005
Joined: Thu May 02, 2013 8:46 pm
Location: Near Oxford, United Kingdom

Re: Rejecting from WS 155.247.166.219 [also CS .166.220]

Post by billford »

7im wrote: You see one symptom under one circumstance, and assume it was broken.
I see another instance of a problem that has occurred before, and a simple way to ameliorate some of the consequential effects.

I also see a refusal to admit that the F@H project could possibly have shortcomings- many are due to financial constraints and have to be lived with, others could be overcome or reduced by simple procedural changes.
7im wrote: However the assignment logic is programmed to handle multiple circumstances. It may not do each one well but it needs to do them all well enough.
It didn't handle it at all, it just ignored it. And my suggestion was for a few moments of manual intervention when it was realised that it was not being handled well.
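
To make the suggestion concrete, here is a purely hypothetical Python sketch of an assignment step that skips any work server an operator has flagged as down. Nothing in this topic describes how the real assignment servers are implemented, so `WorkServer`, its `accepting` flag, and `assign` are invented names, and the random choice merely stands in for whatever criteria the real logic uses.

```python
import random
from dataclasses import dataclass


@dataclass
class WorkServer:
    address: str
    accepting: bool = True  # hypothetical flag an operator could clear by hand


def assign(servers, rng=random):
    """Pick a work server at random from those not flagged as down.

    Returns the chosen address, or None when no server is available.
    rng may be the random module or a random.Random instance.
    """
    candidates = [s for s in servers if s.accepting]
    return rng.choice(candidates).address if candidates else None
```

Clearing `accepting` for 155.247.166.219 would correspond to the few moments of manual intervention suggested above. The real logic has to cover more circumstances than this sketch does, so treat it as an illustration of the idea only.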
TobyKY76
Posts: 5
Joined: Thu Apr 17, 2014 3:13 pm

155.247.166.219

Post by TobyKY76 »

Mod edit: Merged with existing topic

I've been trying to send my WU since yesterday morning.

Log posted below

Code:

*********************** Log Started 2014-04-17T07:22:14Z ***********************
07:22:14:************************* Folding@home Client *************************
07:22:14:      Website: http://folding.stanford.edu/
07:22:14:    Copyright: (c) 2009-2014 Stanford University
07:22:14:       Author: Joseph Coffland <joseph@cauldrondevelopment.com>
07:22:14:         Args: 
07:22:14:       Config: C:/Users/OFFICE 3/AppData/Roaming/FAHClient/config.xml
07:22:14:******************************** Build ********************************
07:22:14:      Version: 7.4.4
07:22:14:         Date: Mar 4 2014
07:22:14:         Time: 20:26:54
07:22:14:      SVN Rev: 4130
07:22:14:       Branch: fah/trunk/client
07:22:14:     Compiler: Intel(R) C++ MSVC 1500 mode 1200
07:22:14:      Options: /TP /nologo /EHa /Qdiag-disable:4297,4103,1786,279 /Ox -arch:SSE
07:22:14:               /QaxSSE2,SSE3,SSSE3,SSE4.1,SSE4.2 /Qopenmp /Qrestrict /MT /Qmkl
07:22:14:     Platform: win32 XP
07:22:14:         Bits: 32
07:22:14:         Mode: Release
07:22:14:******************************* System ********************************
07:22:14:          CPU: AMD Athlon(tm) II X2 220 Processor
07:22:14:       CPU ID: AuthenticAMD Family 16 Model 6 Stepping 3
07:22:14:         CPUs: 2
07:22:14:       Memory: 3.75GiB
07:22:14:  Free Memory: 2.76GiB
07:22:14:      Threads: WINDOWS_THREADS
07:22:14:   OS Version: 6.1
07:22:14:  Has Battery: false
07:22:14:   On Battery: false
07:22:14:   UTC Offset: -4
07:22:14:          PID: 3920
07:22:14:          CWD: C:/Users/OFFICE 3/AppData/Roaming/FAHClient
07:22:14:           OS: Windows 7 Home Premium
07:22:14:      OS Arch: AMD64
07:22:14:         GPUs: 1
07:22:14:        GPU 0: UNSUPPORTED: RS880 [Radeon HD 4200]
07:22:14:         CUDA: Not detected
07:22:14:Win32 Service: false
07:22:14:***********************************************************************
07:22:14:<config>
07:22:14:  <!-- Network -->
07:22:14:  <proxy v=':8080'/>
07:22:14:
07:22:14:  <!-- Slot Control -->
07:22:14:  <power v='FULL'/>
07:22:14:
07:22:14:  <!-- User Information -->
07:22:14:  <passkey v='********************************'/>
07:22:14:  <team v='11108'/>
07:22:14:  <user v='TobyKY76'/>
07:22:14:
07:22:14:  <!-- Folding Slots -->
07:22:14:  <slot id='0' type='CPU'/>
07:22:14:</config>
07:22:14:Trying to access database...
07:22:15:Successfully acquired database lock
07:22:15:Enabled folding slot 00: READY cpu:2
07:22:17:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
07:22:19:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
07:22:19:WU01:FS00:Connecting to 155.247.166.219:8080
07:22:19:WU00:FS00:Starting
07:22:19:WU00:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" "C:/Users/OFFICE 3/AppData/Roaming/FAHClient/cores/www.stanford.edu/~pande/Win32/AMD64/Core_a3.fah/FahCore_a3.exe" -dir 00 -suffix 01 -version 704 -lifeline 3920 -checkpoint 15 -np 2
07:22:21:WU00:FS00:Started FahCore on PID 4724
07:22:26:WU00:FS00:Core PID:4920
07:22:26:WU00:FS00:FahCore 0xa3 started
07:22:26:WU00:FS00:0xa3:
07:22:26:WU00:FS00:0xa3:*------------------------------*
07:22:26:WU00:FS00:0xa3:Folding@Home Gromacs SMP Core
07:22:26:WU00:FS00:0xa3:Version 2.27 (Dec. 15, 2010)
07:22:26:WU00:FS00:0xa3:
07:22:26:WU00:FS00:0xa3:Preparing to commence simulation
07:22:26:WU00:FS00:0xa3:- Ensuring status. Please wait.
07:22:36:WU00:FS00:0xa3:- Looking at optimizations...
07:22:36:WU00:FS00:0xa3:- Working with standard loops on this execution.
07:22:36:WU00:FS00:0xa3:- Previous termination of core was improper.
07:22:36:WU00:FS00:0xa3:- Files status OK
07:22:37:WU00:FS00:0xa3:- Expanded 3848514 -> 4382860 (decompressed 113.8 percent)
07:22:37:WU00:FS00:0xa3:Called DecompressByteArray: compressed_data_size=3848514 data_size=4382860, decompressed_data_size=4382860 diff=0
07:22:37:WU00:FS00:0xa3:- Digital signature verified
07:22:37:WU00:FS00:0xa3:
07:22:37:WU00:FS00:0xa3:Project: 8568 (Run 1, Clone 7, Gen 352)
07:22:37:WU00:FS00:0xa3:
07:22:37:WU00:FS00:0xa3:Entering M.D.
07:22:40:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
07:22:40:WU01:FS00:Connecting to 155.247.166.219:80
07:22:41:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 155.247.166.219:80: No connection could be made because the target machine actively refused it.
07:22:41:WU01:FS00:Trying to send results to collection server
07:22:42:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
07:22:42:WU01:FS00:Connecting to 155.247.166.220:8080
07:22:43:WU00:FS00:0xa3:Using Gromacs checkpoints
07:22:44:WU00:FS00:0xa3:Mapping NT from 2 to 2 
07:22:45:WU00:FS00:0xa3:Resuming from checkpoint
07:22:46:WU00:FS00:0xa3:Verified 00/wudata_01.log
07:22:48:WU00:FS00:0xa3:Verified 00/wudata_01.trr
07:22:48:WU00:FS00:0xa3:Verified 00/wudata_01.edr
07:22:49:WU00:FS00:0xa3:Completed 110450 out of 500000 steps  (22%)
07:23:10:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
07:23:10:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
07:23:10:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
07:23:10:WU01:FS00:Connecting to 155.247.166.219:8080
07:23:19:WU01:FS00:Upload 5.10%
07:26:16:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
07:26:16:WU01:FS00:Trying to send results to collection server
07:26:16:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
07:26:16:WU01:FS00:Connecting to 155.247.166.220:8080
07:26:38:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
07:26:39:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
07:26:39:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
07:26:39:WU01:FS00:Connecting to 155.247.166.219:8080
07:26:48:WU01:FS00:Upload 5.10%
07:29:45:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
07:29:45:WU01:FS00:Trying to send results to collection server
07:29:45:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
07:29:45:WU01:FS00:Connecting to 155.247.166.220:8080
07:29:54:WU01:FS00:Upload 5.10%
07:32:52:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
07:32:52:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
07:32:52:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
07:32:52:WU01:FS00:Connecting to 155.247.166.219:8080
07:33:13:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
07:33:13:WU01:FS00:Connecting to 155.247.166.219:80
07:33:15:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 155.247.166.219:80: No connection could be made because the target machine actively refused it.
07:33:15:WU01:FS00:Trying to send results to collection server
07:33:15:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
07:33:15:WU01:FS00:Connecting to 155.247.166.220:8080
07:36:15:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
07:36:15:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
07:36:15:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
07:36:15:WU01:FS00:Connecting to 155.247.166.219:8080
07:36:37:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
07:36:37:WU01:FS00:Connecting to 155.247.166.219:80
07:36:38:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 155.247.166.219:80: No connection could be made because the target machine actively refused it.
07:36:38:WU01:FS00:Trying to send results to collection server
07:36:38:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
07:36:38:WU01:FS00:Connecting to 155.247.166.220:8080
07:37:00:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
07:40:30:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
07:40:30:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
07:40:30:WU01:FS00:Connecting to 155.247.166.219:8080
07:43:30:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
07:43:30:WU01:FS00:Trying to send results to collection server
07:43:30:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
07:43:30:WU01:FS00:Connecting to 155.247.166.220:8080
07:43:56:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
07:47:21:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
07:47:21:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
07:47:21:WU01:FS00:Connecting to 155.247.166.219:8080
07:47:45:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
07:47:45:WU01:FS00:Trying to send results to collection server
07:47:45:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
07:47:45:WU01:FS00:Connecting to 155.247.166.220:8080
07:48:07:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
07:58:27:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
07:58:27:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
07:58:27:WU01:FS00:Connecting to 155.247.166.219:8080
07:58:49:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
07:58:49:WU01:FS00:Trying to send results to collection server
07:58:49:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
07:58:49:WU01:FS00:Connecting to 155.247.166.220:8080
07:59:10:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
08:08:53:WU00:FS00:0xa3:Completed 115000 out of 500000 steps  (23%)
08:16:24:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
08:16:24:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
08:16:24:WU01:FS00:Connecting to 155.247.166.219:8080
08:16:33:WU01:FS00:Upload 5.10%
08:19:30:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
08:19:30:WU01:FS00:Trying to send results to collection server
08:19:30:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
08:19:30:WU01:FS00:Connecting to 155.247.166.220:8080
08:19:39:WU01:FS00:Upload 5.10%
08:22:37:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
08:45:26:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
08:45:26:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
08:45:26:WU01:FS00:Connecting to 155.247.166.219:8080
08:48:26:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
08:48:26:WU01:FS00:Trying to send results to collection server
08:48:26:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
08:48:26:WU01:FS00:Connecting to 155.247.166.220:8080
08:51:27:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
08:54:28:WU00:FS00:0xa3:Completed 120000 out of 500000 steps  (24%)
09:32:24:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
09:32:24:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
09:32:24:WU01:FS00:Connecting to 155.247.166.219:8080
09:32:46:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
09:32:46:WU01:FS00:Connecting to 155.247.166.219:80
09:32:47:WARNING:WU01:FS00:Exception: Failed to send results to work server: Failed to connect to 155.247.166.219:80: No connection could be made because the target machine actively refused it.
09:32:47:WU01:FS00:Trying to send results to collection server
09:32:47:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
09:32:47:WU01:FS00:Connecting to 155.247.166.220:8080
09:33:08:WARNING:WU01:FS00:WorkServer connection failed on port 8080 trying 80
09:33:08:WU01:FS00:Connecting to 155.247.166.220:80
09:33:10:ERROR:WU01:FS00:Exception: Failed to connect to 155.247.166.220:80: No connection could be made because the target machine actively refused it.
09:40:02:WU00:FS00:0xa3:Completed 125000 out of 500000 steps  (25%)
10:25:39:WU00:FS00:0xa3:Completed 130000 out of 500000 steps  (26%)
10:48:25:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
10:48:25:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
10:48:25:WU01:FS00:Connecting to 155.247.166.219:8080
10:48:47:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
10:48:47:WU01:FS00:Trying to send results to collection server
10:48:47:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
10:48:47:WU01:FS00:Connecting to 155.247.166.220:8080
10:49:09:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
11:11:15:WU00:FS00:0xa3:Completed 135000 out of 500000 steps  (27%)
11:56:49:WU00:FS00:0xa3:Completed 140000 out of 500000 steps  (28%)
12:44:28:WU00:FS00:0xa3:Completed 145000 out of 500000 steps  (29%)
12:51:25:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:6369 run:18 clone:2 gen:39 core:0xa4 unit:0x000000290002894b5327673b7bf61937
12:51:25:WU01:FS00:Uploading 1.23MiB to 155.247.166.219
12:51:25:WU01:FS00:Connecting to 155.247.166.219:8080
12:51:46:WARNING:WU01:FS00:Exception: Failed to send results to work server: 10002: Received short response, expected 512 bytes, got 0
12:51:46:WU01:FS00:Trying to send results to collection server
12:51:46:WU01:FS00:Uploading 1.23MiB to 155.247.166.220
12:51:46:WU01:FS00:Connecting to 155.247.166.220:8080
12:52:15:ERROR:WU01:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
******************************* Date: 2014-04-17 *******************************
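
The retry pattern in the log above can be made explicit with a short script. This is only a minimal sketch and not part of the FAH client: the function name `retry_gaps` and the regular expression are assumptions made for illustration, and the sample lines are abbreviated copies of the "Sending unit results" entries above.

```python
import re
from datetime import timedelta

# Matches the start of each upload attempt; timestamps are HH:MM:SS with no date.
ATTEMPT = re.compile(r"^(\d{2}):(\d{2}):(\d{2}):WU\d+:FS\d+:Sending unit results")


def retry_gaps(lines):
    """Return the time between consecutive upload attempts as timedeltas.

    Assumes every attempt falls on the same day (no midnight rollover).
    """
    starts = []
    for line in lines:
        match = ATTEMPT.match(line)
        if match:
            hours, minutes, seconds = map(int, match.groups())
            starts.append(timedelta(hours=hours, minutes=minutes, seconds=seconds))
    return [later - earlier for earlier, later in zip(starts, starts[1:])]


# Attempt lines abbreviated from the log above (trailing fields dropped).
sample = [
    "07:47:21:WU01:FS00:Sending unit results: id:01 state:SEND",
    "07:58:27:WU01:FS00:Sending unit results: id:01 state:SEND",
    "08:16:24:WU01:FS00:Sending unit results: id:01 state:SEND",
    "08:45:26:WU01:FS00:Sending unit results: id:01 state:SEND",
    "09:32:24:WU01:FS00:Sending unit results: id:01 state:SEND",
    "10:48:25:WU01:FS00:Sending unit results: id:01 state:SEND",
    "12:51:25:WU01:FS00:Sending unit results: id:01 state:SEND",
]

for gap in retry_gaps(sample):
    print(gap)
```

On these lines the gaps grow from about 11 minutes to about 2 hours, each roughly 1.6 times the previous one. That is consistent with an increasing back-off between attempts, though it is an observation from this one log rather than a documented client setting.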
7im

Re: Rejecting from WS 155.247.166.219 [also CS .166.220]

Post by 7im »

billford wrote:
7im wrote: You see one symptom under one circumstance, and assume it was broken.
I see another instance of a problem that has occurred before, and a simple way to ameliorate some of the consequential effects.

I also see a refusal to admit that the F@H project could possibly have shortcomings. Many are due to financial constraints and have to be lived with; others could be overcome or reduced by simple procedural changes.
7im wrote: However, the assignment logic is programmed to handle multiple circumstances. It may not do each one well, but it needs to do them all well enough.
It didn't handle it at all, it just ignored it. And my suggestion was for a few moments of manual intervention when it was realised that it was not being handled well.

With their limited resources, I would rather they spend the time fixing the server than changing the assignment settings, only to waste time putting the assignment settings back after the server is fixed.

The assignments are programmed to fail over in a specific order after a designated interval. After more than 10 years of tweaks to their servers, they pretty much work as intended, except for the customary hardware maintenance issues when you run that many servers.
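
The client log earlier in this thread shows one such fail-over order for uploads: work server first, then collection server, with port 80 tried only when the connection on port 8080 itself fails. Below is a rough sketch of that order as inferred from the log alone. It is not the actual client code; `Result`, `send_with_failover`, and the `try_send` callback are hypothetical names used only for illustration.

```python
from enum import Enum


class Result(Enum):
    OK = "ok"
    CONNECT_FAILED = "connect_failed"  # e.g. connection refused or timed out
    BAD_RESPONSE = "bad_response"      # e.g. "Received short response"


def send_with_failover(servers, try_send, ports=(8080, 80)):
    """Try each server in order; return the (host, port) that accepted, or None.

    try_send(host, port) is a caller-supplied function returning a Result.
    """
    for host in servers:
        for port in ports:
            result = try_send(host, port)
            if result is Result.OK:
                return (host, port)
            if result is Result.BAD_RESPONSE:
                # The server answered but the upload failed: skip to the next server.
                break
            # CONNECT_FAILED: fall through and try the next port on this host.
    return None


# Stub that refuses every connection, as in the 09:32 entries of the log.
attempts = []


def refuse_all(host, port):
    attempts.append((host, port))
    return Result.CONNECT_FAILED


outcome = send_with_failover(["155.247.166.219", "155.247.166.220"], refuse_all)
print(outcome, attempts)
```

When every target fails, the function returns `None`, and the caller would wait and retry the whole sequence later, which is what the repeated "Sending unit results" entries in the log suggest.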

Sorry that you expect Google-level uptime from a startup-level lab, but it isn't going to happen. Being a FAH researcher is not a full-time job. Researchers teach and have other demands on their time. They don't run out in the middle of a class to fix a downed server. That's why FAH has many servers for redundancy. No one wants the research done more quickly than the researcher does, but they have real-world expectations. They don't sit in front of the server all day long.
TobyKY76
Posts: 5
Joined: Thu Apr 17, 2014 3:13 pm

Re: Rejecting from WS 155.247.166.219 [also CS .166.220]

Post by TobyKY76 »

Had a power outage just now. After rebooting the computer, the WU sent with no issues (that I know of).