130.237.232.237

3.0charlie
Posts: 13
Joined: Wed Jul 29, 2009 4:34 pm

Re: 130.237.232.237

Post by 3.0charlie »

3.0charlie wrote: This server still has issues with uploading completed WUs and downloading new ones. It has been like this for 2 days straight on my 4P rig. Even manual uploads (for 2 completed WUs) using the -send x flag do not work, using both the normal and alternate sites. I manually changed the Machine ID to 2, to no avail. Still stuck at 'Loaded queue successfully' after being assigned to 130.237.232.237.

edit: obviously, after typing this and with the machine running overnight, it finally decided to download a WU (6903) and get to work. It took 14 hours... and I still don't know if the upload will work.
The 6903 upload was successful to 130.237.232.237, but the server is very slow to assign the next work unit (2 hours and counting, still no WU downloaded).

edit: server assignment for a new WU (P6903) took 2 hours and 15 minutes.
Folding for Hardware Canucks
autogrog
Posts: 38
Joined: Mon Aug 18, 2008 3:38 pm
Location: Halifax, Nova Scotia

Re: 130.237.232.237

Post by autogrog »

After successfully uploading a 6901, it has been handing out an endless stream of the erroneous 512-byte WUs:

[13:30:56] Connecting to http://130.237.232.237:8080/
[13:30:57] Posted data.
[13:30:57] Initial: 0000; - Receiving payload (expected size: 512)
. . .
[16:33:47] Connecting to http://130.237.232.237:8080/
[16:33:47] Posted data.
[16:33:47] Initial: 0000; - Receiving payload (expected size: 512)

Is this part of the transition to BA16?
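
For anyone who wants to check their own logs for these dud assignments, here is a minimal Python sketch. It assumes the log lines look like the ones quoted above and that an expected size of exactly 512 bytes marks a dud (that threshold is taken from this thread, not from any client documentation). The function and variable names are hypothetical.

```python
import re

# Matches client log lines such as:
# [13:30:57] Initial: 0000; - Receiving payload (expected size: 512)
PAYLOAD_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2})\].*Receiving payload \(expected size: (\d+)\)"
)

# Assumption: payloads of exactly this size are the duds reported in this thread.
DUD_SIZE = 512


def find_dud_payloads(log_text, dud_size=DUD_SIZE):
    """Return the timestamps of payload lines whose expected size equals dud_size."""
    duds = []
    for line in log_text.splitlines():
        match = PAYLOAD_RE.match(line.strip())
        if match and int(match.group(2)) == dud_size:
            duds.append(match.group(1))
    return duds


sample = """\
[13:30:56] Connecting to http://130.237.232.237:8080/
[13:30:57] Posted data.
[13:30:57] Initial: 0000; - Receiving payload (expected size: 512)
[14:17:07] Initial: 0000; - Receiving payload (expected size: 57249659)
[16:33:47] Initial: 0000; - Receiving payload (expected size: 512)
"""

print(find_dud_payloads(sample))  # ['13:30:57', '16:33:47']
```

Running it over a full FAHlog.txt would list every time the server handed out a 512-byte payload, which may help when reporting how often it happens.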
kasson
Pande Group Member
Posts: 1459
Joined: Thu Nov 29, 2007 9:37 pm

Re: 130.237.232.237

Post by kasson »

No--repeated errors are not part of planned transitions. I think there's something going on with the server, but we might have to take it off assign to diagnose (and hopefully to fix it). Are people getting any successful WU's from the server right now?
bollix47
Posts: 2941
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: 130.237.232.237

Post by bollix47 »

Got a successful WU earlier between the times shown by autogrog:

[14:16:53] + Attempting to get work packet
[14:16:53] Passkey found
[14:16:53] - Will indicate memory of 12033 MB
[14:16:53] - Connecting to assignment server
[14:16:53] Connecting to http://assign.stanford.edu:8080/
[14:16:53] Posted data.
[14:16:53] Initial: ED82; - Successful: assigned to (130.237.232.237).
[14:16:53] + News From Folding@Home: Welcome to Folding@Home
[14:16:54] Loaded queue successfully.
[14:16:54] Sent data
[14:16:54] Connecting to http://130.237.232.237:8080/
[14:17:07] Posted data.
[14:17:07] Initial: 0000; - Receiving payload (expected size: 57249659)
[14:17:52] - Downloaded at ~1242 kB/s
[14:17:52] - Averaged speed for that direction ~921 kB/s
[14:17:52] + Received work.
[14:17:52] Trying to send all finished work units
[14:17:52] + No unsent completed units remaining.
[14:17:52] + Closed connections
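
The download rate reported in this log can be reproduced from the payload size and the timestamps. The sketch below is only a sanity check under assumptions of mine: that the transfer runs from the "Posted data" stamp (14:17:07) to the "Downloaded at" stamp (14:17:52), that the client's "kB" means 1024 bytes, and that the client truncates the figure. The helper names are hypothetical.

```python
from datetime import datetime, timedelta


def elapsed_seconds(start, end):
    """Seconds between two HH:MM:SS log stamps, assuming at most one midnight rollover."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    if delta < timedelta(0):
        delta += timedelta(days=1)
    return delta.total_seconds()


def rate_kib_per_s(num_bytes, start, end):
    """Average transfer rate in units of 1024 bytes per second."""
    return num_bytes / elapsed_seconds(start, end) / 1024


# Values taken from the log above: 57249659 bytes in 45 seconds.
rate = rate_kib_per_s(57249659, "14:17:07", "14:17:52")
print(int(rate))  # 1242, matching "Downloaded at ~1242 kB/s"
```

So the server was delivering this WU at full speed; the problem in this thread is with assignment and the 512-byte payloads, not with raw bandwidth.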
Grandpa_01
Posts: 1122
Joined: Wed Mar 04, 2009 7:36 am
Hardware configuration: 3 - Supermicro H8QGi-F AMD MC 6174=144 cores 2.5Ghz, 96GB G.Skill DDR3 1333Mhz Ubuntu 10.10
2 - Asus P6X58D-E i7 980X 4.4Ghz 6GB DDR3 2000 A-Data 64GB SSD Ubuntu 10.10
1 - Asus Rampage Gene III i7 970 4.3Ghz DDR3 2000 2-500GB Seagate 7200.11 0-Raid Ubuntu 10.10
1 - Asus G73JH Laptop i7 740QM 1.86Ghz ATI 5870M

Re: 130.237.232.237

Post by Grandpa_01 »

I am guessing it is offline now. I was assigned to a different server after not being able to connect for a while, and I am now running an SMP WU.
2 - SM H8QGi-F AMD 6xxx=112 cores @ 3.2 & 3.9Ghz
5 - SM X9QRI-f+ Intel 4650 = 320 cores @ 3.15Ghz
2 - I7 980X 4.4Ghz 2-GTX680
1 - 2700k 4.4Ghz GTX680
Total = 464 cores folding
kasson
Pande Group Member
Posts: 1459
Joined: Thu Nov 29, 2007 9:37 pm

Re: 130.237.232.237

Post by kasson »

Yes--I just upgraded the server software. Let us know if this helps.
Grandpa_01
Posts: 1122
Joined: Wed Mar 04, 2009 7:36 am

Re: 130.237.232.237

Post by Grandpa_01 »

It will be a while; I got one of your new ones.
autogrog
Posts: 38
Joined: Mon Aug 18, 2008 3:38 pm
Location: Halifax, Nova Scotia

Re: 130.237.232.237

Post by autogrog »

kasson wrote:Yes--I just upgraded the server software. Let us know if this helps.
Just successfully completed a 7503 and started getting the same erroneous WUs:

[23:26:43] Connecting to http://130.237.232.237:8080/
[23:26:43] Posted data.
[23:26:43] Initial: 0000; - Receiving payload (expected size: 512)
. . .
and still getting them at posting time. Help!
bollix47
Posts: 2941
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: 130.237.232.237

Post by bollix47 »

Latest WU Server reports problem:

[20:38:32] + Attempting to get work packet
[20:38:32] Passkey found
[20:38:32] - Will indicate memory of 32233 MB
[20:38:32] - Connecting to assignment server
[20:38:32] Connecting to http://assign.stanford.edu:8080/
[20:38:32] Posted data.
[20:38:32] Initial: ED82; - Successful: assigned to (130.237.232.237).
[20:38:32] + News From Folding@Home: Welcome to Folding@Home
[20:38:32] Loaded queue successfully.
[20:38:32] Sent data
[20:38:32] Connecting to http://130.237.232.237:8080/
[20:38:45] Posted data.
[20:38:45] Initial: 0000; - Receiving payload (expected size: 57248834)
[20:39:45] - Downloaded at ~931 kB/s
[20:39:45] - Averaged speed for that direction ~1022 kB/s
[20:39:45] + Received work.
[20:39:45] Trying to send all finished work units
[20:39:45] + No unsent completed units remaining.
[20:39:45] + Closed connections
[20:39:45] 
[20:39:45] + Processing work unit
[20:39:45] Core required: FahCore_a5.exe
[20:39:45] Core found.
[20:39:45] Working on queue slot 00 [January 26 20:39:45 UTC]
[20:39:45] + Working ...
[20:39:45] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 00 -np 64 -priority 96 -checkpoint 30 -verbose -lifeline 7390 -version 634'

[20:39:46] 
[20:39:46] *------------------------------*
[20:39:46] Folding@Home Gromacs SMP Core
[20:39:46] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[20:39:46] 
[20:39:46] Preparing to commence simulation
[20:39:46] - Looking at optimizations...
[20:39:46] - Created dyn
[20:39:46] - Files status OK
[20:39:54] - Expanded 57248322 -> 71846524 (decompressed 50.4 percent)
[20:39:54] Called DecompressByteArray: compressed_data_size=57248322 data_size=71846524, decompressed_data_size=71846524 diff=0
[20:39:55] - Digital signature verified
[20:39:55] 
[20:39:55] Project: 6903 (Run 4, Clone 14, Gen 82)
[20:39:55] 
[20:39:55] Assembly optimizations on if available.
[20:39:55] Entering M.D.
[20:40:04] Mapping NT from 64 to 64 
[20:40:11] Completed 0 out of 250000 steps  (0%)
[20:57:18] Completed 2500 out of 250000 steps  (1%)
[21:12:54] Completed 5000 out of 250000 steps  (2%)
[21:28:01] Completed 7500 out of 250000 steps  (3%)
[21:43:15] Completed 10000 out of 250000 steps  (4%)
[21:58:22] Completed 12500 out of 250000 steps  (5%)
[22:13:36] Completed 15000 out of 250000 steps  (6%)
[22:28:43] Completed 17500 out of 250000 steps  (7%)
[22:41:33] - Autosending finished units... [January 26 22:41:33 UTC]
[22:41:33] Trying to send all finished work units
[22:41:33] + No unsent completed units remaining.
[22:41:33] - Autosend completed
[22:43:56] Completed 20000 out of 250000 steps  (8%)
[22:59:03] Completed 22500 out of 250000 steps  (9%)
[23:14:17] Completed 25000 out of 250000 steps  (10%)
[23:29:24] Completed 27500 out of 250000 steps  (11%)
[23:44:37] Completed 30000 out of 250000 steps  (12%)
[23:59:42] Completed 32500 out of 250000 steps  (13%)
[00:14:55] Completed 35000 out of 250000 steps  (14%)
[00:30:02] Completed 37500 out of 250000 steps  (15%)
[00:45:15] Completed 40000 out of 250000 steps  (16%)
[01:00:22] Completed 42500 out of 250000 steps  (17%)
[01:15:35] Completed 45000 out of 250000 steps  (18%)
[01:30:42] Completed 47500 out of 250000 steps  (19%)
[01:45:55] Completed 50000 out of 250000 steps  (20%)
[02:01:02] Completed 52500 out of 250000 steps  (21%)
[02:16:15] Completed 55000 out of 250000 steps  (22%)
[02:31:21] Completed 57500 out of 250000 steps  (23%)
[02:46:34] Completed 60000 out of 250000 steps  (24%)
[03:01:40] Completed 62500 out of 250000 steps  (25%)
[03:16:53] Completed 65000 out of 250000 steps  (26%)
[03:31:59] Completed 67500 out of 250000 steps  (27%)
[03:47:12] Completed 70000 out of 250000 steps  (28%)
[04:02:19] Completed 72500 out of 250000 steps  (29%)
[04:17:32] Completed 75000 out of 250000 steps  (30%)
[04:32:39] Completed 77500 out of 250000 steps  (31%)
[04:41:33] - Autosending finished units... [January 27 04:41:33 UTC]
[04:41:33] Trying to send all finished work units
[04:41:33] + No unsent completed units remaining.
[04:41:33] - Autosend completed
[04:47:52] Completed 80000 out of 250000 steps  (32%)
[05:02:59] Completed 82500 out of 250000 steps  (33%)
[05:18:13] Completed 85000 out of 250000 steps  (34%)
[05:33:20] Completed 87500 out of 250000 steps  (35%)
[05:48:33] Completed 90000 out of 250000 steps  (36%)
[06:03:40] Completed 92500 out of 250000 steps  (37%)
[06:18:54] Completed 95000 out of 250000 steps  (38%)
[06:34:01] Completed 97500 out of 250000 steps  (39%)
[06:49:15] Completed 100000 out of 250000 steps  (40%)
[07:04:24] Completed 102500 out of 250000 steps  (41%)
[07:19:37] Completed 105000 out of 250000 steps  (42%)
[07:34:44] Completed 107500 out of 250000 steps  (43%)
[07:49:58] Completed 110000 out of 250000 steps  (44%)
[08:05:06] Completed 112500 out of 250000 steps  (45%)
[08:20:20] Completed 115000 out of 250000 steps  (46%)
[08:35:29] Completed 117500 out of 250000 steps  (47%)
[08:50:43] Completed 120000 out of 250000 steps  (48%)
[09:05:51] Completed 122500 out of 250000 steps  (49%)
[09:21:06] Completed 125000 out of 250000 steps  (50%)
[09:36:14] Completed 127500 out of 250000 steps  (51%)
[09:51:29] Completed 130000 out of 250000 steps  (52%)
[10:06:37] Completed 132500 out of 250000 steps  (53%)
[10:21:50] Completed 135000 out of 250000 steps  (54%)
[10:36:58] Completed 137500 out of 250000 steps  (55%)
[10:41:33] - Autosending finished units... [January 27 10:41:33 UTC]
[10:41:33] Trying to send all finished work units
[10:41:33] + No unsent completed units remaining.
[10:41:33] - Autosend completed
[10:52:11] Completed 140000 out of 250000 steps  (56%)
[11:07:18] Completed 142500 out of 250000 steps  (57%)
[11:22:32] Completed 145000 out of 250000 steps  (58%)
[11:37:38] Completed 147500 out of 250000 steps  (59%)
[11:52:51] Completed 150000 out of 250000 steps  (60%)
[12:07:57] Completed 152500 out of 250000 steps  (61%)
[12:23:09] Completed 155000 out of 250000 steps  (62%)
[12:38:16] Completed 157500 out of 250000 steps  (63%)
[12:54:49] Completed 160000 out of 250000 steps  (64%)
[13:09:55] Completed 162500 out of 250000 steps  (65%)
[13:25:08] Completed 165000 out of 250000 steps  (66%)
[13:40:20] Completed 167500 out of 250000 steps  (67%)
[13:55:27] Completed 170000 out of 250000 steps  (68%)
[14:10:40] Completed 172500 out of 250000 steps  (69%)
[14:25:47] Completed 175000 out of 250000 steps  (70%)
[14:41:03] Completed 177500 out of 250000 steps  (71%)
[14:56:15] Completed 180000 out of 250000 steps  (72%)
[15:11:34] Completed 182500 out of 250000 steps  (73%)
[15:26:44] Completed 185000 out of 250000 steps  (74%)
[15:41:57] Completed 187500 out of 250000 steps  (75%)
[15:57:04] Completed 190000 out of 250000 steps  (76%)
[16:12:19] Completed 192500 out of 250000 steps  (77%)
[16:27:26] Completed 195000 out of 250000 steps  (78%)
[16:41:33] - Autosending finished units... [January 27 16:41:33 UTC]
[16:41:33] Trying to send all finished work units
[16:41:33] + No unsent completed units remaining.
[16:41:33] - Autosend completed
[16:42:40] Completed 197500 out of 250000 steps  (79%)
[16:57:47] Completed 200000 out of 250000 steps  (80%)
[17:13:02] Completed 202500 out of 250000 steps  (81%)
[17:28:09] Completed 205000 out of 250000 steps  (82%)
[17:43:23] Completed 207500 out of 250000 steps  (83%)
[17:58:30] Completed 210000 out of 250000 steps  (84%)
[18:13:44] Completed 212500 out of 250000 steps  (85%)
[18:28:51] Completed 215000 out of 250000 steps  (86%)
[18:44:04] Completed 217500 out of 250000 steps  (87%)
[18:59:12] Completed 220000 out of 250000 steps  (88%)
[19:14:25] Completed 222500 out of 250000 steps  (89%)
[19:29:34] Completed 225000 out of 250000 steps  (90%)
[19:44:53] Completed 227500 out of 250000 steps  (91%)
[20:00:05] Completed 230000 out of 250000 steps  (92%)
[20:15:20] Completed 232500 out of 250000 steps  (93%)
[20:30:27] Completed 235000 out of 250000 steps  (94%)
[20:45:40] Completed 237500 out of 250000 steps  (95%)
[21:00:49] Completed 240000 out of 250000 steps  (96%)
[21:16:02] Completed 242500 out of 250000 steps  (97%)
[21:31:09] Completed 245000 out of 250000 steps  (98%)
[21:46:22] Completed 247500 out of 250000 steps  (99%)
[22:01:29] Completed 250000 out of 250000 steps  (100%)
[22:02:01] DynamicWrapper: Finished Work Unit: sleep=10000
[22:02:11] 
[22:02:11] Finished Work Unit:
[22:02:11] - Reading up to 121622496 from "work/wudata_00.trr": Read 121622496
[22:02:12] trr file hash check passed.
[22:02:12] - Reading up to 108804224 from "work/wudata_00.xtc": Read 108804224
[22:02:14] xtc file hash check passed.
[22:02:14] edr file hash check passed.
[22:02:14] logfile size: 205409
[22:02:14] Leaving Run
[22:02:14] - Writing 230805121 bytes of core data to disk...
[22:03:30] Done: 230804609 -> 222463349 (compressed to 3.3 percent)
[22:03:31]   ... Done.
[22:03:55] - Shutting down core
[22:03:55] 
[22:03:55] Folding@home Core Shutdown: FINISHED_UNIT
[22:03:58] CoreStatus = 64 (100)
[22:03:58] Unit 0 finished with 91 percent of time to deadline remaining.
[22:03:58] Updated performance fraction: 0.918923
[22:03:58] Sending work to server
[22:03:58] Project: 6903 (Run 4, Clone 14, Gen 82)


[22:03:58] + Attempting to send results [January 27 22:03:58 UTC]
[22:03:58] - Reading file work/wuresults_00.dat from core
[22:03:58]   (Read 222463861 bytes from disk)
[22:03:58] Connecting to http://130.237.232.237:8080/
[22:33:58] Posted data.
[22:33:58] Initial: 0000; - Uploaded at ~120 kB/s
[22:33:58] - Averaged speed for that direction ~119 kB/s
[22:33:58] - Server reports problem with unit.  <===========================
[22:33:58] Trying to send all finished work units
[22:33:58] + No unsent completed units remaining.
[22:33:58] - Preparing to get new work unit...
[22:33:58] Cleaning up work directory
[
Now working on the same WU, but I'm going to delete it and move on rather than waste another day, not to mention the science loss and almost 500K points.

AFAIK this is the first time I've ever had the message "Server reports problem with unit".
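
To put a number on how much time a lost unit represents, the frame times can be pulled out of the "Completed ... steps" lines in a log like the one above. This is a minimal sketch, assuming consecutive progress lines are less than 24 hours apart so a negative difference always means a single midnight rollover. The function name is hypothetical.

```python
import re
from datetime import datetime, timedelta

# Matches progress lines such as:
# [23:44:37] Completed 30000 out of 250000 steps  (12%)
STEP_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] Completed (\d+) out of (\d+) steps")


def frame_durations(log_text):
    """Seconds between consecutive 'Completed ...' lines, assuming gaps under 24 h."""
    stamps = []
    for line in log_text.splitlines():
        match = STEP_RE.match(line.strip())
        if match:
            stamps.append(datetime.strptime(match.group(1), "%H:%M:%S"))
    durations = []
    for prev, cur in zip(stamps, stamps[1:]):
        delta = cur - prev
        if delta < timedelta(0):  # the clock wrapped past midnight
            delta += timedelta(days=1)
        durations.append(int(delta.total_seconds()))
    return durations


sample = """\
[23:44:37] Completed 30000 out of 250000 steps  (12%)
[23:59:42] Completed 32500 out of 250000 steps  (13%)
[00:14:55] Completed 35000 out of 250000 steps  (14%)
[00:30:02] Completed 37500 out of 250000 steps  (15%)
"""

print(frame_durations(sample))  # [905, 913, 907]
```

At roughly 15 minutes per 1%, a 100-frame P6903 unit on this machine is a bit over 25 hours of work, which is what is at stake when an upload is rejected.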
Last edited by bollix47 on Sat Jan 28, 2012 9:25 am, edited 1 time in total.
kasson
Pande Group Member
Posts: 1459
Joined: Thu Nov 29, 2007 9:37 pm

Re: 130.237.232.237

Post by kasson »

Not sure about the "server reports problems." I have one idea--just tweaked a server setting. I don't suppose you have a backup of the results file to re-send?
Regarding the "dud" work units, it appears neither the new nor the old server software filters them properly. We'll have to come up with a manual solution.
autogrog
Posts: 38
Joined: Mon Aug 18, 2008 3:38 pm
Location: Halifax, Nova Scotia

Re: 130.237.232.237

Post by autogrog »

kasson wrote:Not sure about the "server reports problems." I have one idea--just tweaked a server setting. I don't suppose you have a backup of the results file to re-send?
Regarding the "dud" work units, it appears neither the new nor the old server software filters them properly. We'll have to come up with a manual solution.
I hope this problem can get fixed soon, because the 'duds' just keep on coming:
[01:37:21] Connecting to http://130.237.232.237:8080/
[01:37:21] Posted data.
[01:37:21] Initial: 0000; - Receiving payload (expected size: 512)
bollix47
Posts: 2941
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: 130.237.232.237

Post by bollix47 »

kasson wrote:Not sure about the "server reports problems." I have one idea--just tweaked a server setting. I don't suppose you have a backup of the results file to re-send?
I did a backup prior to dumping and I do have one results file in the backup's work folder - wuresults_00.dat.

Tried to send with ./fah6 -send 00 but the client said it wasn't finished.

bollix@Gemini:~/backups/smp$ ./fah6 -send 00

Note: Please read the license agreement (fah6 -license). Further 
use of this software requires that you have read and accepted this agreement.

64 cores detected


--- Opening Log file [January 28 01:58:19 UTC] 


# Linux SMP Console Edition ###################################################
###############################################################################

                       Folding@Home Client Version 6.34

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: /home/bollix/backups/smp
Executable: ./fah6
Arguments: -send 00 -bigadv -smp 64 -verbosity 9 

[01:58:19] - Ask before connecting: No
[01:58:19] - User name: bollix47 (Team 39340)
[01:58:19] - User ID: 3BF502E016E04404
[01:58:19] - Machine ID: 2
[01:58:19] 
[01:58:19] Loaded queue successfully.
[01:58:19] Attempting to return result(s) to server...
[01:58:19] Project: 6903 (Run 4, Clone 14, Gen 82)
[01:58:19] - Warning: Asked to send unfinished unit to server
[01:58:19] - Failed to send unit 00 to server
[01:58:19] ***** Got a SIGTERM signal (15)
[01:58:19] Killing all core threads

Folding@Home Client Shutdown.
bollix@Gemini:~/backups/smp$ 
Tried using qd and qfix but that didn't help:

qd

bollix@Gemini:~/backups/smp$ ./qd
qd released 28 July 2011 (fr 086)
qd executed Fri Jan 27 20:59:25 EST 2012 (Sat Jan 28 01:59:25 UTC 2012)
Queue version 6.00
Current index: 1
 Index 2: finished 7164.00 pts (618.534 pt/hr, 14831.30 ppd) 12.4 X min speed
  bonus pts: 129731.38 (11190.707 pt/hr, 268576.97 ppd); bonus factor: 18.11; kfactor: 26.40
  server: 130.237.232.237:8080; project: 6901
  Folding: run 12, clone 7, generation 60; benchmark 0; misc: 500, 200, 12 (be)
  issue: Wed Jan 18 16:04:34 2012; begin: Wed Jan 18 16:05:12 2012
  end: Thu Jan 19 03:40:08 2012; due: Tue Jan 24 16:05:12 2012 (6 days)
  preferred: Sun Jan 22 16:05:12 2012 (4 days)
  core URL: http://www.stanford.edu/~pande/Linux/AMD64/Core_a5.fah
  core number: 0xa5; core name: GRO-A5
  CPU: 16,0 AMD64; OS: 4,0 Linux
  smp cores: 64; cores to use: 64
  flops: 1064680370 (1064.680370 megaflops)
  memory: 32233 MB
  client type: 7 BigAdv
  assignment info (be): Wed Jan 18 16:02:41 2012; 945D40E3
  CS: 130.237.165.141; P limit: 524286976
  user: bollix47; team: 39340; ID: 0644E016E002F53B; mach ID: 2
  work/wudata_02.dat file size: 24862120; WU type: Folding@home
 Index 3: finished 22706.00 pts (908.210 pt/hr, 21782.75 ppd) 11.5 X min speed
  bonus pts: 475219.96 (18995.713 pt/hr, 455897.09 ppd); bonus factor: 20.93; kfactor: 38.05
  server: 130.237.232.237:8080; project: 6903
  Folding: run 10, clone 4, generation 32; benchmark 0; misc: 500, 200, 12 (be)
  issue: Thu Jan 19 03:53:46 2012; begin: Thu Jan 19 03:54:45 2012
  end: Fri Jan 20 04:54:48 2012; due: Tue Jan 31 03:54:45 2012 (12 days)
  preferred: Tue Jan 24 03:54:45 2012 (5 days)
  core URL: http://www.stanford.edu/~pande/Linux/AMD64/Core_a5.fah
  core number: 0xa5; core name: GRO-A5
  CPU: 16,0 AMD64; OS: 4,0 Linux
  smp cores: 64; cores to use: 64
  flops: 1064545054 (1064.545054 megaflops)
  memory: 32233 MB
  client type: 7 BigAdv
  assignment info (be): Thu Jan 19 03:51:45 2012; 945E3933
  CS: 130.237.165.141; P limit: 524286976
  user: bollix47; team: 39340; ID: 0644E016E002F53B; mach ID: 2
  work/wudata_03.dat file size: 57234533; WU type: Folding@home
 Index 4: finished 22706.00 pts (909.109 pt/hr, 21803.82 ppd) 11.5 X min speed
  bonus pts: 475449.66 (19023.271 pt/hr, 456558.50 ppd); bonus factor: 20.94; kfactor: 38.05
  server: 130.237.232.237:8080; project: 6903
  Folding: run 11, clone 13, generation 36; benchmark 0; misc: 500, 200, 12 (be)
  issue: Fri Jan 20 05:25:17 2012; begin: Fri Jan 20 05:26:18 2012
  end: Sat Jan 21 06:24:52 2012; due: Wed Feb  1 05:26:18 2012 (12 days)
  preferred: Wed Jan 25 05:26:18 2012 (5 days)
  core URL: http://www.stanford.edu/~pande/Linux/AMD64/Core_a5.fah
  core number: 0xa5; core name: GRO-A5
  CPU: 16,0 AMD64; OS: 4,0 Linux
  smp cores: 64; cores to use: 64
  flops: 1064415405 (1064.415405 megaflops)
  memory: 32233 MB
  client type: 7 BigAdv
  assignment info (be): Fri Jan 20 05:23:16 2012; 945F5206
  CS: 130.237.165.141; P limit: 524286976
  user: bollix47; team: 39340; ID: 0644E016E002F53B; mach ID: 2
  work/wudata_04.dat file size: 57240147; WU type: Folding@home
 Index 5: finished 22706.00 pts (906.849 pt/hr, 21747.98 ppd) 11.5 X min speed
  bonus pts: 474840.49 (18950.244 pt/hr, 454805.88 ppd); bonus factor: 20.91; kfactor: 38.05
  server: 130.237.232.237:8080; project: 6903
  Folding: run 11, clone 13, generation 37; benchmark 0; misc: 500, 200, 12 (be)
  issue: Sat Jan 21 06:55:24 2012; begin: Sat Jan 21 06:56:32 2012
  end: Sun Jan 22 07:58:50 2012; due: Thu Feb  2 06:56:32 2012 (12 days)
  preferred: Thu Jan 26 06:56:32 2012 (5 days)
  core URL: http://www.stanford.edu/~pande/Linux/AMD64/Core_a5.fah
  core number: 0xa5; core name: GRO-A5
  CPU: 16,0 AMD64; OS: 4,0 Linux
  smp cores: 64; cores to use: 64
  flops: 1064311974 (1064.311974 megaflops)
  memory: 32233 MB
  client type: 7 BigAdv
  assignment info (be): Sat Jan 21 06:53:23 2012; 9459CCA1
  CS: 130.237.165.141; P limit: 524286976
  user: bollix47; team: 39340; ID: 0644E016E002F53B; mach ID: 2
  work/wudata_05.dat file size: 57240448; WU type: Folding@home
 Index 6: finished 22706.00 pts (903.542 pt/hr, 21668.47 ppd) 11.5 X min speed
  bonus pts: 473971.68 (18846.418 pt/hr, 452314.00 ppd); bonus factor: 20.87; kfactor: 38.05
  server: 130.237.232.237:8080; project: 6903
  Folding: run 11, clone 13, generation 38; benchmark 0; misc: 500, 200, 12 (be)
  issue: Sun Jan 22 08:29:31 2012; begin: Sun Jan 22 08:30:40 2012
  end: Mon Jan 23 09:38:28 2012; due: Fri Feb  3 08:30:40 2012 (12 days)
  preferred: Fri Jan 27 08:30:40 2012 (5 days)
  core URL: http://www.stanford.edu/~pande/Linux/AMD64/Core_a5.fah
  core number: 0xa5; core name: GRO-A5
  CPU: 16,0 AMD64; OS: 4,0 Linux
  smp cores: 64; cores to use: 64
  flops: 1064228504 (1064.228504 megaflops)
  memory: 32233 MB
  client type: 7 BigAdv
  assignment info (be): Sun Jan 22 08:27:29 2012; 945A6453
  CS: 130.237.165.141; P limit: 524286976
  user: bollix47; team: 39340; ID: 0644E016E002F53B; mach ID: 2
  work/wudata_06.dat file size: 57243454; WU type: Folding@home
 Index 7: finished 22706.00 pts (904.331 pt/hr, 21687.16 ppd) 11.5 X min speed
  bonus pts: 474176.01 (18870.799 pt/hr, 452899.19 ppd); bonus factor: 20.88; kfactor: 38.05
  server: 130.237.232.237:8080; project: 6903
  Folding: run 7, clone 14, generation 56; benchmark 0; misc: 500, 200, 12 (be)
  issue: Mon Jan 23 10:08:49 2012; begin: Mon Jan 23 10:09:59 2012
  end: Tue Jan 24 11:16:28 2012; due: Sat Feb  4 10:09:59 2012 (12 days)
  preferred: Sat Jan 28 10:09:59 2012 (5 days)
  core URL: http://www.stanford.edu/~pande/Linux/AMD64/Core_a5.fah
  core number: 0xa5; core name: GRO-A5
  CPU: 16,0 AMD64; OS: 4,0 Linux
  smp cores: 64; cores to use: 64
  flops: 1064160661 (1064.160661 megaflops)
  memory: 32233 MB
  client type: 7 BigAdv
  assignment info (be): Mon Jan 23 10:06:48 2012; 94449B1A
  CS: 130.237.165.141; P limit: 524286976
  user: bollix47; team: 39340; ID: 0644E016E002F53B; mach ID: 2
  work/wudata_07.dat file size: 57247759; WU type: Folding@home
 Index 8: finished 22706.00 pts (892.171 pt/hr, 21393.19 ppd) 11.3 X min speed
  bonus pts: 470951.36 (18488.418 pt/hr, 443722.03 ppd); bonus factor: 20.74; kfactor: 38.05
  server: 130.237.232.237:8080; project: 6903
  Folding: run 4, clone 14, generation 80; benchmark 0; misc: 500, 200, 12 (be)
  issue: Tue Jan 24 11:46:45 2012; begin: Tue Jan 24 11:48:06 2012
  end: Wed Jan 25 13:15:07 2012; due: Sun Feb  5 11:48:06 2012 (12 days)
  preferred: Sun Jan 29 11:48:06 2012 (5 days)
  core URL: http://www.stanford.edu/~pande/Linux/AMD64/Core_a5.fah
  core number: 0xa5; core name: GRO-A5
  CPU: 16,0 AMD64; OS: 4,0 Linux
  smp cores: 64; cores to use: 64
  flops: 1064106642 (1064.106642 megaflops)
  memory: 32233 MB
  client type: 7 BigAdv
  assignment info (be): Tue Jan 24 11:44:43 2012; 94453369
  CS: 130.237.165.141; P limit: 524286976
  user: bollix47; team: 39340; ID: 0644E016E002F53B; mach ID: 2
  work/wudata_08.dat file size: 57248798; WU type: Folding@home
 Index 9: finished 22706.00 pts (895.779 pt/hr, 21475.86 ppd) 11.4 X min speed
  bonus pts: 471860.41 (18595.688 pt/hr, 446296.50 ppd); bonus factor: 20.78; kfactor: 38.05
  server: 130.237.232.237:8080; project: 6903
  Folding: run 4, clone 14, generation 81; benchmark 0; misc: 500, 200, 12 (be)
  issue: Wed Jan 25 13:45:30 2012; begin: Wed Jan 25 13:47:07 2012
  end: Thu Jan 26 15:07:59 2012; due: Mon Feb  6 13:47:07 2012 (12 days)
  preferred: Mon Jan 30 13:47:07 2012 (5 days)
  core URL: http://www.stanford.edu/~pande/Linux/AMD64/Core_a5.fah
  core number: 0xa5; core name: GRO-A5
  CPU: 16,0 AMD64; OS: 4,0 Linux
  smp cores: 64; cores to use: 64
  flops: 1064059440 (1064.059440 megaflops)
  memory: 32233 MB
  client type: 7 BigAdv
  assignment info (be): Wed Jan 25 13:43:28 2012; 9447A6C2
  CS: 130.237.165.141; P limit: 524286976
  user: bollix47; team: 39340; ID: 0644E016E002F53B; mach ID: 2
  work/wudata_09.dat file size: 57249192; WU type: Folding@home
 Index 0: finished 22706.00 pts (893.810 pt/hr, 21428.94 ppd) 11.3 X min speed
  bonus pts: 471344.73 (18534.785 pt/hr, 444834.84 ppd); bonus factor: 20.76; kfactor: 38.05
  server: 130.237.232.237:8080; project: 6903
  Folding: run 4, clone 14, generation 82; benchmark 0; misc: 500, 200, 12 (be)
  issue: Thu Jan 26 15:38:09 2012; begin: Thu Jan 26 15:39:45 2012
  end: Fri Jan 27 17:03:58 2012; due: Tue Feb  7 15:39:45 2012 (12 days)
  preferred: Tue Jan 31 15:39:45 2012 (5 days)
  core URL: http://www.stanford.edu/~pande/Linux/AMD64/Core_a5.fah (V2.27)
  core number: 0xa5; core name: GRO-A5
  CPU: 16,0 AMD64; OS: 4,0 Linux
  smp cores: 64; cores to use: 64
  flops: 1064022872 (1064.022872 megaflops)
  memory: 32233 MB
  client type: 7 BigAdv
  assignment info (be): Thu Jan 26 15:36:07 2012; 9440DAA5
  CS: 130.237.165.141; P limit: 524286976
  user: bollix47; team: 39340; ID: 0644E016E002F53B; mach ID: 2
  work/wudata_00.dat file size: 57248834; WU type: Folding@home
 Index 1: folding now 22706.00 pts (787.491 pt/hr, 18878.51 ppd) 9.99 X min speed; 6% complete
  bonus pts: 442407.15 (919.579 pt/hr, 367831.81 ppd); bonus factor: 19.48; kfactor: 38.05
  server: 130.237.232.237:8080; project: 6903
  Folding: run 4, clone 14, generation 82; benchmark 0; misc: 500, 634, 12 (be)
  issue: Fri Jan 27 17:33:22 2012; begin: Fri Jan 27 17:35:19 2012
  expect: Sat Jan 28 22:25:19 2012; due: Wed Feb  8 17:35:19 2012 (12 days)
  preferred: Wed Feb  1 17:35:19 2012 (5 days)
  core URL: http://www.stanford.edu/~pande/Linux/AMD64/Core_a5.fah (V2.27)
  core number: 0xa5; core name: GRO-A5
  CPU: 16,0 AMD64; OS: 4,0 Linux
  smp cores: 64; cores to use: 64
  flops: 1063992966 (1063.992966 megaflops)
  memory: 32233 MB
  client type: 7 BigAdv
  assignment info (be): Fri Jan 27 17:31:32 2012; 94414D56
  P limit: 524286976
  user: bollix47; team: 39340; ID: 0644E016E002F53B; mach ID: 2
  work/wudata_01.dat file size: 57248834; WU type: Folding@home
Average download rate 1013.673 KB/s (u=4); upload rate 121.922 KB/s (u=4)
Performance fraction 0.918923 (u=4)
Average pph: 885.491, ppd: 21251.77, ppw: 148762.4, ppy: 7761998
Average bonus pph: 16292.470, ppd: 391019.29, ppw: 2737135.0, ppy: 142815884
Average alternate pph: 873.750, ppd: 20970.01, ppw: 146790.1, ppy: 7659087
Average alternate bonus pph: 18010.340, ppd: 432248.15, ppw: 3025737.0, ppy: 157874314
bollix@Gemini:~/backups/smp$ 
qfix

bollix@Gemini:~/backups/smp$ ./qfix
entry 2, status 0, address 130.237.232.237:8080
entry 3, status 0, address 130.237.232.237:8080
entry 4, status 0, address 130.237.232.237:8080
entry 5, status 0, address 130.237.232.237:8080
entry 6, status 0, address 130.237.232.237:8080
entry 7, status 0, address 130.237.232.237:8080
entry 8, status 0, address 130.237.232.237:8080
entry 9, status 0, address 130.237.232.237:8080
entry 0, status 0, address 130.237.232.237:8080
  Found results <work/wuresults_00.dat>: proj 51778, run 43896, clone 43484, gen 19939
   -- queue entry: proj 6903, run 4, clone 14, gen 82
   -- doesn't match queue entry
entry 1, status 1, address 130.237.232.237:8080
File is OK
bollix@Gemini:~/backups/smp$
As you can see in the qfix results, the PRCGs don't match, so maybe I'm using an older version that doesn't work with this client's results file.

Any help would be appreciated at this point.
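
For reference, the mismatch can be picked out of qfix output mechanically. This is a minimal sketch that only parses text shaped like the output pasted above. It does not read queue.dat or the results file, and the function name is hypothetical.

```python
import re

# Matches the PRCG fields as printed by qfix in the output above.
PRCG_RE = re.compile(r"proj (\d+), run (\d+), clone (\d+), gen (\d+)")


def prcg_mismatches(qfix_output):
    """Pair each 'Found results' line with the next '-- queue entry' line and
    return the pairs whose (project, run, clone, gen) tuples differ."""
    mismatches = []
    found = None
    for line in qfix_output.splitlines():
        match = PRCG_RE.search(line)
        if not match:
            continue  # lines without PRCG fields carry nothing to compare
        prcg = tuple(int(group) for group in match.groups())
        if "Found results" in line:
            found = prcg
        elif "queue entry" in line and found is not None:
            if prcg != found:
                mismatches.append((found, prcg))
            found = None
    return mismatches


sample = """\
entry 0, status 0, address 130.237.232.237:8080
  Found results <work/wuresults_00.dat>: proj 51778, run 43896, clone 43484, gen 19939
   -- queue entry: proj 6903, run 4, clone 14, gen 82
   -- doesn't match queue entry
entry 1, status 1, address 130.237.232.237:8080
"""

print(prcg_mismatches(sample))
# [((51778, 43896, 43484, 19939), (6903, 4, 14, 82))]
```

The values qfix reads from the results file look nothing like a real PRCG, which is consistent with the guess that this qfix build is reading the header of this client's results file incorrectly, though the output alone does not prove it.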
3.0charlie
Posts: 13
Joined: Wed Jul 29, 2009 4:34 pm

Re: 130.237.232.237

Post by 3.0charlie »

Server still hangs at assigning new WUs. Loads up the queue and that's it. Even did a clean re-install of FaH, just in case...
Grandpa_01
Posts: 1122
Joined: Wed Mar 04, 2009 7:36 am

Re: 130.237.232.237

Post by Grandpa_01 »

That is strange; I have received 3 from that server today.
bollix47
Posts: 2941
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: 130.237.232.237

Post by bollix47 »

FYI

Had the 512 byte problem today. AFAIK this was the first time it's ever happened to me.

[14:14:25] + Attempting to get work packet
[14:14:25] Passkey found
[14:14:25] - Will indicate memory of 32233 MB
[14:14:25] - Connecting to assignment server
[14:14:25] Connecting to http://assign.stanford.edu:8080/
[14:14:26] Posted data.
[14:14:26] Initial: ED82; - Successful: assigned to (130.237.232.237).
[14:14:26] + News From Folding@Home: Welcome to Folding@Home
[14:14:26] Loaded queue successfully.
[14:14:26] Sent data
[14:14:26] Connecting to http://130.237.232.237:8080/
[14:14:26] Posted data.
[14:14:26] Initial: 0000; - Receiving payload (expected size: 512)
[14:14:26] Conversation time very short, giving reduced weight in bandwidth avg
[14:14:26] - Downloaded at ~1 kB/s
[14:14:26] - Averaged speed for that direction ~449 kB/s
[14:14:26] + Received work.
[14:14:26] Trying to send all finished work units
[14:14:26] + No unsent completed units remaining.
[14:14:26] + Closed connections
[14:14:26] 
[14:14:26] + Processing work unit
[14:14:26] Core required: FahCore_a5.exe
[14:14:26] Core found.
[14:14:26] Working on queue slot 02 [January 28 14:14:26 UTC]
[14:14:26] + Working ...
[14:14:26] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 02 -np 64 -priority 96 -checkpoint 30 -verbose -lifeline 4689 -version 634'

[14:14:26] 
[14:14:26] *------------------------------*
[14:14:26] Folding@Home Gromacs SMP Core
[14:14:26] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[14:14:26] 
[14:14:26] Preparing to commence simulation
[14:14:26] - Looking at optimizations...
[14:14:26] - Created dyn
[14:14:26] - Files status OK
[14:14:26] Couldn't Decompress
[14:14:26] Called DecompressByteArray: compressed_data_size=0 data_size=0, decompressed_data_size=0 diff=0
[14:14:26] -Error: Couldn't update checksum variables
[14:14:26] Error: Could not open work file
[14:14:26] 
[14:14:26] Folding@home Core Shutdown: FILE_IO_ERROR
[14:14:26] CoreStatus = 75 (117)
[14:14:26] Error opening or reading from a file.
[14:14:26] Deleting current work unit & continuing...
[14:14:26] Trying to send all finished work units
[14:14:26] + No unsent completed units remaining.
[14:14:26] - Preparing to get new work unit...
[14:14:26] Cleaning up work directory
[14:14:26] + Attempting to get work packet
[14:14:26] Passkey found
[14:14:26] - Will indicate memory of 32233 MB
[14:14:26] - Connecting to assignment server
[14:14:26] Connecting to http://assign.stanford.edu:8080/
[14:14:27] Posted data.
[14:14:27] Initial: ED82; - Successful: assigned to (130.237.232.237).
[14:14:27] + News From Folding@Home: Welcome to Folding@Home
[14:14:27] Loaded queue successfully.
[14:14:27] Sent data
[14:14:27] Connecting to http://130.237.232.237:8080/
[14:14:27] Posted data.
[14:14:27] Initial: 0000; - Receiving payload (expected size: 512)
[14:14:27] Conversation time very short, giving reduced weight in bandwidth avg
[14:14:27] - Downloaded at ~1 kB/s
[14:14:27] - Averaged speed for that direction ~360 kB/s
[14:14:27] + Received work.
[14:14:27] + Closed connections
[14:14:32] 
[14:14:32] + Processing work unit
[14:14:32] Core required: FahCore_a5.exe
[14:14:32] Core found.
[14:14:32] Working on queue slot 03 [January 28 14:14:32 UTC]
[14:14:32] + Working ...
[14:14:32] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 03 -np 64 -priority 96 -checkpoint 30 -verbose -lifeline 4689 -version 634'

[14:14:33] 
[14:14:33] *------------------------------*
[14:14:33] Folding@Home Gromacs SMP Core
[14:14:33] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[14:14:33] 
[14:14:33] Preparing to commence simulation
[14:14:33] - Looking at optimizations...
[14:14:33] - Created dyn
[14:14:33] - Files status OK
[14:14:33] Couldn't Decompress
[14:14:33] Called DecompressByteArray: compressed_data_size=0 data_size=0, decompressed_data_size=0 diff=0
[14:14:33] -Error: Couldn't update checksum variables
[14:14:33] Error: Could not open work file
[14:14:33] 
[14:14:33] Folding@home Core Shutdown: FILE_IO_ERROR
[14:14:33] CoreStatus = 75 (117)
[14:14:33] Error opening or reading from a file.
[14:14:33] Deleting current work unit & continuing...
[14:14:33] Trying to send all finished work units
[14:14:33] + No unsent completed units remaining.
[14:14:33] - Preparing to get new work unit...
[14:14:33] Cleaning up work directory

(Repeated numerous times, but too many characters to post.)


[14:35:46] + Attempting to get work packet
[14:35:46] Passkey found
[14:35:46] - Will indicate memory of 32233 MB
[14:35:46] - Connecting to assignment server
[14:35:46] Connecting to http://assign.stanford.edu:8080/
[14:35:46] Posted data.
[14:35:46] Initial: ED82; - Successful: assigned to (130.237.232.237).
[14:35:46] + News From Folding@Home: Welcome to Folding@Home
[14:35:46] Loaded queue successfully.
[14:35:46] Sent data
[14:35:46] Connecting to http://130.237.232.237:8080/
[14:35:47] Posted data.
[14:35:47] Initial: 0000; - Receiving payload (expected size: 512)
[14:35:47] Conversation time very short, giving reduced weight in bandwidth avg
[14:35:47] - Downloaded at ~1 kB/s
[14:35:47] - Averaged speed for that direction ~1 kB/s
[14:35:47] + Received work.
[14:35:47] + Closed connections
[14:35:52] 
[14:35:52] + Processing work unit
[14:35:52] Core required: FahCore_a5.exe
[14:35:52] Core found.
[14:35:52] Working on queue slot 02 [January 28 14:35:52 UTC]
[14:35:52] + Working ...
[14:35:52] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 02 -np 64 -priority 96 -checkpoint 30 -verbose -lifeline 4689 -version 634'

[14:35:52] 
[14:35:52] *------------------------------*
[14:35:52] Folding@Home Gromacs SMP Core
[14:35:52] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[14:35:52] 
[14:35:52] Preparing to commence simulation
[14:35:52] - Looking at optimizations...
[14:35:52] - Created dyn
[14:35:52] - Files status OK
[14:35:52] Couldn't Decompress
[14:35:52] Called DecompressByteArray: compressed_data_size=0 data_size=0, decompressed_data_size=0 diff=0
[14:35:52] -Error: Couldn't update checksum variables
[14:35:52] Error: Could not open work file
[14:35:52] 
[14:35:52] Folding@home Core Shutdown: FILE_IO_ERROR
[14:35:52] CoreStatus = 75 (117)
[14:35:52] Error opening or reading from a file.
[14:35:52] Too many errors during run. Purging queue.
[14:35:52] 
[14:35:52] + Processing work unit
[14:35:52] Core required: FahCore_a5.exe
[14:35:52] Core found.
[14:35:52] Working on queue slot 02 [January 28 14:35:52 UTC]
[14:35:52] + Working ...
[14:35:52] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 02 -np 64 -priority 96 -checkpoint 30 -verbose -lifeline 4689 -version 634'

[14:35:52] 
[14:35:52] *------------------------------*
[14:35:52] Folding@Home Gromacs SMP Core
[14:35:52] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[14:35:52] 
[14:35:52] Preparing to commence simulation
[14:35:52] - Looking at optimizations...
[14:35:52] - Created dyn
[14:35:52] - Files status OK
[14:35:52] Error: Missing work file=<>
[14:35:52] 
[14:35:52] Folding@home Core Shutdown: MISSING_WORK_FILES
[14:35:52] CoreStatus = 74 (116)
[14:35:52] The core could not find the work files specified. Removing from queue
[14:35:52] Deleting current work unit & continuing...
[14:35:52] Trying to send all finished work units
[14:35:52] + No unsent completed units remaining.
[14:35:52] - Preparing to get new work unit...
[14:35:52] Cleaning up work directory
[14:35:52] + Attempting to get work packet
[14:35:52] Passkey found
[14:35:52] - Will indicate memory of 32233 MB
[14:35:52] - Connecting to assignment server
[14:35:52] Connecting to http://assign.stanford.edu:8080/
[14:39:02] - Couldn't send HTTP request to server
[14:39:02] + Could not connect to Assignment Server
[14:39:02] Connecting to http://assign2.stanford.edu:80/
[14:39:03] Posted data.
[14:39:03] Initial: ED82; - Successful: assigned to (130.237.232.141).
[14:39:03] + News From Folding@Home: Welcome to Folding@Home
[14:39:03] Loaded queue successfully.
[14:39:03] Sent data
[14:39:03] Connecting to http://130.237.232.141:80/
[14:39:09] Posted data.
[14:39:09] Initial: 0000; - Receiving payload (expected size: 24867879)

According to HFM, the queue filled up with P6901 (R12, C4, G130).

As can be seen in the log, the problem continued for ~25 minutes (from about 14:14 to 14:39) before the client switched to server 130.237.232.141 and started folding normally with a P6900.
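
For anyone who wants to check their own log for the same symptom, here is a rough Python sketch that counts how many times each work server answered with a tiny payload. It is not an official tool. The function name `count_tiny_payloads` and the 512-byte cutoff are my own assumptions, and the regular expressions are based only on the line formats visible in the log above.

```python
import re
from collections import Counter

# Patterns are modelled on the log lines quoted in this thread (assumption:
# other client versions may format these lines differently).
CONNECT_RE = re.compile(r"^\[\d\d:\d\d:\d\d\] Connecting to http://([^:/]+):\d+/")
PAYLOAD_RE = re.compile(
    r"^\[\d\d:\d\d:\d\d\] Initial: \w+; - Receiving payload \(expected size: (\d+)\)"
)


def count_tiny_payloads(lines, max_size=512):
    """Count payloads of at most max_size bytes, keyed by the last server connected to.

    max_size=512 is a hypothetical threshold chosen to match the bad WUs seen here.
    """
    tiny = Counter()
    server = None
    for line in lines:
        m = CONNECT_RE.match(line)
        if m:
            server = m.group(1)  # remember the most recent host
            continue
        m = PAYLOAD_RE.match(line)
        if m and server is not None and int(m.group(1)) <= max_size:
            tiny[server] += 1
    return tiny


if __name__ == "__main__":
    sample = [
        "[14:14:26] Connecting to http://130.237.232.237:8080/",
        "[14:14:26] Posted data.",
        "[14:14:26] Initial: 0000; - Receiving payload (expected size: 512)",
        "[14:39:03] Connecting to http://130.237.232.141:80/",
        "[14:39:09] Posted data.",
        "[14:39:09] Initial: 0000; - Receiving payload (expected size: 24867879)",
    ]
    for host, count in count_tiny_payloads(sample).items():
        print(host, count)
```

To run it against a real log, pass the open file instead of the sample list, e.g. `count_tiny_payloads(open("FAHlog.txt"))` (adjust the filename to wherever your client writes its log).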