171.67.108.45 and 171.64.65.35

Moderators: Site Moderators, PandeGroup

Postby Wreck3r » Wed Sep 13, 2017 8:44 pm

Hey,

Just started having problems uploading work and receiving jobs.

Last job and following errors:

Code:
19:02:01:WU02:FS01:Starting
19:02:01:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 02 -suffix 01 -version 704 -lifeline 1527 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
19:02:01:WU02:FS01:Started FahCore on PID 8552
19:02:01:WU02:FS01:Core PID:8556
19:02:01:WU02:FS01:FahCore 0x21 started
19:02:01:WU02:FS01:0x21:*********************** Log Started 2017-09-13T19:02:01Z ***********************
19:02:01:WU02:FS01:0x21:Project: 9415 (Run 1680, Clone 0, Gen 344)
19:02:01:WU02:FS01:0x21:Unit: 0x00000187ab436c9d585e06d8baf4a0ca
19:02:01:WU02:FS01:0x21:CPU: 0x00000000000000000000000000000000
19:02:01:WU02:FS01:0x21:Machine: 1
19:02:01:WU02:FS01:0x21:Reading tar file core.xml
19:02:01:WU02:FS01:0x21:Reading tar file integrator.xml
19:02:01:WU02:FS01:0x21:Reading tar file state.xml
19:02:01:WU02:FS01:0x21:Reading tar file system.xml
19:02:01:WU02:FS01:0x21:Digital signatures verified
19:02:01:WU02:FS01:0x21:Folding@home GPU Core21 Folding@home Core
19:02:01:WU02:FS01:0x21:Version 0.0.18
19:02:02:WU02:FS01:0x21:Completed 0 out of 6250000 steps (0%)
19:02:02:WU02:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
19:02:03:WU03:FS01:Upload complete
19:02:03:WU03:FS01:Server responded WORK_ACK (400)
19:02:03:WU03:FS01:Final credit estimate, 53093.00 points
19:02:03:WU03:FS01:Cleaning up
19:02:42:WU02:FS01:0x21:Completed 62500 out of 6250000 steps (1%)
19:03:21:WU02:FS01:0x21:Completed 125000 out of 6250000 steps (2%)
19:04:01:WU02:FS01:0x21:Completed 187500 out of 6250000 steps (3%)
19:04:41:WU02:FS01:0x21:Completed 250000 out of 6250000 steps (4%)
19:05:21:WU02:FS01:0x21:Completed 312500 out of 6250000 steps (5%)
19:06:00:WU02:FS01:0x21:Completed 375000 out of 6250000 steps (6%)
19:06:40:WU02:FS01:0x21:Completed 437500 out of 6250000 steps (7%)
19:07:20:WU02:FS01:0x21:Completed 500000 out of 6250000 steps (8%)
19:07:59:WU02:FS01:0x21:Completed 562500 out of 6250000 steps (9%)
19:08:39:WU02:FS01:0x21:Completed 625000 out of 6250000 steps (10%)
19:09:19:WU02:FS01:0x21:Completed 687500 out of 6250000 steps (11%)
19:09:59:WU02:FS01:0x21:Completed 750000 out of 6250000 steps (12%)
19:10:39:WU02:FS01:0x21:Completed 812500 out of 6250000 steps (13%)
19:11:19:WU02:FS01:0x21:Completed 875000 out of 6250000 steps (14%)
19:11:58:WU02:FS01:0x21:Completed 937500 out of 6250000 steps (15%)
19:12:38:WU02:FS01:0x21:Completed 1000000 out of 6250000 steps (16%)
19:13:18:WU02:FS01:0x21:Completed 1062500 out of 6250000 steps (17%)
19:13:58:WU02:FS01:0x21:Completed 1125000 out of 6250000 steps (18%)
19:14:37:WU02:FS01:0x21:Completed 1187500 out of 6250000 steps (19%)
19:15:17:WU02:FS01:0x21:Completed 1250000 out of 6250000 steps (20%)
19:15:57:WU02:FS01:0x21:Completed 1312500 out of 6250000 steps (21%)
19:16:36:WU02:FS01:0x21:Completed 1375000 out of 6250000 steps (22%)
19:17:16:WU02:FS01:0x21:Completed 1437500 out of 6250000 steps (23%)
19:17:56:WU02:FS01:0x21:Completed 1500000 out of 6250000 steps (24%)
19:18:36:WU02:FS01:0x21:Completed 1562500 out of 6250000 steps (25%)
19:19:15:WU02:FS01:0x21:Completed 1625000 out of 6250000 steps (26%)
19:19:55:WU02:FS01:0x21:Completed 1687500 out of 6250000 steps (27%)
19:20:35:WU02:FS01:0x21:Completed 1750000 out of 6250000 steps (28%)
19:21:15:WU02:FS01:0x21:Completed 1812500 out of 6250000 steps (29%)
19:21:55:WU02:FS01:0x21:Completed 1875000 out of 6250000 steps (30%)
19:22:34:WU02:FS01:0x21:Completed 1937500 out of 6250000 steps (31%)
19:23:14:WU02:FS01:0x21:Completed 2000000 out of 6250000 steps (32%)
19:23:54:WU02:FS01:0x21:Completed 2062500 out of 6250000 steps (33%)
19:24:34:WU02:FS01:0x21:Completed 2125000 out of 6250000 steps (34%)
19:25:13:WU02:FS01:0x21:Completed 2187500 out of 6250000 steps (35%)
19:25:53:WU02:FS01:0x21:Completed 2250000 out of 6250000 steps (36%)
19:26:33:WU02:FS01:0x21:Completed 2312500 out of 6250000 steps (37%)
19:27:12:WU02:FS01:0x21:Completed 2375000 out of 6250000 steps (38%)
19:27:52:WU02:FS01:0x21:Completed 2437500 out of 6250000 steps (39%)
19:28:32:WU02:FS01:0x21:Completed 2500000 out of 6250000 steps (40%)
19:29:12:WU02:FS01:0x21:Completed 2562500 out of 6250000 steps (41%)
19:29:51:WU02:FS01:0x21:Completed 2625000 out of 6250000 steps (42%)
19:30:31:WU02:FS01:0x21:Completed 2687500 out of 6250000 steps (43%)
19:31:11:WU02:FS01:0x21:Completed 2750000 out of 6250000 steps (44%)
19:31:50:WU02:FS01:0x21:Completed 2812500 out of 6250000 steps (45%)
19:32:30:WU02:FS01:0x21:Completed 2875000 out of 6250000 steps (46%)
19:33:10:WU02:FS01:0x21:Completed 2937500 out of 6250000 steps (47%)
19:33:49:WU02:FS01:0x21:Completed 3000000 out of 6250000 steps (48%)
19:34:29:WU02:FS01:0x21:Completed 3062500 out of 6250000 steps (49%)
19:35:09:WU02:FS01:0x21:Completed 3125000 out of 6250000 steps (50%)
19:35:48:WU02:FS01:0x21:Completed 3187500 out of 6250000 steps (51%)
19:36:28:WU02:FS01:0x21:Completed 3250000 out of 6250000 steps (52%)
19:37:08:WU02:FS01:0x21:Completed 3312500 out of 6250000 steps (53%)
19:37:48:WU02:FS01:0x21:Completed 3375000 out of 6250000 steps (54%)
19:38:28:WU02:FS01:0x21:Completed 3437500 out of 6250000 steps (55%)
19:39:07:WU02:FS01:0x21:Completed 3500000 out of 6250000 steps (56%)
19:39:47:WU02:FS01:0x21:Completed 3562500 out of 6250000 steps (57%)
19:40:27:WU02:FS01:0x21:Completed 3625000 out of 6250000 steps (58%)
19:41:06:WU02:FS01:0x21:Completed 3687500 out of 6250000 steps (59%)
19:41:46:WU02:FS01:0x21:Completed 3750000 out of 6250000 steps (60%)
19:42:26:WU02:FS01:0x21:Completed 3812500 out of 6250000 steps (61%)
19:43:06:WU02:FS01:0x21:Completed 3875000 out of 6250000 steps (62%)
19:43:45:WU02:FS01:0x21:Completed 3937500 out of 6250000 steps (63%)
19:44:25:WU02:FS01:0x21:Completed 4000000 out of 6250000 steps (64%)
19:45:05:WU02:FS01:0x21:Completed 4062500 out of 6250000 steps (65%)
19:45:44:WU02:FS01:0x21:Completed 4125000 out of 6250000 steps (66%)
19:46:24:WU02:FS01:0x21:Completed 4187500 out of 6250000 steps (67%)
19:47:04:WU02:FS01:0x21:Completed 4250000 out of 6250000 steps (68%)
19:47:44:WU02:FS01:0x21:Completed 4312500 out of 6250000 steps (69%)
19:48:23:WU02:FS01:0x21:Completed 4375000 out of 6250000 steps (70%)
19:49:03:WU02:FS01:0x21:Completed 4437500 out of 6250000 steps (71%)
19:49:43:WU02:FS01:0x21:Completed 4500000 out of 6250000 steps (72%)
19:50:23:WU02:FS01:0x21:Completed 4562500 out of 6250000 steps (73%)
19:51:02:WU02:FS01:0x21:Completed 4625000 out of 6250000 steps (74%)
19:51:42:WU02:FS01:0x21:Completed 4687500 out of 6250000 steps (75%)
19:52:22:WU02:FS01:0x21:Completed 4750000 out of 6250000 steps (76%)
19:53:02:WU02:FS01:0x21:Completed 4812500 out of 6250000 steps (77%)
19:53:41:WU02:FS01:0x21:Completed 4875000 out of 6250000 steps (78%)
19:54:21:WU02:FS01:0x21:Completed 4937500 out of 6250000 steps (79%)
19:55:00:WU02:FS01:0x21:Completed 5000000 out of 6250000 steps (80%)
19:55:40:WU02:FS01:0x21:Completed 5062500 out of 6250000 steps (81%)
19:56:20:WU02:FS01:0x21:Completed 5125000 out of 6250000 steps (82%)
19:57:00:WU02:FS01:0x21:Completed 5187500 out of 6250000 steps (83%)
19:57:40:WU02:FS01:0x21:Completed 5250000 out of 6250000 steps (84%)
19:58:19:WU02:FS01:0x21:Completed 5312500 out of 6250000 steps (85%)
19:58:59:WU02:FS01:0x21:Completed 5375000 out of 6250000 steps (86%)
19:59:39:WU02:FS01:0x21:Completed 5437500 out of 6250000 steps (87%)
20:00:18:WU02:FS01:0x21:Completed 5500000 out of 6250000 steps (88%)
20:00:58:WU02:FS01:0x21:Completed 5562500 out of 6250000 steps (89%)
20:01:38:WU02:FS01:0x21:Completed 5625000 out of 6250000 steps (90%)
20:02:18:WU02:FS01:0x21:Completed 5687500 out of 6250000 steps (91%)
20:02:57:WU02:FS01:0x21:Completed 5750000 out of 6250000 steps (92%)
20:03:38:WU02:FS01:0x21:Completed 5812500 out of 6250000 steps (93%)
20:04:18:WU02:FS01:0x21:Completed 5875000 out of 6250000 steps (94%)
20:04:57:WU02:FS01:0x21:Completed 5937500 out of 6250000 steps (95%)
20:05:37:WU02:FS01:0x21:Completed 6000000 out of 6250000 steps (96%)
20:06:17:WU02:FS01:0x21:Completed 6062500 out of 6250000 steps (97%)
20:06:57:WU02:FS01:0x21:Completed 6125000 out of 6250000 steps (98%)
20:07:36:WU02:FS01:0x21:Completed 6187500 out of 6250000 steps (99%)
20:07:39:WU01:FS01:Connecting to 171.67.108.45:80
20:08:16:WU02:FS01:0x21:Completed 6250000 out of 6250000 steps (100%)
20:08:16:WU02:FS01:0x21:Saving result file logfile_01.txt
20:08:16:WU02:FS01:0x21:Saving result file checkpointState.xml
20:08:16:WU02:FS01:0x21:Saving result file checkpt.crc
20:08:16:WU02:FS01:0x21:Saving result file log.txt
20:08:16:WU02:FS01:0x21:Saving result file positions.xtc
20:08:16:WU02:FS01:0x21:Folding@home Core Shutdown: FINISHED_UNIT
20:08:17:WU02:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
20:08:17:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:9415 run:1680 clone:0 gen:344 core:0x21 unit:0x00000187ab436c9d585e06d8baf4a0ca
20:08:17:WU02:FS01:Uploading 7.80MiB to 171.67.108.157
20:08:17:WU02:FS01:Connecting to 171.67.108.157:8080
20:08:20:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
20:08:20:WU01:FS01:Connecting to 171.64.65.35:80
20:09:06:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
20:09:06:ERROR:WU01:FS01:Exception: Could not get an assignment
20:09:06:WU01:FS01:Connecting to 171.67.108.45:80
20:09:50:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
20:09:50:WU01:FS01:Connecting to 171.64.65.35:80
20:10:24:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
20:10:24:WU02:FS01:Connecting to 171.67.108.157:80
20:10:24:WU02:FS01:Upload 0.80%
20:10:34:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
20:10:34:ERROR:WU01:FS01:Exception: Could not get an assignment
20:10:34:WU01:FS01:Connecting to 171.67.108.45:80
20:11:10:WU02:FS01:Upload 3.20%
20:11:10:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
20:11:10:WU02:FS01:Trying to send results to collection server
20:11:10:WU02:FS01:Uploading 7.80MiB to 171.67.108.46
20:11:10:WU02:FS01:Connecting to 171.67.108.46:8080
20:11:22:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
20:11:22:WU01:FS01:Connecting to 171.64.65.35:80
20:12:07:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
20:12:07:ERROR:WU01:FS01:Exception: Could not get an assignment
20:12:11:WU01:FS01:Connecting to 171.67.108.45:80
20:12:52:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
20:12:52:WU01:FS01:Connecting to 171.64.65.35:80
20:13:17:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
20:13:17:WU02:FS01:Connecting to 171.67.108.46:80
20:13:17:WU02:FS01:Upload 0.80%
20:13:37:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
20:13:37:ERROR:WU01:FS01:Exception: Could not get an assignment
20:13:58:WU02:FS01:Upload 3.20%
20:13:58:ERROR:WU02:FS01:Exception: Transfer failed
20:13:58:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:9415 run:1680 clone:0 gen:344 core:0x21 unit:0x00000187ab436c9d585e06d8baf4a0ca
20:13:58:WU02:FS01:Uploading 7.80MiB to 171.67.108.157
20:13:58:WU02:FS01:Connecting to 171.67.108.157:8080
20:14:49:WU01:FS01:Connecting to 171.67.108.45:80
20:15:34:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
20:15:34:WU01:FS01:Connecting to 171.64.65.35:80
20:16:06:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
20:16:06:WU02:FS01:Connecting to 171.67.108.157:80
20:16:06:WU02:FS01:Upload 0.80%
20:16:22:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
20:16:22:ERROR:WU01:FS01:Exception: Could not get an assignment
20:16:55:WU02:FS01:Upload 3.20%
20:16:55:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
20:16:55:WU02:FS01:Trying to send results to collection server
20:16:55:WU02:FS01:Uploading 7.80MiB to 171.67.108.46
20:16:55:WU02:FS01:Connecting to 171.67.108.46:8080
20:19:02:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
20:19:02:WU02:FS01:Connecting to 171.67.108.46:80
20:19:02:WU02:FS01:Upload 0.80%
20:19:03:WU01:FS01:Connecting to 171.67.108.45:80
20:19:44:WU02:FS01:Upload 3.20%
20:19:44:ERROR:WU02:FS01:Exception: Transfer failed
20:19:44:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:9415 run:1680 clone:0 gen:344 core:0x21 unit:0x00000187ab436c9d585e06d8baf4a0ca
20:19:44:WU02:FS01:Uploading 7.80MiB to 171.67.108.157
20:19:44:WU02:FS01:Connecting to 171.67.108.157:8080
20:19:44:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
20:19:44:WU01:FS01:Connecting to 171.64.65.35:80
20:20:28:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
20:20:28:ERROR:WU01:FS01:Exception: Could not get an assignment
20:21:51:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
20:21:51:WU02:FS01:Connecting to 171.67.108.157:80
20:21:51:WU02:FS01:Upload 0.80%
20:22:40:WU02:FS01:Upload 3.20%
20:22:40:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
20:22:40:WU02:FS01:Trying to send results to collection server
20:22:40:WU02:FS01:Uploading 7.80MiB to 171.67.108.46
20:22:40:WU02:FS01:Connecting to 171.67.108.46:8080
20:24:47:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
20:24:47:WU02:FS01:Connecting to 171.67.108.46:80
20:24:47:WU02:FS01:Upload 0.80%
20:25:30:WU02:FS01:Upload 3.20%
20:25:30:ERROR:WU02:FS01:Exception: Transfer failed
20:25:30:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:9415 run:1680 clone:0 gen:344 core:0x21 unit:0x00000187ab436c9d585e06d8baf4a0ca
20:25:30:WU02:FS01:Uploading 7.80MiB to 171.67.108.157
20:25:30:WU02:FS01:Connecting to 171.67.108.157:8080
20:25:55:WU01:FS01:Connecting to 171.67.108.45:80
20:26:42:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
20:26:42:WU01:FS01:Connecting to 171.64.65.35:80
20:27:31:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
20:27:31:ERROR:WU01:FS01:Exception: Could not get an assignment
20:27:37:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
20:27:37:WU02:FS01:Connecting to 171.67.108.157:80
20:27:37:WU02:FS01:Upload 0.80%
20:28:22:WU02:FS01:Upload 3.20%
20:28:22:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
20:28:22:WU02:FS01:Trying to send results to collection server
20:28:22:WU02:FS01:Uploading 7.80MiB to 171.67.108.46
20:28:22:WU02:FS01:Connecting to 171.67.108.46:8080
20:30:29:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
20:30:29:WU02:FS01:Connecting to 171.67.108.46:80
20:30:29:WU02:FS01:Upload 0.80%
20:31:16:WU02:FS01:Upload 3.20%
20:31:16:ERROR:WU02:FS01:Exception: Transfer failed
20:31:16:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:9415 run:1680 clone:0 gen:344 core:0x21 unit:0x00000187ab436c9d585e06d8baf4a0ca
20:31:16:WU02:FS01:Uploading 7.80MiB to 171.67.108.157
20:31:16:WU02:FS01:Connecting to 171.67.108.157:8080
20:33:24:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
20:33:24:WU02:FS01:Connecting to 171.67.108.157:80
20:33:24:WU02:FS01:Upload 0.80%
20:34:10:WU02:FS01:Upload 3.20%
20:34:10:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
20:34:10:WU02:FS01:Trying to send results to collection server
20:34:10:WU02:FS01:Uploading 7.80MiB to 171.67.108.46
20:34:10:WU02:FS01:Connecting to 171.67.108.46:8080
20:36:17:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
20:36:17:WU02:FS01:Connecting to 171.67.108.46:80
20:36:17:WU02:FS01:Upload 0.80%
20:36:58:WU02:FS01:Upload 3.20%
20:36:58:ERROR:WU02:FS01:Exception: Transfer failed
20:36:58:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:9415 run:1680 clone:0 gen:344 core:0x21 unit:0x00000187ab436c9d585e06d8baf4a0ca
20:36:58:WU02:FS01:Uploading 7.80MiB to 171.67.108.157
20:36:58:WU02:FS01:Connecting to 171.67.108.157:8080
20:37:00:WU01:FS01:Connecting to 171.67.108.45:80
20:37:46:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0
20:37:46:WU01:FS01:Connecting to 171.64.65.35:80
20:38:28:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0
20:38:28:ERROR:WU01:FS01:Exception: Could not get an assignment
20:39:06:WARNING:WU02:FS01:WorkServer connection failed on port 8080 trying 80
20:39:06:WU02:FS01:Connecting to 171.67.108.157:80
20:39:06:WU02:FS01:Upload 0.80%
20:39:46:WU02:FS01:Upload 3.20%
20:39:46:WARNING:WU02:FS01:Exception: Failed to send results to work server: Transfer failed
20:39:46:WU02:FS01:Trying to send results to collection server
20:39:46:WU02:FS01:Uploading 7.80MiB to 171.67.108.46
20:39:46:WU02:FS01:Connecting to 171.67.108.46:8080


I have 4 CPU servers and 1 server with 2 GPUs experiencing this.

Maybe related to the fixing procedures for the stats servers, but I thought I'd mention it.
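For anyone who wants to summarize a log like the one above, here is a minimal Python sketch that counts the "Failed to get assignment" warnings per server. It is not part of the FAH client; the function name and the regular expression are my own and are based only on the warning lines as they appear in this log.

```python
import re
from collections import Counter

# Matches the client's assignment-failure warnings, for example:
# 20:08:20:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: ...
FAILED_ASSIGNMENT = re.compile(r"Failed to get assignment from '([^']+)'")


def count_assignment_failures(log_lines):
    """Return a Counter mapping 'host:port' to the number of failed assignment requests."""
    failures = Counter()
    for line in log_lines:
        match = FAILED_ASSIGNMENT.search(line)
        if match:
            failures[match.group(1)] += 1
    return failures


if __name__ == "__main__":
    sample = [
        "20:08:20:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0",
        "20:09:06:WARNING:WU01:FS01:Failed to get assignment from '171.64.65.35:80': 10002: Received short response, expected 272 bytes, got 0",
        "20:09:50:WARNING:WU01:FS01:Failed to get assignment from '171.67.108.45:80': 10002: Received short response, expected 272 bytes, got 0",
    ]
    for server, count in count_assignment_failures(sample).most_common():
        print(server, count)
```

Feeding it the whole log file (one line per list entry) should show whether both assignment servers fail equally often or only one of them does.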
Wreck3r
 
Posts: 10
Joined: Sat Dec 17, 2011 10:06 am

Re: 171.67.108.45 and 171.64.65.35

Postby Wreck3r » Wed Sep 13, 2017 8:47 pm

Looks like I was much too impatient. Everything is back to normal.

Sorry
Wreck3r
 
Posts: 10
Joined: Sat Dec 17, 2011 10:06 am

Re: 171.67.108.45 and 171.64.65.35

Postby nsummy » Wed Sep 13, 2017 10:16 pm

The same thing happened to me around the same time this afternoon; it ended up resolving itself, though.
nsummy
 
Posts: 4
Joined: Fri Aug 11, 2017 1:46 pm

Re: 171.67.108.45 and 171.64.65.35

Postby bruce » Wed Sep 13, 2017 11:18 pm

New assignments are received from a Work Server somewhere in FAH's domain. Not every WU can be processed by every system, but normally there will be more than one WS that can give you a new assignment. The client is designed to retry periodically if there's a problem downloading a new WU.

When a WU has been processed, it's generally uploaded to the same WS which assigned it. All servers have occasional problems, and if there's a problem with the WS, the WU is uploaded to a Collection Server, which will pass it on to the WS when it can. That upload transaction is between your client and the applicable server. If for some reason BOTH servers are unavailable, the client holds on to the result file and retries sending it until it is successful.

Periodically (usually on the hour) the WS sends a report to the stats subsystem of the WUs it has received. If this cannot be sent because of some kind of failure, the reports are held and retransmitted later. These results are added to a gigantic stats database periodically and made available for online queries. Periodically, flat files of that same data are generated to reduce database overhead while still providing the information to 3rd-party stats.

All or most of the steps in paragraph 2 are processed by a server that's getting old and has become unreliable. The data on that server is currently being transferred to new hardware.

Your errors relate to the steps in paragraph 1 and have nothing to do with the current data migration effort.

Data for each step along the way is backed up in a way that it can be regenerated -- so it is EXTREMELY unlikely that anything is lost, although it can be delayed.
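As a rough illustration of the upload behaviour described above (work server first, then the collection server, then hold the result and retry), here is a minimal Python sketch. It is not the client's actual code: `upload_result`, the `send` callable, and the retry delay are all hypothetical names and values chosen for the example.

```python
import time


def upload_result(send, work_server, collection_server,
                  retry_delay=60.0, max_rounds=None, sleep=time.sleep):
    """Try the work server, then the collection server, and repeat until one accepts.

    `send(server)` is a hypothetical callable that returns True on success and
    raises OSError when the transfer fails. Returns the server that accepted
    the result, or None if `max_rounds` rounds were tried without success.
    """
    rounds = 0
    while True:
        for server in (work_server, collection_server):
            try:
                if send(server):
                    return server
            except OSError:
                pass  # transfer failed; move on to the next server
        rounds += 1
        if max_rounds is not None and rounds >= max_rounds:
            return None
        # Both servers were unavailable: keep the result file and retry later.
        sleep(retry_delay)
```

In the first log in this thread, 171.67.108.157 plays the role of the work server and 171.67.108.46 the collection server, and the client cycles between them in this order after each "Transfer failed".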
bruce
 
Posts: 20834
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 171.67.108.45 and 171.64.65.35

Postby Wreck3r » Thu Nov 30, 2017 8:24 am

Hi again,

The issues with these two servers started again:

Code:
*********************** Log Started 2017-11-30T08:15:11Z ***********************
08:15:11:************************* Folding@home Client *************************
08:15:11:    Website: http://folding.stanford.edu/
08:15:11:  Copyright: (c) 2009-2014 Stanford University
08:15:11:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
08:15:11:       Args: --child --lifeline 15522 /etc/fahclient/config.xml --run-as
08:15:11:             fahclient --pid-file=/var/run/fahclient.pid --daemon
08:15:11:     Config: /etc/fahclient/config.xml
08:15:11:******************************** Build ********************************
08:15:11:    Version: 7.4.4
08:15:11:       Date: Mar 4 2014
08:15:11:       Time: 12:02:38
08:15:11:    SVN Rev: 4130
08:15:11:     Branch: fah/trunk/client
08:15:11:   Compiler: GNU 4.4.7
08:15:11:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
08:15:11:             -fno-unsafe-math-optimizations -msse2
08:15:11:   Platform: linux2 3.2.0-1-amd64
08:15:11:       Bits: 64
08:15:11:       Mode: Release
08:15:11:******************************* System ********************************
08:15:11:        CPU: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
08:15:11:     CPU ID: GenuineIntel Family 6 Model 79 Stepping 1
08:15:11:       CPUs: 28
08:15:11:     Memory: 5.83GiB
08:15:11:Free Memory: 4.10GiB
08:15:11:    Threads: POSIX_THREADS
08:15:11: OS Version: 3.19
08:15:11:Has Battery: false
08:15:11: On Battery: false
08:15:11: UTC Offset: 2
08:15:11:        PID: 15524
08:15:11:        CWD: /var/lib/fahclient
08:15:11:         OS: Linux 3.19.0-25-generic x86_64
08:15:11:    OS Arch: AMD64
08:15:11:       GPUs: 0
08:15:11:       CUDA: Not detected
08:15:11:***********************************************************************
08:15:11:<config>
08:15:11:  <!-- Client Control -->
08:15:11:  <fold-anon v='true'/>
08:15:11:
08:15:11:  <!-- Folding Core -->
08:15:11:  <core-priority v='low'/>
08:15:11:
08:15:11:  <!-- Folding Slot Configuration -->
08:15:11:  <gpu v='false'/>
08:15:11:
08:15:11:  <!-- HTTP Server -->
08:15:11:  <allow v='127.0.0.1,192.168.100.223,192.168.34.0/24'/>
08:15:11:
08:15:11:  <!-- Network -->
08:15:11:  <proxy v=':8080'/>
08:15:11:
08:15:11:  <!-- Remote Command Server -->
08:15:11:  <command-allow-no-pass v='127.0.0.1,192.168.100.223,192.168.34.0/24'/>
08:15:11:  <password v='*******'/>
08:15:11:
08:15:11:  <!-- Slot Control -->
08:15:11:  <power v='full'/>
08:15:11:
08:15:11:  <!-- User Information -->
08:15:11:  <passkey v='********************************'/>
08:15:11:  <team v='224497'/>
08:15:11:  <user v='Wreck3r_ALL_1HXKxNtoQj5Pu7SdcHy22z4yecxYzkADxD'/>
08:15:11:
08:15:11:  <!-- Folding Slots -->
08:15:11:  <slot id='0' type='CPU'/>
08:15:11:</config>
08:15:11:Switching to user fahclient
08:15:11:Trying to access database...
08:15:11:Successfully acquired database lock
08:15:11:Enabled folding slot 00: READY cpu:28
08:15:11:WU00:FS00:Connecting to 171.67.108.45:8080
08:15:12:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.45:8080': Empty work server assignment
08:15:12:WU00:FS00:Connecting to 171.64.65.35:80
08:15:13:WU00:FS00:Assigned to work server 155.247.166.219
08:15:13:WU00:FS00:Requesting new work unit for slot 00: READY cpu:28 from 155.247.166.219
08:15:13:WU00:FS00:Connecting to 155.247.166.219:8080
08:15:13:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
08:15:13:WU00:FS00:Connecting to 155.247.166.219:80
08:15:57:ERROR:WU00:FS00:Exception: 10002: Received short response, expected 512 bytes, got 0
08:15:57:WU00:FS00:Connecting to 171.67.108.45:8080
08:15:58:WARNING:WU00:FS00:Failed to get assignment from '171.67.108.45:8080': Empty work server assignment
08:15:58:WU00:FS00:Connecting to 171.64.65.35:80
08:15:59:WU00:FS00:Assigned to work server 155.247.166.219
08:15:59:WU00:FS00:Requesting new work unit for slot 00: READY cpu:28 from 155.247.166.219
08:15:59:WU00:FS00:Connecting to 155.247.166.219:8080
08:15:59:WARNING:WU00:FS00:WorkServer connection failed on port 8080 trying 80
08:15:59:WU00:FS00:Connecting to 155.247.166.219:80


If I restart the client multiple times, it starts working and takes work from 134.139.52.2:8080 by connecting to 171.67.108.45:8080, but without intervention, it just hangs and does nothing.

thanks
Wreck3r
 
Posts: 10
Joined: Sat Dec 17, 2011 10:06 am

Re: 171.67.108.45 and 171.64.65.35

Postby Joe_H » Thu Nov 30, 2017 6:32 pm

Your problem has nothing to do with these Assignment Servers, but with the settings for your client. It is requesting work with a setting of CPU:28, a multiple of 7, and there are few projects still assigning to multiples of 7. So if a WS with one of those projects is temporarily offline, or all WU's for those projects are currently out, you will not get connected to a WS and get a WU to download.

You may have better chances of getting available work using CPU settings of 27 or 24. 26 as a multiple of 13 will not be a good setting, and 25 is a multiple of 5 that sometimes is also restricted from being used by some projects.

Or you can use the public beta version of the client, 7.4.16. It has new code that works with the updated code on the servers to get a WU assigned that will use as many CPU cores as possible up to the number set in the request for work.
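The rule of thumb above can be sketched in a few lines of Python. This only encodes the advice in this post (avoid counts with a prime factor of 7 or more, treat a factor of 5 as sometimes restricted); the function names and rating labels are made up for illustration, and the servers do not necessarily apply exactly this test.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (returns 1 for n == 1)."""
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        largest = n  # whatever remains is a prime larger than any factor found so far
    return largest


def rate_cpu_count(n):
    """Rate a CPU slot size using the rule of thumb from this post."""
    lpf = largest_prime_factor(n)
    if lpf >= 7:
        return "avoid"
    if lpf == 5:
        return "sometimes restricted"
    return "ok"


if __name__ == "__main__":
    for cpus in range(28, 23, -1):
        print(cpus, rate_cpu_count(cpus))
```

For the counts discussed here, 28 and 26 come out as "avoid", 25 as "sometimes restricted", and 27 and 24 as "ok".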

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Site Admin
 
Posts: 3897
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: 171.67.108.45 and 171.64.65.35

Postby Wreck3r » Fri Dec 01, 2017 12:08 pm

Thanks for the reply.

I tried changing to 27, but still no jobs were assigned. Moving to 24 assigns A4 cores, which are not as efficient (points-wise) as A7. I also tried the beta client and still encounter the same problem.

Code:
*********************** Log Started 2017-12-01T12:05:54Z ***********************
12:05:54:************************* Folding@home Client *************************
12:05:54:    Website: http://folding.stanford.edu/
12:05:54:  Copyright: (c) 2009-2014 Stanford University
12:05:54:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
12:05:54:       Args: --child --lifeline 4470 /etc/fahclient/config.xml --run-as
12:05:54:             fahclient --pid-file=/var/run/fahclient.pid --daemon
12:05:54:     Config: /etc/fahclient/config.xml
12:05:54:******************************** Build ********************************
12:05:54:    Version: 7.4.4
12:05:54:       Date: Mar 4 2014
12:05:54:       Time: 12:02:38
12:05:54:    SVN Rev: 4130
12:05:54:     Branch: fah/trunk/client
12:05:54:   Compiler: GNU 4.4.7
12:05:54:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
12:05:54:             -fno-unsafe-math-optimizations -msse2
12:05:54:   Platform: linux2 3.2.0-1-amd64
12:05:54:       Bits: 64
12:05:54:       Mode: Release
12:05:54:******************************* System ********************************
12:05:54:        CPU: Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
12:05:54:     CPU ID: GenuineIntel Family 6 Model 79 Stepping 1
12:05:54:       CPUs: 28
12:05:54:     Memory: 5.83GiB
12:05:54:Free Memory: 4.34GiB
12:05:54:    Threads: POSIX_THREADS
12:05:54: OS Version: 3.19
12:05:54:Has Battery: false
12:05:54: On Battery: false
12:05:54: UTC Offset: 2
12:05:54:        PID: 4472
12:05:54:        CWD: /var/lib/fahclient
12:05:54:         OS: Linux 3.19.0-25-generic x86_64
12:05:54:    OS Arch: AMD64
12:05:54:       GPUs: 0
12:05:54:       CUDA: Not detected
12:05:54:***********************************************************************
12:05:54:<config>
12:05:54:  <!-- Client Control -->
12:05:54:  <fold-anon v='true'/>
12:05:54:
12:05:54:  <!-- Folding Core -->
12:05:54:  <core-priority v='low'/>
12:05:54:
12:05:54:  <!-- Folding Slot Configuration -->
12:05:54:  <gpu v='false'/>
12:05:54:
12:05:54:  <!-- HTTP Server -->
12:05:54:  <allow v='127.0.0.1 192.168.100.223'/>
12:05:54:
12:05:54:  <!-- Network -->
12:05:54:  <proxy v=':8080'/>
12:05:54:
12:05:54:  <!-- Remote Command Server -->
12:05:54:  <command-allow-no-pass v='127.0.0.1 192.168.100.223'/>
12:05:54:  <password v='*******'/>
12:05:54:
12:05:54:  <!-- Slot Control -->
12:05:54:  <power v='full'/>
12:05:54:
12:05:54:  <!-- User Information -->
12:05:54:  <passkey v='********************************'/>
12:05:54:  <team v='224497'/>
12:05:54:  <user v='Wreck3r_ALL_1HXKxNtoQj5Pu7SdcHy22z4yecxYzkADxD'/>
12:05:54:
12:05:54:  <!-- Folding Slots -->
12:05:54:  <slot id='0' type='CPU'>
12:05:54:    <cpus v='27'/>
12:05:54:  </slot>
12:05:54:</config>
12:05:54:Switching to user fahclient
12:05:54:Trying to access database...
12:05:56:Successfully acquired database lock
12:05:56:Enabled folding slot 00: READY cpu:27
12:05:56:WU00:FS00:Connecting to 171.67.108.45:8080
12:05:57:WU00:FS00:Assigned to work server 134.139.52.2
12:05:57:WU00:FS00:Requesting new work unit for slot 00: READY cpu:27 from 134.139.52.2
12:05:57:WU00:FS00:Connecting to 134.139.52.2:8080
12:05:58:ERROR:WU00:FS00:Exception: Server did not assign work unit
12:05:58:WU00:FS00:Connecting to 171.67.108.45:8080
12:05:59:WU00:FS00:Assigned to work server 134.139.52.2
12:05:59:WU00:FS00:Requesting new work unit for slot 00: READY cpu:27 from 134.139.52.2
12:05:59:WU00:FS00:Connecting to 134.139.52.2:8080
12:06:00:ERROR:WU00:FS00:Exception: Server did not assign work unit
Wreck3r
 
Posts: 10
Joined: Sat Dec 17, 2011 10:06 am

Re: 171.67.108.45 and 171.64.65.35

Postby Joe_H » Fri Dec 01, 2017 5:48 pm

What you will get at any one time will depend on which projects currently have WU's available. I am getting a mix of about half A4 and half A7 WU's, some are coming from the same servers. More A7 than A4 projects are being created at this point, many have moved to Advanced or regular FAH folding status.

Another possibility with that many CPU cores available is to create two CPU slots, possibly one for 16 and another for 12 cores. That may get you a different mix of projects. Whichever numbers you choose, they should be multiples of 2, 3, and possibly 5. Multiples of 7 and primes larger than that should be avoided with the 7.4.4 client, as it does not have the code to negotiate for a WU like the beta client.
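To make the two-slot idea concrete, here is a small Python sketch that prints `<slot>` entries in the same shape as the config dump in the 27-CPU log posted earlier in this thread. The helper name `render_slots` is hypothetical, and I am assuming a second slot simply takes the next `id`; check the result against your own config.xml (or use the client's own configuration tools) before relying on it.

```python
def render_slots(cpu_counts):
    """Render one <slot> element per entry, mirroring the config dump in the logs above."""
    lines = []
    for slot_id, cpus in enumerate(cpu_counts):
        lines.append(f"<slot id='{slot_id}' type='CPU'>")
        lines.append(f"  <cpus v='{cpus}'/>")
        lines.append("</slot>")
    return "\n".join(lines)


if __name__ == "__main__":
    # One slot with 16 cores and one with 12, as suggested above.
    print(render_slots([16, 12]))
```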
Joe_H
Site Admin
 
Posts: 3897
Joined: Tue Apr 21, 2009 4:41 pm
Location: W. MA

Re: 171.67.108.45 and 171.64.65.35

Postby rwh202 » Fri Dec 01, 2017 6:19 pm

I think there is just a general shortage of work (or servers configured to give out work) for high core count at the moment.

I thought I'd leave my 32 thread workstation busy over the weekend, but nothing for 32 or 24, so settled on 16 to be certain it wouldn't be left idle.
rwh202
 
Posts: 296
Joined: Mon Nov 15, 2010 8:51 pm
Location: South Coast, UK

Re: 171.67.108.45 and 171.64.65.35

Postby JimboPalmer » Fri Dec 01, 2017 6:48 pm

rwh202 wrote: I thought I'd leave my 32 thread workstation busy over the weekend, but nothing for 32 or 24, so settled on 16 to be certain it wouldn't be left idle.

With 16, you can set two CPU slots of 16 each.
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
JimboPalmer
 
Posts: 534
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: 171.67.108.45 and 171.64.65.35

Postby bruce » Fri Dec 01, 2017 8:00 pm

Wreck3r wrote: I also tried the beta client and still encounter the same.


We can help with the beta client, but you'll need to post the log. All I see is V7.4.4
bruce
 
Posts: 20834
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 171.67.108.45 and 171.64.65.35

Postby Nathan_P » Sat Dec 02, 2017 7:41 am

I'm getting exactly the same message on a v7.4.16 client. The machine is currently off so I cannot post a log, but I thought .16 was supposed to react to issues like this and adapt accordingly.
Nathan_P
 
Posts: 1378
Joined: Wed Apr 01, 2009 9:22 pm
Location: Jersey, Channel islands

Re: 171.67.108.45 and 171.64.65.35

Postby bruce » Sat Dec 02, 2017 9:14 am

Several beta versions (including .16) do contain code that negotiates with the servers to select a WU with the maximum number of CPU threads (not exceeding whatever value you have set). When there are servers off-line or out of WUs, that number can go up or down, depending on whatever is available at that moment.

Post the log. It won't be EXACTLY the same -- see the 10th line.

If you resumed work on a WU that had already been downloaded with the other version, the negotiation process had already been completed and will not be repeated until you download a new WU.
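
To make the negotiation idea above concrete, here is a toy Python sketch. It is only an illustration of "largest thread count on offer that does not exceed your setting"; the function name and the list of offered counts are hypothetical, and the real client/server exchange is not shown in this thread.

```python
def negotiate_threads(configured, offered):
    """Pick the largest offered thread count that fits the configured limit.

    configured: the CPU thread count set on the slot.
    offered:    thread counts that work is currently available for
                (hypothetical input; how the client learns this is assumed).
    Returns None when nothing on offer fits.
    """
    usable = [n for n in offered if 1 <= n <= configured]
    return max(usable) if usable else None


if __name__ == "__main__":
    # With work only available for 16, 24 and 32 threads, a 48-thread slot
    # would settle on 32 under this toy rule.
    print(negotiate_threads(48, [16, 24, 32]))
```

Because the result depends on what is on offer at that moment, the same slot can end up with a different thread count from one download to the next.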
bruce
 
Posts: 20834
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 171.67.108.45 and 171.64.65.35

Postby Wreck3r » Sat Dec 02, 2017 10:24 am

I was a bit too quick to judge the beta client. After a few minutes it started downloading jobs as it should. There seems to be a shortage of A7 jobs, mostly of the 8000s; most of what I receive are 13000s, which have a lower payout.

Other than that, I had to move to a newer version of Ubuntu from 14.04 LTS, on which I could not get the client to start under any circumstances.

Thanks, keep up the good work.
Wreck3r
 
Posts: 10
Joined: Sat Dec 17, 2011 10:06 am

Re: 171.67.108.45 and 171.64.65.35

Postby Nathan_P » Sat Dec 02, 2017 10:33 am

Code: Select all
*********************** Log Started 2017-11-28T17:43:30Z ***********************
17:43:30:************************* Folding@home Client *************************
17:43:30:    Website: http://folding.stanford.edu/
17:43:30:  Copyright: (c) 2009-2016 Stanford University
17:43:30:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
17:43:30:       Args: --child --lifeline 2225 /etc/fahclient/config.xml --run-as
17:43:30:             fahclient --pid-file=/var/run/fahclient.pid --daemon
17:43:30:     Config: /etc/fahclient/config.xml
17:43:30:******************************** Build ********************************
17:43:30:    Version: 7.4.16
17:43:30:       Date: Jan 6 2017
17:43:30:       Time: 08:08:33
17:43:30: Repository: Git
17:43:30:   Revision: e12187cbb0bd6937c067b9749af011374563b7b9
17:43:30:     Branch: master
17:43:30:   Compiler: GNU 4.9.2
17:43:30:    Options: -std=gnu++98 -O3 -funroll-loops -ffast-math -mfpmath=sse
17:43:30:             -fno-unsafe-math-optimizations -msse2
17:43:30:   Platform: linux2 4.8.0-2-amd64
17:43:30:       Bits: 64
17:43:30:       Mode: Release
17:43:30:******************************* System ********************************
17:43:30:        CPU: Genuine Intel(R) CPU @ 2.00GHz
17:43:30:     CPU ID: GenuineIntel Family 6 Model 62 Stepping 2
17:43:30:       CPUs: 48
17:43:30:     Memory: 15.62GiB
17:43:30:Free Memory: 14.98GiB
17:43:30:    Threads: POSIX_THREADS
17:43:30: OS Version: 4.8
17:43:30:Has Battery: false
17:43:30: On Battery: false
17:43:30: UTC Offset: 0
17:43:30:        PID: 2227
17:43:30:        CWD: /var/lib/fahclient
17:43:30:         OS: Linux 4.8.0-53-generic x86_64
17:43:30:    OS Arch: AMD64
17:43:30:       GPUs: 0
17:43:30:       CUDA: Not detected: Failed to open dynamic library 'libcuda.so':
17:43:30:             libcuda.so: cannot open shared object file: No such file or
17:43:30:             directory
17:43:30:     OpenCL: Not detected: Failed to open dynamic library 'libOpenCL.so':
17:43:30:             libOpenCL.so: cannot open shared object file: No such file or
17:43:30:             directory
17:43:30:***********************************************************************


That's the start of the last log file.

Code: Select all
22:09:05:WU00:FS00:0xa7:Completed 495000 out of 500000 steps (99%)
22:09:06:WU01:FS00:Connecting to 171.67.108.45:8080
22:09:07:WU01:FS00:Assigned to work server 134.139.52.2
22:09:07:WU01:FS00:Requesting new work unit for slot 00: RUNNING cpu:48 from 134.139.52.2
22:09:07:WU01:FS00:Connecting to 134.139.52.2:8080
22:09:07:ERROR:WU01:FS00:Exception: Server did not assign work unit
22:09:07:WU01:FS00:Connecting to 171.67.108.45:8080
22:09:08:WU01:FS00:Assigned to work server 134.139.52.2
22:09:08:WU01:FS00:Requesting new work unit for slot 00: RUNNING cpu:48 from 134.139.52.2
22:09:08:WU01:FS00:Connecting to 134.139.52.2:8080
22:09:08:ERROR:WU01:FS00:Exception: Server did not assign work unit
22:09:33:WU00:FS00:0xa7:Completed 500000 out of 500000 steps (100%)
22:09:34:WU00:FS00:0xa7:Saving result file ../logfile_01.txt
22:09:34:WU00:FS00:0xa7:Saving result file frame88.cpt
22:09:34:WU00:FS00:0xa7:Saving result file frame88.edr
22:09:34:WU00:FS00:0xa7:Saving result file frame88.xtc
22:09:34:WU00:FS00:0xa7:Saving result file frame88_prev.cpt
22:09:34:WU00:FS00:0xa7:Saving result file science.log
22:09:34:WU00:FS00:0xa7:Folding@home Core Shutdown: FINISHED_UNIT
22:09:35:WU00:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
22:09:35:WU00:FS00:Sending unit results: id:00 state:SEND error:NO_ERROR project:8202 run:2 clone:93 gen:88 core:0xa7 unit:0x00000070868b340258ed3d70596f73f3
22:09:35:WU00:FS00:Uploading 6.18MiB to 134.139.52.2
22:09:35:WU00:FS00:Connecting to 134.139.52.2:8080
22:09:41:WU00:FS00:Upload complete
22:09:41:WU00:FS00:Server responded WORK_ACK (400)
22:09:41:WU00:FS00:Final credit estimate, 10875.00 points
22:09:41:WU00:FS00:Cleaning up
22:10:07:WU01:FS00:Connecting to 171.67.108.45:8080
22:10:08:WU01:FS00:Assigned to work server 134.139.52.2
22:10:08:WU01:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
22:10:08:WU01:FS00:Connecting to 134.139.52.2:8080
22:10:08:ERROR:WU01:FS00:Exception: Server did not assign work unit
22:11:45:WU01:FS00:Connecting to 171.67.108.45:8080
22:11:45:WU01:FS00:Assigned to work server 134.139.52.2
22:11:45:WU01:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
22:11:45:WU01:FS00:Connecting to 134.139.52.2:8080
22:11:46:ERROR:WU01:FS00:Exception: Server did not assign work unit
22:14:22:WU01:FS00:Connecting to 171.67.108.45:8080
22:14:23:WU01:FS00:Assigned to work server 134.139.52.2
22:14:23:WU01:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
22:14:23:WU01:FS00:Connecting to 134.139.52.2:8080
22:14:23:ERROR:WU01:FS00:Exception: Server did not assign work unit
22:18:36:WU01:FS00:Connecting to 171.67.108.45:8080
22:18:37:WU01:FS00:Assigned to work server 134.139.52.2
22:18:37:WU01:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
22:18:37:WU01:FS00:Connecting to 134.139.52.2:8080
22:18:38:ERROR:WU01:FS00:Exception: Server did not assign work unit
22:25:28:WU01:FS00:Connecting to 171.67.108.45:8080
22:25:29:WU01:FS00:Assigned to work server 134.139.52.2
22:25:29:WU01:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
22:25:29:WU01:FS00:Connecting to 134.139.52.2:8080
22:25:29:WU01:FS00:Downloading 4.12MiB
22:25:33:WU01:FS00:Download complete
22:25:33:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:8200 run:7 clone:45 gen:96 core:0xa7 unit:0x00000077868b340258ed3e2625347f72
22:25:33:WU01:FS00:Starting
22:25:33:WU01:FS00:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/AVX/Core_a7.fah/FahCore_a7 -dir 01 -suffix 01 -version 704 -lifeline 2227 -checkpoint 15 -np 48
22:25:33:WU01:FS00:Started FahCore on PID 20920
22:25:33:WU01:FS00:Core PID:20924
22:25:33:WU01:FS00:FahCore 0xa7 started
22:25:33:WU01:FS00:0xa7:*********************** Log Started 2017-11-30T22:25:33Z ***********************
22:25:33:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
22:25:33:WU01:FS00:0xa7:       Type: 0xa7
22:25:33:WU01:FS00:0xa7:       Core: Gromacs
22:25:33:WU01:FS00:0xa7:    Website: http://folding.stanford.edu/
22:25:33:WU01:FS00:0xa7:  Copyright: (c) 2009-2016 Stanford University
22:25:33:WU01:FS00:0xa7:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
22:25:33:WU01:FS00:0xa7:       Args: -dir 01 -suffix 01 -version 704 -lifeline 20920 -checkpoint 15 -np
22:25:33:WU01:FS00:0xa7:             48
22:25:33:WU01:FS00:0xa7:     Config: <none>
22:25:33:WU01:FS00:0xa7:************************************ Build *************************************
22:25:33:WU01:FS00:0xa7:    Version: 0.0.16
22:25:33:WU01:FS00:0xa7:       Date: Oct 31 2017
22:25:33:WU01:FS00:0xa7:       Time: 19:24:09
22:25:33:WU01:FS00:0xa7: Repository: Git
22:25:33:WU01:FS00:0xa7:   Revision: 2f0a8a3d0b0698be48154fe99a0216f289060932
22:25:33:WU01:FS00:0xa7:     Branch: master
22:25:33:WU01:FS00:0xa7:   Compiler: GNU 4.9.2
22:25:33:WU01:FS00:0xa7:    Options: -std=gnu++98 -O3 -funroll-loops
22:25:33:WU01:FS00:0xa7:   Platform: linux2 4.9.0-1-amd64
22:25:33:WU01:FS00:0xa7:       Bits: 64
22:25:33:WU01:FS00:0xa7:       Mode: Release
22:25:33:WU01:FS00:0xa7:       SIMD: avx_256
22:25:33:WU01:FS00:0xa7:************************************ System ************************************
22:25:33:WU01:FS00:0xa7:        CPU: Genuine Intel(R) CPU @ 2.00GHz
22:25:33:WU01:FS00:0xa7:     CPU ID: GenuineIntel Family 6 Model 62 Stepping 2
22:25:33:WU01:FS00:0xa7:       CPUs: 48
22:25:33:WU01:FS00:0xa7:     Memory: 15.62GiB
22:25:33:WU01:FS00:0xa7:Free Memory: 13.53GiB
22:25:33:WU01:FS00:0xa7:    Threads: POSIX_THREADS
22:25:33:WU01:FS00:0xa7: OS Version: 4.8
22:25:33:WU01:FS00:0xa7:Has Battery: false
22:25:33:WU01:FS00:0xa7: On Battery: false
22:25:33:WU01:FS00:0xa7: UTC Offset: 0
22:25:33:WU01:FS00:0xa7:        PID: 20924
22:25:33:WU01:FS00:0xa7:        CWD: /var/lib/fahclient/work
22:25:33:WU01:FS00:0xa7:         OS: Linux 4.8.0-53-generic x86_64
22:25:33:WU01:FS00:0xa7:    OS Arch: AMD64
22:25:33:WU01:FS00:0xa7:********************************************************************************
22:25:33:WU01:FS00:0xa7:Project: 8200 (Run 7, Clone 45, Gen 96)
22:25:33:WU01:FS00:0xa7:Unit: 0x00000077868b340258ed3e2625347f72
22:25:33:WU01:FS00:0xa7:Reading tar file core.xml
22:25:33:WU01:FS00:0xa7:Reading tar file frame96.tpr
22:25:33:WU01:FS00:0xa7:Digital signatures verified
22:25:33:WU01:FS00:0xa7:Calling: mdrun -cpo frame96.cpt -s frame96.tpr -x frame96.xtc -e frame96.edr -cpi frame96.cpt -cpt 15 -nt 48
22:25:35:WU01:FS00:0xa7:Steps: first=48000000 total=500000
22:25:36:WU01:FS00:0xa7:Completed 1 out of 500000 steps (0%)
22:26:03:WU01:FS00:0xa7:Completed 5000 out of 500000 steps (1%)
22:26:31:WU01:FS00:0xa7:Completed 10000 out of 500000 steps (2%)
22:26:59:WU01:FS00:0xa7:Completed 15000 out of 500000 steps (3%)
22:27:26:WU01:FS00:0xa7:Completed 20000 out of 500000 steps (4%)
22:27:54:WU01:FS00:0xa7:Completed 25000 out of 500000 steps (5%)
22:28:22:WU01:FS00:0xa7:Completed 30000 out of 500000 steps (6%)
22:28:49:WU01:FS00:0xa7:Completed 35000 out of 500000 steps (7%)
22:29:17:WU01:FS00:0xa7:Completed 40000 out of 500000 steps (8%)
22:29:44:WU01:FS00:0xa7:Completed 45000 out of 500000 steps (9%)
22:30:12:WU01:FS00:0xa7:Completed 50000 out of 500000 steps (10%)
~snip~
23:05:04:WU01:FS00:0xa7:Completed 430000 out of 500000 steps (86%)
23:05:32:WU01:FS00:0xa7:Completed 435000 out of 500000 steps (87%)
23:05:59:WU01:FS00:0xa7:Completed 440000 out of 500000 steps (88%)
23:06:26:WU01:FS00:0xa7:Completed 445000 out of 500000 steps (89%)
23:06:54:WU01:FS00:0xa7:Completed 450000 out of 500000 steps (90%)
23:07:21:WU01:FS00:0xa7:Completed 455000 out of 500000 steps (91%)
23:07:49:WU01:FS00:0xa7:Completed 460000 out of 500000 steps (92%)
23:08:16:WU01:FS00:0xa7:Completed 465000 out of 500000 steps (93%)
23:08:44:WU01:FS00:0xa7:Completed 470000 out of 500000 steps (94%)
23:09:11:WU01:FS00:0xa7:Completed 475000 out of 500000 steps (95%)
23:09:39:WU01:FS00:0xa7:Completed 480000 out of 500000 steps (96%)
23:10:06:WU01:FS00:0xa7:Completed 485000 out of 500000 steps (97%)
23:10:33:WU01:FS00:0xa7:Completed 490000 out of 500000 steps (98%)
23:11:01:WU01:FS00:0xa7:Completed 495000 out of 500000 steps (99%)
23:11:02:WU00:FS00:Connecting to 171.67.108.45:8080
23:11:02:WU00:FS00:Assigned to work server 134.139.52.2
23:11:02:WU00:FS00:Requesting new work unit for slot 00: RUNNING cpu:48 from 134.139.52.2
23:11:02:WU00:FS00:Connecting to 134.139.52.2:8080
23:11:03:ERROR:WU00:FS00:Exception: Server did not assign work unit
23:11:03:WU00:FS00:Connecting to 171.67.108.45:8080
23:11:04:WU00:FS00:Assigned to work server 134.139.52.2
23:11:04:WU00:FS00:Requesting new work unit for slot 00: RUNNING cpu:48 from 134.139.52.2
23:11:04:WU00:FS00:Connecting to 134.139.52.2:8080
23:11:04:ERROR:WU00:FS00:Exception: Server did not assign work unit
23:11:29:WU01:FS00:0xa7:Completed 500000 out of 500000 steps (100%)
23:11:30:WU01:FS00:0xa7:Saving result file ../logfile_01.txt
23:11:30:WU01:FS00:0xa7:Saving result file frame96.cpt
23:11:30:WU01:FS00:0xa7:Saving result file frame96.edr
23:11:30:WU01:FS00:0xa7:Saving result file frame96.xtc
23:11:31:WU01:FS00:0xa7:Saving result file frame96_prev.cpt
23:11:31:WU01:FS00:0xa7:Saving result file science.log
23:11:31:WU01:FS00:0xa7:Folding@home Core Shutdown: FINISHED_UNIT
23:11:31:WU01:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
23:11:31:WU01:FS00:Sending unit results: id:01 state:SEND error:NO_ERROR project:8200 run:7 clone:45 gen:96 core:0xa7 unit:0x00000077868b340258ed3e2625347f72
23:11:31:WU01:FS00:Uploading 6.18MiB to 134.139.52.2
23:11:31:WU01:FS00:Connecting to 134.139.52.2:8080
23:11:37:WU01:FS00:Upload 97.13%
23:11:38:WU01:FS00:Upload complete
23:11:38:WU01:FS00:Server responded WORK_ACK (400)
23:11:38:WU01:FS00:Final credit estimate, 11133.00 points
23:11:38:WU01:FS00:Cleaning up
23:12:03:WU00:FS00:Connecting to 171.67.108.45:8080
23:12:04:WU00:FS00:Assigned to work server 134.139.52.2
23:12:04:WU00:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
23:12:04:WU00:FS00:Connecting to 134.139.52.2:8080
23:12:04:ERROR:WU00:FS00:Exception: Server did not assign work unit
23:13:41:WU00:FS00:Connecting to 171.67.108.45:8080
23:13:41:WU00:FS00:Assigned to work server 134.139.52.2
23:13:41:WU00:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
23:13:41:WU00:FS00:Connecting to 134.139.52.2:8080
23:13:42:ERROR:WU00:FS00:Exception: Server did not assign work unit
23:16:18:WU00:FS00:Connecting to 171.67.108.45:8080
23:16:19:WU00:FS00:Assigned to work server 134.139.52.2
23:16:19:WU00:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
23:16:19:WU00:FS00:Connecting to 134.139.52.2:8080
23:16:19:ERROR:WU00:FS00:Exception: Server did not assign work unit
23:20:32:WU00:FS00:Connecting to 171.67.108.45:8080
23:20:33:WU00:FS00:Assigned to work server 134.139.52.2
23:20:33:WU00:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
23:20:33:WU00:FS00:Connecting to 134.139.52.2:8080
23:20:33:ERROR:WU00:FS00:Exception: Server did not assign work unit
23:27:24:WU00:FS00:Connecting to 171.67.108.45:8080
23:27:24:WU00:FS00:Assigned to work server 134.139.52.2
23:27:24:WU00:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
23:27:24:WU00:FS00:Connecting to 134.139.52.2:8080
23:27:25:ERROR:WU00:FS00:Exception: Server did not assign work unit
23:38:30:WU00:FS00:Connecting to 171.67.108.45:8080
23:38:30:WU00:FS00:Assigned to work server 134.139.52.2
23:38:30:WU00:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
23:38:30:WU00:FS00:Connecting to 134.139.52.2:8080
23:38:31:ERROR:WU00:FS00:Exception: Server did not assign work unit
******************************* Date: 2017-11-30 *******************************
23:56:26:WU00:FS00:Connecting to 171.67.108.45:8080
23:56:27:WU00:FS00:Assigned to work server 134.139.52.2
23:56:27:WU00:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
23:56:27:WU00:FS00:Connecting to 134.139.52.2:8080
23:56:27:ERROR:WU00:FS00:Exception: Server did not assign work unit
00:25:29:WU00:FS00:Connecting to 171.67.108.45:8080
00:25:29:WU00:FS00:Assigned to work server 134.139.52.2
00:25:29:WU00:FS00:Requesting new work unit for slot 00: READY cpu:48 from 134.139.52.2
00:25:29:WU00:FS00:Connecting to 134.139.52.2:8080
00:25:30:ERROR:WU00:FS00:Exception: Server did not assign work unit
01:12:27:WU00:FS00:Connecting to 171.67.108.45:8080


That's the core actually working, failing to get a WU, getting one and then failing again. It goes on like this for another 12 hours before locking up the machine.
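
For anyone who wants to see how the client spaces out its retries, here is a small Python sketch that pulls the "Server did not assign work unit" lines out of a log and prints the gap between consecutive failures. The function names are my own, and the midnight handling assumes the log never skips a whole day; the sample lines are copied from the log above.

```python
import re

ERROR_TEXT = "Server did not assign work unit"
STAMP = re.compile(r"^(\d{2}):(\d{2}):(\d{2}):")


def failure_times(log_lines):
    """Return the time of each failed work request, in seconds.

    The log only carries HH:MM:SS, so a timestamp that goes backwards is
    treated as a midnight rollover and shifted forward by one day.
    """
    times = []
    day_offset = 0
    for line in log_lines:
        match = STAMP.match(line)
        if not match or ERROR_TEXT not in line:
            continue
        hours, minutes, seconds = (int(part) for part in match.groups())
        stamp = hours * 3600 + minutes * 60 + seconds + day_offset
        if times and stamp < times[-1]:
            day_offset += 86400
            stamp += 86400
        times.append(stamp)
    return times


def retry_gaps(log_lines):
    """Seconds between each failed request and the next one."""
    times = failure_times(log_lines)
    return [later - earlier for earlier, later in zip(times, times[1:])]


SAMPLE = [
    "23:12:04:ERROR:WU00:FS00:Exception: Server did not assign work unit",
    "23:13:41:WU00:FS00:Connecting to 171.67.108.45:8080",
    "23:13:42:ERROR:WU00:FS00:Exception: Server did not assign work unit",
    "23:16:19:ERROR:WU00:FS00:Exception: Server did not assign work unit",
    "23:20:33:ERROR:WU00:FS00:Exception: Server did not assign work unit",
    "23:27:25:ERROR:WU00:FS00:Exception: Server did not assign work unit",
    "23:38:31:ERROR:WU00:FS00:Exception: Server did not assign work unit",
    "******************************* Date: 2017-11-30 *******************************",
    "23:56:27:ERROR:WU00:FS00:Exception: Server did not assign work unit",
    "00:25:30:ERROR:WU00:FS00:Exception: Server did not assign work unit",
]

if __name__ == "__main__":
    print(retry_gaps(SAMPLE))
```

On this excerpt the gaps come out as 98, 157, 254, 412, 666, 1076 and 1743 seconds, so each wait is roughly 1.6 times the previous one.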

I've just restarted the client and it has picked up a WU, so I'll keep an eye on it. The other issue I have is that if the servers don't respond quickly with a WU, the machine locks up and then just idles for hours until I'm back from work, but that's another issue for another time.
Nathan_P
 
Posts: 1378
Joined: Wed Apr 01, 2009 9:22 pm
Location: Jersey, Channel islands
