problems with all SMP servers

kasson
Pande Group Member
Posts: 1459
Joined: Thu Nov 29, 2007 9:37 pm

Re: problems with all SMP servers

Post by kasson »

One thing to note is that most of the SMP uploads and downloads (especially uploads) are substantially larger than for GPU2, so issues might come up with SMP but not GPU.

In any case, sorry you're having problems! I wish I had more suggestions here...
eliot1785
Posts: 78
Joined: Sat Dec 15, 2007 11:51 pm
Location: Cambridge, MA

Re: problems with all SMP servers

Post by eliot1785 »

UPDATE: My SMP client just downloaded a new WU and is working on it now. I don't know what happened, because I didn't change anything on my computer. The only new thing is that I accidentally started two SMP instances at once, then closed both and started another, but I don't see how that would have an effect. I guess it was either something server-side or with my ISP.

I still have not been able to submit my completed WU, however. Here is an updated log:

# Windows SMP Console Edition #################################################
###############################################################################

                       Folding@Home Client Version 6.23 Beta R1

                          http://folding.stanford.edu

###############################################################################
###############################################################################

Launch directory: C:\Program Files\folding_older
Executable: C:\Program Files\folding_older\Folding@home-Win32-x86.exe
Arguments: -verbosity 9 -smp -deino 

[01:01:32] - Ask before connecting: No
[01:01:32] - User name: Stephen_Eliot_Dewey (Team 165)
[01:01:32] - User ID: 1980460E6EB77A51
[01:01:32] - Machine ID: 1
[01:01:32] 
[01:01:32] Loaded queue successfully.
[01:01:32] - Preparing to get new work unit...
[01:01:32] + Attempting to get work packet
[01:01:32] - Autosending finished units... [December 19 01:01:32 UTC][01:01:32] - Will indicate memory of 3581 MB

[01:01:32] Trying to send all finished work units
[01:01:32] Project: 2665 (Run 2, Clone 448, Gen 77)
[01:01:32] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 7, Stepping: 6
[01:01:32] - Connecting to assignment server


[01:01:32] Connecting to http://assign.stanford.edu:8080/
[01:01:32] + Attempting to send results [December 19 01:01:32 UTC]
[01:01:32] - Reading file work/wuresults_08.dat from core
[01:01:32]   (Read 22288943 bytes from disk)
[01:01:32] Connecting to http://171.64.65.64:8080/
[01:01:33] Posted data.
[01:01:33] Initial: 40AB; - Successful: assigned to (171.64.65.64).
[01:01:33] + News From Folding@Home: Welcome to Folding@Home
[01:01:33] Loaded queue successfully.
[01:01:33] Connecting to http://171.64.65.64:8080/
[01:01:40] Posted data.
[01:01:40] Initial: 0000; - Receiving payload (expected size: 4721125)
[01:03:29] - Downloaded at ~42 kB/s
[01:03:29] - Averaged speed for that direction ~93 kB/s
[01:03:29] + Received work.
[01:03:29] + Closed connections
[01:03:29] 
[01:03:29] + Processing work unit
[01:03:29] Work type a1 not eligible for variable processors
[01:03:29] Core required: FahCore_a1.exe
[01:03:29] Core found.
[01:03:29] Working on queue slot 09 [December 19 01:03:29 UTC]
[01:03:29] + Working ...
[01:03:29] - Calling 'mpiexec -np 4 -channel shm -env MPICH_USE_SMP_OPTIMIZATIONS 1 -host 127.0.0.1 FahCore_a1.exe -dir work/ -suffix 09 -checkpoint 10 -verbose -lifeline 4952 -version 623'

[01:03:32] 
[01:03:32] *------------------------------*
[01:03:32] Folding@Home Gromacs SMP Core
[01:03:32] Version 1.76 (February 23, 2008)
[01:03:32] 
[01:03:32] Preparing to commence simulation
[01:03:32] - Ensuring status. Please wait.
[01:03:39] - Starting from initial work packet
[01:03:39] 
[01:03:39] Project: 2665 (Run 3, Clone 775, Gen 77)
[01:03:39] 
[01:03:39] Assembly optimizations on if available.
[01:03:39] Entering M.D.
[01:04:01] al work packet
[01:04:01] 
[01:04:01] Project: 2665 (Run 3, Clone 775, Gen 77)
[01:04:01] 
[01:04:06] 65 (Run 3, Clone 775, Gen 77)
[01:04:06] 
[01:04:08] Entering M.D.
[01:04:17] GG in water
[01:04:17] Writing local files
[01:04:17] cal files
[01:04:18] Extra SSE boost OK.
[01:04:28] cal files
[01:04:28] Completed 0 out of 250000 steps  (0 percent)
[01:05:55] - Couldn't send HTTP request to server
[01:05:55] + Could not connect to Work Server (results)
[01:05:55]     (171.64.65.64:8080)
[01:05:55] + Retrying using alternative port
[01:05:55] Connecting to http://171.64.65.64:80/
[01:07:01] - Couldn't send HTTP request to server
[01:07:01] + Could not connect to Work Server (results)
[01:07:01]     (171.64.65.64:80)
[01:07:01] - Error: Could not transmit unit 08 (completed December 18) to work server.
[01:07:01] - 13 failed uploads of this unit.


[01:07:01] + Attempting to send results [December 19 01:07:01 UTC]
[01:07:01] - Reading file work/wuresults_08.dat from core
[01:07:02]   (Read 22288943 bytes from disk)
[01:07:02] Connecting to http://171.67.108.25:8080/
[01:07:02] - Couldn't send HTTP request to server
[01:07:02]   (Got status 503)
[01:07:02] + Could not connect to Work Server (results)
[01:07:02]     (171.67.108.25:8080)
[01:07:02] + Retrying using alternative port
[01:07:02] Connecting to http://171.67.108.25:80/
[01:07:02] - Couldn't send HTTP request to server
[01:07:02]   (Got status 503)
[01:07:02] + Could not connect to Work Server (results)
[01:07:02]     (171.67.108.25:80)
[01:07:02]   Could not transmit unit 08 to Collection server; keeping in queue.
[01:07:02] + Sent 0 of 1 completed units to the server
[01:07:02] - Autosend completed
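
For anyone who wants to pull the failed upload attempts out of a long log like the one above, here is a minimal sketch in Python. It assumes only the line shapes visible in this particular v6.23 log (a "Could not connect to Work Server" line followed by a line holding the address, "(Got status NNN)" lines, and the "N failed uploads of this unit." line). The function name and the regular expressions are my own and hypothetical, not part of the Folding@Home client, and other client versions may format these lines differently.

```python
import re

# Patterns are assumptions based solely on the log lines shown in this post.
TIMESTAMPED = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s*(.*)$")
ADDRESS = re.compile(r"^\((\d{1,3}(?:\.\d{1,3}){3}:\d+)\)$")
STATUS = re.compile(r"^\(Got status (\d+)\)$")
FAILED_COUNT = re.compile(r"^- (\d+) failed uploads of this unit\.$")


def summarize_upload_failures(log_text):
    """Collect connection failures, HTTP statuses, and the failed-upload count."""
    failures = []          # (timestamp, "ip:port") for each failed connection
    statuses = []          # HTTP status codes reported by the client
    failed_uploads = None  # last "N failed uploads" figure seen, if any
    awaiting_address = False

    for raw in log_text.splitlines():
        match = TIMESTAMPED.match(raw.strip())
        if match is None:
            continue  # skip banner lines and blank lines
        stamp, message = match.group(1), match.group(2).strip()

        if "Could not connect to Work Server" in message:
            # In this log the address follows on the next timestamped line.
            awaiting_address = True
            continue

        address = ADDRESS.match(message)
        if address and awaiting_address:
            failures.append((stamp, address.group(1)))
            awaiting_address = False
            continue

        status = STATUS.match(message)
        if status:
            statuses.append(int(status.group(1)))

        count = FAILED_COUNT.match(message)
        if count:
            failed_uploads = int(count.group(1))

    return {
        "failures": failures,
        "statuses": statuses,
        "failed_uploads": failed_uploads,
    }


if __name__ == "__main__":
    # A few lines copied from the log above.
    sample = "\n".join([
        "[01:05:55] + Could not connect to Work Server (results)",
        "[01:05:55]     (171.64.65.64:8080)",
        "[01:07:01] - 13 failed uploads of this unit.",
        "[01:07:02]   (Got status 503)",
    ])
    print(summarize_upload_failures(sample))
```

Saving the full log to a file and passing its contents to `summarize_upload_failures` should list each server and port that refused the upload, which makes it easier to see whether the failures are timeouts (as with 171.64.65.64) or explicit 503 responses (as with 171.67.108.25).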
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: problems with all SMP servers

Post by bruce »

eliot1785 wrote: UPDATE: My SMP client just downloaded a new WU and is working on it now. I don't know what happened, because I didn't change anything on my computer. The only new thing is that I accidentally started two SMP instances at once, then closed both and started another, but I don't see how that would have an effect. I guess it was either something server-side or with my ISP.

I still have not been able to submit my completed WU, however.
Starting two instances can cause all sorts of problems unless it's done exactly right, and shutting down those instances can cause other problems. Nevertheless, I'm hopeful that no damage was done, and your new WU is an encouraging sign.

When does the completed WU expire?
eliot1785
Posts: 78
Joined: Sat Dec 15, 2007 11:51 pm
Location: Cambridge, MA

Re: problems with all SMP servers

Post by eliot1785 »

@bruce et al.,
So, it looks like the problem was my ISP after all. A few hours ago I started seeing browser problems as well (the browser would claim it was offline, then load the page on refresh). Eventually I decided to drop the connection and try a USB broadband card (a.k.a. "PC card") from AT&T to supplement the WiFi in my apartment. I wasn't enthusiastic about the price, but I need reliable access for my job. As soon as I switched to the AT&T network, the SMP client was able to upload my WU without any problems.

@kasson, I think your hunch was correct: the GPU2 files are smaller than the SMP files, so the GPU2 client was able to get its transfers through "in between" the connection issues.

I talked with some of the people we share our WiFi connection with, and they believe we have a "bad [physical] connection" to the cable network and that we're also far away from the hub, and that these two things together cause erratic performance. TCP/IP is supposed to make downloads and uploads robust, but I guess we might be experiencing packet loss bad enough that the server and client become unsure how, or whether, to proceed.
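
To put rough numbers on why a bigger file is more exposed to a flaky link, here is a back-of-the-envelope sketch in Python. It is a toy model, not a description of how TCP or the client actually behaves: it assumes the connection drops at random with a constant average rate (exponentially distributed gaps between drops) and that a single drop kills the transfer. The file sizes and the ~42 kB/s figure come from my log; the 300-second average gap between drops is purely hypothetical, and I am assuming the upload ran at the same speed as the download, which the log does not show.

```python
import math


def transfer_seconds(size_bytes, rate_bytes_per_s):
    """Time needed to move size_bytes at a steady rate."""
    return size_bytes / rate_bytes_per_s


def completion_probability(size_bytes, rate_bytes_per_s, mean_seconds_between_drops):
    """Chance that no drop occurs during the transfer.

    Toy model: drops arrive as a Poisson process, so the probability of
    seeing none in a window of T seconds is exp(-T / mean_gap).
    """
    duration = transfer_seconds(size_bytes, rate_bytes_per_s)
    return math.exp(-duration / mean_seconds_between_drops)


# Sizes taken from the log above.
DOWNLOAD_BYTES = 4721125    # "expected size: 4721125"
UPLOAD_BYTES = 22288943     # "Read 22288943 bytes from disk"

# Assumptions, not measurements:
RATE_BYTES_PER_S = 42_000   # ~42 kB/s reported for the download; upload speed assumed equal
MEAN_GAP_SECONDS = 300.0    # hypothetical average time between connection drops

if __name__ == "__main__":
    for label, size in (("download", DOWNLOAD_BYTES), ("upload", UPLOAD_BYTES)):
        seconds = transfer_seconds(size, RATE_BYTES_PER_S)
        chance = completion_probability(size, RATE_BYTES_PER_S, MEAN_GAP_SECONDS)
        print(f"{label}: {seconds:.0f} s on the wire, P(no drop) = {chance:.2f}")
```

Whatever value is plugged in for the gap between drops, the upload needs several times longer on the wire than the download, so under this model its chance of finishing in one piece is always lower. That matches what I saw: the new WU came down fine while the completed one kept failing to go up.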

At least, that's my non-network-admin attempt to diagnose the situation. Sorry to bother you folks about it, but the good news is that it doesn't seem to be a problem at Stanford after all.