GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Moderators: Site Moderators, PandeGroup

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby *hondo* » Mon Feb 15, 2010 3:07 pm

Same problem with at least 3 team members from team 51078. I'd normally complete between 5 and 10 WUs per day. My PC was Folding as normal yesterday; at 8:30 GMT I checked the number of completed WUs on the stats page, and I know for a fact that I saw 2 WUs go through, but now, approx. 23 F@H hours later, nothing has been added to my stats score total. Also, for the last 12 F@H hours my PC hasn't been able to download a single WU.

Come on, Stanford, tell us. WHAT IS GOING ON?
*hondo*
 
Posts: 127
Joined: Sat Mar 08, 2008 9:50 am
Location: England UK

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby noorman » Mon Feb 15, 2010 3:16 pm

toTOW wrote: I've just sent an email to Vijay ... I hope we'll get more information soon ...
.


I already sent him a PM about 171.67.108.21 almost 2 hours before you did ...

Not even read yet!

.
- stopped Linux SMP w. HT on i7-860@3.5 GHz
....................................
Folded since 10-06-04 till 09-2010
noorman
 
Posts: 553
Joined: Sun Dec 02, 2007 2:26 pm
Location: Belgium, near the International Sea-Port of Antwerp

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby Russ_64 » Mon Feb 15, 2010 3:17 pm

Today (15th) is a holiday in USA, so don't expect a quick answer or solution.

I shut down my clients yesterday and tried again earlier today; both my GPUs and my SMP client received new WUs today.
Russ_64
 
Posts: 149
Joined: Wed Dec 05, 2007 4:31 pm
Location: London, UK

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby Tobit » Mon Feb 15, 2010 3:21 pm

I've had work for the past few hours but I'm still having the original "Server has already received unit" problem.
Tobit
 
Posts: 675
Joined: Thu Apr 17, 2008 2:35 pm
Location: Manchester, NH USA

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby noorman » Mon Feb 15, 2010 3:25 pm

.

Just did another shutdown and restart of F@H and got a WU from server 171.64.65.71 instead of 171.67.108.21, to which all former requests for work were directed ...

It's a new P10105 job (first one for me)


.
noorman
 
Posts: 553
Joined: Sun Dec 02, 2007 2:26 pm
Location: Belgium, near the International Sea-Port of Antwerp

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby VijayPande » Mon Feb 15, 2010 3:27 pm

Thanks for the posts. It's early AM in California (that's why this went unfixed for several hours), but I think we've got everything going again. I've contacted Joe regarding this issue: there was a WS bug.

I've also balanced the weights so the other NV WS's can get into the mix better and improve the redundancy.
Prof. Vijay Pande, PhD
Departments of Chemistry, Structural Biology, and Computer Science
Chair, Biophysics
Director, Folding@home Distributed Computing Project
Stanford University
VijayPande
Pande Group Member
 
Posts: 2662
Joined: Fri Nov 30, 2007 6:25 am
Location: Stanford

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby chriskwarren » Mon Feb 15, 2010 3:40 pm

Thanks Dr. Pande. Can you confirm that the "Server has already received unit" problem means that our WUs were accepted by the server and not wasted? From our end it looks like the server rejects our work, and our WU gets wasted.
chriskwarren
 
Posts: 61
Joined: Sun Nov 30, 2008 2:13 am

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby ikerekes » Mon Feb 15, 2010 3:40 pm

Before I went to sleep last night, all of my GPUs were working (on units from 171.64.65.71). I woke up this morning to find all my GPUs down: the assignment server had reassigned all of them to 108.21, and it was dead in the water.
As of 9:31 PST I have restarted every GPU and they are all loaded: 3 from 65.71, 3 from 108.11 and one from 108.21 :)

Hurray!!! Apparently the assignment server needed the biggest kick. (Valentine's Day is over, so whoever did the kick doesn't have to feel bad.)
ikerekes
 
Posts: 170
Joined: Thu Nov 13, 2008 4:18 pm
Location: Calgary, Canada

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby tonic » Mon Feb 15, 2010 3:53 pm

I just got WUs on all 5 of my clients...so perhaps the problem is fixed.
tonic
 
Posts: 136
Joined: Sat Aug 02, 2008 4:05 am
Location: Seattle, WA

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby Pette Broad » Mon Feb 15, 2010 4:32 pm

Early days yet, but I've just uploaded 3 units (and got some more) :)

Pete
Pette Broad
 
Posts: 184
Joined: Mon Dec 03, 2007 9:38 pm
Location: Chester U.K

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby noorman » Mon Feb 15, 2010 4:38 pm

.

The redistribution is the main thing that helped the NetLoad on 171.67.108.21 come down drastically (from what I've seen).
It was extremely high compared to 'normal' figures; it's come down to the usual levels already!

I got a WU the minute I restarted my GPU-Client, after the news from another member (living in the U.K.) that he'd got a WU just before.
I thought that the problem for me was the high network delays I saw when pinging the server.
Since that U.K. member is across the Pond from the U.S. too, I tried my luck again and got a P10105 straight away :D

A fellow Folder (from the W. of the U.S.) got some WU's, 1 about every 2 hrs ...
I knew the server wasn't completely down because it had responded to a check by webbrowser with 'OK' and it was pingable all of the time.
Because of that, I sent a PM to Vijay. I only forgot to calculate the time difference and I also didn't know about the Public Holiday.

By the way, Uploading was no problem for my GPU-Client; that was done already, it just couldn't get new Work.


.
noorman
 
Posts: 553
Joined: Sun Dec 02, 2007 2:26 pm
Location: Belgium, near the International Sea-Port of Antwerp

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby Nathan_P » Mon Feb 15, 2010 5:11 pm

chriskwarren wrote:Thanks Dr. Pande. Can you confirm that the "Server has already received unit" problem means that our WUs were accepted by the server and not wasted? From our end it looks like the server rejects our work, and our WU gets wasted.


Yes i'd like to know as well, are we going to have to refold all those wu or is there a way to force the upload, i have about a dozen that the server says it has already received
Nathan_P
 
Posts: 1584
Joined: Wed Apr 01, 2009 9:22 pm
Location: Jersey, Channel islands

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby Tobit » Mon Feb 15, 2010 5:25 pm

Nathan_P wrote:Yes i'd like to know as well, are we going to have to refold all those wu or is there a way to force the upload, i have about a dozen that the server says it has already received

Unfortunately, there is nothing left to force. When the client receives the message that the server has already received the work unit, the slot in queue.dat the work was assigned to is "emptied". Some of us still have some wuresults.dat files. However, this problem had gone on for so long that many of mine were overwritten several times with newer work. The clients have only so many slots, and once the slot is cleared, there is no way to send any lingering work files back to Stanford.
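
To illustrate the slot behaviour described above, here is a minimal Python sketch. It is not the real client code and does not reflect the actual queue.dat format: the names `ClientQueue`, `Slot`, `assign`, `finish` and `handle_upload_reply`, the reply strings, and the default slot count are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional

NUM_SLOTS = 10  # assumption for illustration; the real slot count may differ


@dataclass
class Slot:
    wu_id: Optional[str] = None   # None means the slot is empty
    has_results: bool = False     # True once the WU has finished folding


class ClientQueue:
    """Hypothetical fixed-size ring of work slots, reused in order."""

    def __init__(self, num_slots: int = NUM_SLOTS) -> None:
        self.slots: List[Slot] = [Slot() for _ in range(num_slots)]
        self.next_index = 0

    def assign(self, wu_id: str) -> int:
        """Place new work in the next slot, overwriting whatever was there."""
        index = self.next_index
        self.slots[index] = Slot(wu_id=wu_id)
        self.next_index = (index + 1) % len(self.slots)
        return index

    def finish(self, index: int) -> None:
        """Mark the work in a slot as completed and waiting for upload."""
        self.slots[index].has_results = True

    def handle_upload_reply(self, index: int, reply: str) -> None:
        """Empty the slot on either reply, so nothing is left to resend."""
        if reply in ("accepted", "already_received"):
            self.slots[index] = Slot()

    def pending_uploads(self) -> List[str]:
        """Return the IDs of finished work the queue still knows about."""
        return [s.wu_id for s in self.slots
                if s.wu_id is not None and s.has_results]


queue = ClientQueue(num_slots=3)
slot = queue.assign("WU-A")
queue.finish(slot)
queue.handle_upload_reply(slot, "already_received")
print(queue.pending_uploads())  # the queue no longer lists WU-A
```

In this sketch the "already received" reply is treated exactly like a successful upload, which is why the queue has nothing left to retry even if result files still exist on disk.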
Tobit
 
Posts: 675
Joined: Thu Apr 17, 2008 2:35 pm
Location: Manchester, NH USA

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby Nathan_P » Mon Feb 15, 2010 5:30 pm

Tobit wrote:
Nathan_P wrote:Yes i'd like to know as well, are we going to have to refold all those wu or is there a way to force the upload, i have about a dozen that the server says it has already received

Unfortunately, there is nothing left to force. When the client receives the message that the server has already received the work unit, the slot in queue.dat the work was assigned to is "emptied". Some of us still have some wuresults.dat files. However, this problem had gone on for so long that many of mine were overwritten several times with newer work. The clients have only so many slots, and once the slot is cleared, there is no way to send any lingering work files back to Stanford.


That's a shame, as I still have the files in my work directories. At least I'm folding again, and seeing my GTX 275 beaten by my GTS 250 is indeed a sight to behold :lol:

I'd post a FahMon shot but I can't grab a screenie
Nathan_P
 
Posts: 1584
Joined: Wed Apr 01, 2009 9:22 pm
Location: Jersey, Channel islands

Re: GPU server status 171.67.108.21, 171.64.65.71,171.67.108.26

Postby VijayPande » Mon Feb 15, 2010 5:31 pm

Nathan_P wrote:
chriskwarren wrote:Thanks Dr. Pande. Can you confirm that the "Server has already received unit" problem means that our WUs were accepted by the server and not wasted? From our end it looks like the server rejects our work, and our WU gets wasted.


Yes i'd like to know as well, are we going to have to refold all those wu or is there a way to force the upload, i have about a dozen that the server says it has already received


It depends on the nature of the WS bug that's causing this, but I'm worried that these won't go back. I've escalated this bug to the highest level on our bug tracker and Joe's on it. I'll post more when we know more.

Note that, as far as we can tell so far, this is only an issue for people with multiple GPUs in the same box. If you're seeing it in some other case, please let us know.
VijayPande
Pande Group Member
 
Posts: 2662
Joined: Fri Nov 30, 2007 6:25 am
Location: Stanford
