128.143.48.226 : server reports problem with unit

Moderators: Site Moderators, FAHC Science Team

toTOW
Site Moderator
Posts: 6312
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: 128.143.48.226 : server reports problem with unit

Post by toTOW »

dnamechanic wrote:Currently my machines use the same ISP and the sneakernetted notebooks have the cloned "UserID" in the registry. In other words the registry "UserID" in the notebooks is identical to the "UserID" in the machine that downloads the work units.
Do you use a different Machine ID for each machine with the same User ID? That might be the problem that generates the issue with 128.143.48.226 ...

Folding@Home beta tester since 2002. Folding Forum moderator since July 2008.
dimilunatic
Posts: 23
Joined: Mon Mar 02, 2009 4:53 pm
Hardware configuration: =========RIG ONE=========
Processor: Intel(R) Core(TM)i5 CPU M 520 @ 2.40GHz
Memory: 6 GB
Operating System: Kubuntu 10.04
Kernel: 2.6.32-24-generic
=========RIG ONE=========

To be continued...
Location: Alexandroupolis, Greece

Re: 128.143.48.226 : server reports problem with unit

Post by dimilunatic »

I thought you only needed to have different machine IDs if you ran multiple clients on the same system. Should we also have different machine IDs for every different system?
toTOW
Site Moderator
Posts: 6312
Joined: Sun Dec 02, 2007 10:38 am
Location: Bordeaux, France

Re: 128.143.48.226 : server reports problem with unit

Post by toTOW »

As a general rule, each machine has its own unique User ID ... but when sneakernetting, each off-line machine has the same User ID as the connected machine. So you have to set a different Machine ID for each client that will be copied to the off-line machines:
Presumably machine A is still set to the default value of 1, and B and C can be 2 and 3.
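
The rule above (one shared User ID, a distinct Machine ID per copied client) can be checked mechanically. The following is only a minimal sketch: the client names, the User ID value, and the function `find_machine_id_conflicts` are hypothetical and are not part of the FAH client.

```python
from collections import defaultdict

def find_machine_id_conflicts(clients):
    """Return {(user_id, machine_id): [names]} for every pair used by more than one client.

    `clients` maps a hypothetical client name to a (user_id, machine_id) tuple.
    """
    seen = defaultdict(list)
    for name, identity in clients.items():
        seen[identity].append(name)
    # Keep only the identities that are shared by two or more clients.
    return {identity: names for identity, names in seen.items() if len(names) > 1}

# Hypothetical values: three clients sharing one cloned User ID.
clients = {
    "A": ("0x1234ABCD", 1),
    "B": ("0x1234ABCD", 2),
    "C": ("0x1234ABCD", 2),  # clashes with B
}
conflicts = find_machine_id_conflicts(clients)
print(conflicts)
```

An empty result means every client copy has its own (User ID, Machine ID) pair.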
Trotador
Posts: 32
Joined: Sun Feb 17, 2008 6:41 pm

Re: 128.143.48.226 : server reports problem with unit

Post by Trotador »

I've also been "sneakernetting" WUs for years without any issue, just copying the work folder and the queue and unitinfo files.

In my case, the sending client ID can be different from or equal to that of the client that folded the WU, more or less at random. It did not cause any problem in the past. ISPs are always different for sure. Yesterday I made several tries using a WiFi connection from the same notebook where some of the units were folded, with the same result, but I cannot be totally sure that the client IDs have not changed since the WU was folded (I used to mess up everything constantly :)).

After these WUs, I have not folded any other WU from the same 386X family so I do not know if it will happen again or not.
only4u
Posts: 7
Joined: Wed Mar 04, 2009 7:20 pm

Re: 128.143.48.226 : server reports problem with unit

Post by only4u »

toTOW wrote:
dnamechanic wrote:Currently my machines use the same ISP and the sneakernetted notebooks have the cloned "UserID" in the registry. In other words the registry "UserID" in the notebooks is identical to the "UserID" in the machine that downloads the work units.
Do you use a different Machine ID for each machine with the same User ID? That might be the problem that generates the issue with 128.143.48.226 ...
This is what I did. My folding system has 8 cores, so I created 8 mIDs. In the past, I didn't care about the mID because I could send a WU result with a different mID. :(
mrshirts
Pande Group Member
Posts: 54
Joined: Sat Apr 26, 2008 4:32 am

Re: 128.143.48.226 : server reports problem with unit

Post by mrshirts »

Thanks for the extra information. I'm doing some investigation with Joseph Coffland, the author of the new server code, and will get back to you as soon as I can.
dnamechanic
Posts: 16
Joined: Tue Dec 04, 2007 10:42 pm
Location: Dallas, Texas

Re: 128.143.48.226 : server reports problem with unit

Post by dnamechanic »

I notice that server 171.64.122.139 behaves in a similar manner to 128.143.48.226.

Yesterday, I tried to return a work unit to server 171.64.122.139 and received the now familiar:
- Server reports problem with unit.
Today, the work unit described below was downloaded on an Internet-connected machine and moved to a non-Internet-connected notebook computer. The work unit was folded, then moved back to an Internet-connected machine to send the results back to Stanford.

First, with this newly folded work unit, the Machine ID of the sending client was matched to the Machine ID of the download client.
[15:59:03] Trying to send all finished work units
[15:59:03] Project: 3798 (Run 32, Clone 3, Gen 30)


[15:59:03] + Attempting to send results [March 18 15:59:03 UTC]
[15:59:03] - Reading file work/wuresults_01.dat from core
[15:59:03] (Read 521456 bytes from disk)
[15:59:03] Connecting to http://171.64.122.139:8080/
[15:59:06] Posted data.
[15:59:06] Initial: 0000; - Uploaded at ~170 kB/s
[15:59:06] - Averaged speed for that direction ~170 kB/s
[15:59:06] + Results successfully sent
[15:59:06] Thank you for your contribution to Folding@Home.
[15:59:06] + Number of Units Completed: 1089
Then, using a copy of the original completely folded work unit (before any attempt to send), the sending Machine ID was changed to be different than the Machine ID on the download computer client:
[16:01:23] Attempting to return result(s) to server...
[16:01:23] Trying to send all finished work units
[16:01:23] Project: 3798 (Run 32, Clone 3, Gen 30)

[16:01:23] + Attempting to send results [March 18 16:01:23 UTC]
[16:01:23] - Reading file work/wuresults_01.dat from core
[16:01:23] (Read 521456 bytes from disk)
[16:01:23] Connecting to http://171.64.122.139:8080/
[16:01:25] Posted data.
[16:01:25] Initial: 0000; - Uploaded at ~255 kB/s
[16:01:25] - Averaged speed for that direction ~255 kB/s
[16:01:25] - Server reports problem with unit.
[16:01:25] + Sent 0 of 1 completed units to the server
[16:01:25] - Failed to send all units to server
Then, using a copy of the original completely folded work unit (before any attempt to send), the sending Machine ID was changed to be the same as the Machine ID on the download computer client:
[16:03:13] Trying to send all finished work units
[16:03:13] Project: 3798 (Run 32, Clone 3, Gen 30)

[16:03:13] + Attempting to send results [March 18 16:03:13 UTC]
[16:03:13] - Reading file work/wuresults_01.dat from core
[16:03:13] (Read 521456 bytes from disk)
[16:03:13] Connecting to http://171.64.122.139:8080/
[16:03:17] Posted data.
[16:03:17] Initial: 0000; - Uploaded at ~127 kB/s
[16:03:17] - Averaged speed for that direction ~127 kB/s
[16:03:17] - Server reports problem with unit.
[16:03:17] + Sent 0 of 1 completed units to the server
[16:03:17] - Failed to send all units to server
Previously, as I recall, one could resend a work unit and the receiving server would again present a message that the work unit was received (although credit would only be issued for one work unit).

Now, it appears that once an attempt to return a work unit has been made with non-matching download and upload client Machine IDs, then that work unit is no longer accepted at all.
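
The pattern in the three send attempts above can be pulled out of a client log with a short script. This is only a sketch: it assumes the log contains the `Project:` line and the two outcome lines exactly as they appear in the excerpts above, and the function name `summarize_send_attempts` is made up for illustration.

```python
import re

SUCCESS = "+ Results successfully sent"
FAILURE = "- Server reports problem with unit."
PROJECT = re.compile(r"Project: (\d+) \(Run (\d+), Clone (\d+), Gen (\d+)\)")

def summarize_send_attempts(log_text):
    """Return a list of (project, run, clone, gen, outcome) tuples, one per attempt."""
    attempts = []
    current = None  # the work unit named by the most recent "Project:" line
    for line in log_text.splitlines():
        match = PROJECT.search(line)
        if match:
            current = tuple(int(g) for g in match.groups())
        elif current is not None and SUCCESS in line:
            attempts.append(current + ("sent",))
            current = None
        elif current is not None and FAILURE in line:
            attempts.append(current + ("rejected",))
            current = None
    return attempts

# Lines trimmed from the three log excerpts above.
sample = """\
[15:59:03] Project: 3798 (Run 32, Clone 3, Gen 30)
[15:59:06] + Results successfully sent
[16:01:23] Project: 3798 (Run 32, Clone 3, Gen 30)
[16:01:25] - Server reports problem with unit.
[16:03:13] Project: 3798 (Run 32, Clone 3, Gen 30)
[16:03:17] - Server reports problem with unit.
"""
for attempt in summarize_send_attempts(sample):
    print(attempt)
```

Run against a full FAHlog, this lists which work units were accepted and which were rejected, which makes it easier to match each rejection to the Machine ID in use at the time.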

This could help explain why Trotador, myself, and others had been unsuccessful at sending in some work units (to server 128.143.48.226 ) even when the Machine IDs were changed to match.
Trotador wrote:I've tried with different IDs from different locations (i.e. ISPs) and always the same result already posted.
Probably, we had already tried with mis-matched IDs, then the server no longer accepted those WU from that contributor at all, even when Machine IDs were matched.

From thread entitled: Project: 3798 (Run 26, Clone 0, Gen 50), located at:

http://foldingforum.org/posting.php?mod ... 19&p=88382
toTOW wrote:p3798 is the project used to test the new server code ...
And, earlier in this current thread:
mrshirts wrote:...This server is using the newest code, and might have a few quirks that were not present in the older server code.

.... If so, is this different than behavior that was present before?....
Yes, it is different (see previous comments).
In the past..., it did not seem to matter where a work unit was downloaded and where it was completed. Credit was seen to be obtained if the username and team name matched (whether from an entirely different machine, different ISP, different machine ID #, or even a different registry UserID).
Trotador
Posts: 32
Joined: Sun Feb 17, 2008 6:41 pm

Re: 128.143.48.226 : server reports problem with unit

Post by Trotador »

Great!, I have left a bunch of 3798 WUs folding away to bring them back to home once finished for uploading... :roll:

So I will bring home the complete folders instead of just the work files, to make sure I use the same ID and client config when uploading.

You are right, dnamechanic, that my last tries were with WUs that had previously failed to upload.

Let's hope this helps to find out the problem.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: 128.143.48.226 : server reports problem with unit

Post by bruce »

In the past, the UserID/MachineID was not checked but it was always considered a possibility that it would be someday. That day may have come, but I don't have any official word.

The actual value used is a combination of the UserID and the MachineID so you probably have to confirm that both match. I originally wrote the sneakernetting instructions to allow for this possibility even though it was not enforced at the time.
only4u
Posts: 7
Joined: Wed Mar 04, 2009 7:20 pm

Re: 128.143.48.226 : server reports problem with unit

Post by only4u »

I reckon I have to spend more time with this change :(
VijayPande
Pande Group Member
Posts: 2058
Joined: Fri Nov 30, 2007 6:25 am
Location: Stanford

Re: 128.143.48.226 : server reports problem with unit

Post by VijayPande »

bruce wrote:In the past, the UserID/MachineID was not checked but it was always considered a possibility that it would be someday. That day may have come, but I don't have any official word.

The actual value used is a combination of the UserID and the MachineID so you probably have to confirm that both match. I originally wrote the sneakernetting instructions to allow for this possibility even though it was not enforced at the time.
This server is running the new server code, which has been rearchitected from scratch. In doing so, we made choices based on what makes the most sense, especially in terms of security of WUs, etc. This clean slate approach has increased reliability and made for much cleaner code. However, there may be some "gray area" issues like this which will now cause issues.

For those sneakernetting -- is there any way you can use the same machine IDs on your remote machines, so this is not a problem? In general, we have not supported this approach, and so while we will try to work out something that works for everyone, I should stress that we will be tightening up security to avoid various possible cheating schemes, trojans, etc., and issues like this may crop up for those using non-standard installs.
Prof. Vijay Pande, PhD
Departments of Chemistry, Structural Biology, and Computer Science
Chair, Biophysics
Director, Folding@home Distributed Computing Project
Stanford University
7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengence (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicon fan gaskets and silicon mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: 128.143.48.226 : server reports problem with unit

Post by 7im »

As I understand it, as long as you upload the completed work unit using the same SystemID (aka UserID, not fah user name) and MachineID as when it was downloaded, then there won't be a problem. It shouldn't matter what those settings are while the work unit is being folded. Only at download and upload.*
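
A sneakernetter could make that check explicit before every upload. The sketch below assumes the reading given above (only the UserID and MachineID in use at download and at upload matter); the `ClientIdentity` type, the `safe_to_upload` helper, and the ID values are hypothetical, and nothing here reflects how the server actually validates a unit.

```python
from typing import NamedTuple

class ClientIdentity(NamedTuple):
    user_id: str      # the registry "UserID" (SystemID), not the FAH user name
    machine_id: int

def safe_to_upload(download_identity, upload_identity):
    """True only if both UserID and MachineID match the values used at download."""
    return download_identity == upload_identity

# Hypothetical identity recorded when the work unit was downloaded.
downloaded_with = ClientIdentity(user_id="0x1234ABCD", machine_id=1)

print(safe_to_upload(downloaded_with, ClientIdentity("0x1234ABCD", 1)))  # same pair
print(safe_to_upload(downloaded_with, ClientIdentity("0x1234ABCD", 2)))  # Machine ID differs
```

Writing the download identity down next to each sneakernetted work folder would be enough to run this comparison by hand.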

Sneakernetters will need to become more meticulous now.


*The new server code may have become even more strict than this, but I can't imagine how it could be tracked any tighter than it was spec'd previously. Time will tell.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
stevei
Posts: 5
Joined: Fri Jan 25, 2008 12:08 am

Re: 128.143.48.226 : server reports problem with unit

Post by stevei »

I sneakernet, copying the full directory for FAH and the work folders. I get this once in a while... I checked and the results file is still there on a couple of machines.... I attempt the upload on the same machine that downloaded, so the machine ID should match. What else can I do?
Mactin
Posts: 222
Joined: Sun Dec 02, 2007 1:08 pm
Location: Côte-des-Neiges, Montréal, Québec

Re: 128.143.48.226 : server reports problem with unit

Post by Mactin »

dnamechanic wrote:I notice that server 171.64.122.139 behaves in a similar manner to 128.143.48.226.
I had similar problems with a few WUs from these same servers and did not write about it since I figured it was a sneakernetting error on my part (yes, it happens).

The problem I have is that clients are intermittently connected to the Internet. I might use the sneakernet to load a WU and, when completed, it will download without intervention.

The question now is :
Which servers are using the new server code so that we can avoid and/or manage the problem/feature/bug ?
stevei
Posts: 5
Joined: Fri Jan 25, 2008 12:08 am

Re: 128.143.48.226 : server reports problem with unit

Post by stevei »

I've found 4 units completed but unsent with the same error.

When I run qfix, it detects that the results do not match the queue entry, saying proj 0 run 0 clone 0 for all 4 finished WUs...

Should I just delete them and move on, or is there any way to fix this?

The error is "server reports problem with unit" and the server is 171.64.122.139 for all 4 units.