Problem with QRB

Moderators: Site Moderators, FAHC Science Team

Prelude514
Posts: 19
Joined: Sun Feb 05, 2012 5:19 am
Location: Montreal, Canada

Re: Problem with QRB

Post by Prelude514 »

Hi Bruce,

As toTOW pointed out, it seems to be happening with all my machines, both local and remote. All are using the same user name and pass key. I believe, referring to the stats graph I posted earlier, that I've had the problem since January 15th.
Prelude514
Posts: 19
Joined: Sun Feb 05, 2012 5:19 am
Location: Montreal, Canada

Re: Problem with QRB

Post by Prelude514 »

toTOW wrote:I think you can request another passkey by entering your information again on : can't post links yet*

If you enter the same information as before, I guess it will give you the same passkey, so you will be able to double check what you entered in each client (be careful of unwanted spaces or hidden characters).

And the tool I used is only available to mods.
Thanks, figured it would be only for mods. I'll try to avoid changing my pass key for now, as I'd like to avoid the hassle of reconfiguring all my clients.

Thanks!
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Problem with QRB

Post by bruce »

Some of your WUs are getting bonus credit. This is the most recent one I can find from one of your clients:

Your WU (P8004 R116 C4 G37) was added to the stats database on 2012-01-14 17:10:22 for 823.98 points of credit.

Many others are not. As of right now, I don't see the pattern.
Prelude514
Posts: 19
Joined: Sun Feb 05, 2012 5:19 am
Location: Montreal, Canada

Re: Problem with QRB

Post by Prelude514 »

That's in line with what I think is happening, i.e. no QRB on or after 01/15. The WU you posted was completed on 01/14. Can you see any other returned WU with a bonus after the 15th?
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Problem with QRB

Post by bruce »

I see nothing submitted between the 14th and the 20th. (That doesn't mean there wasn't anything -- Only that if a client DID submit something in that window, it has completed another WU since then. Using this method, I can only see the most recent WU from each client.)

I guess you've spotted the pattern. On about the 15th, something happened to your completion rate. Those submitted since then have not been getting a bonus.

It's a lot of work to check a large number of individual WUs, but I did spot-check a few and I see quite a few which were completed within the timeout and got no bonus. I can't tell if you've been assigned WUs that were lost, but if they exceed the 20% threshold, that's the explanation.
Prelude514
Posts: 19
Joined: Sun Feb 05, 2012 5:19 am
Location: Montreal, Canada

Re: Problem with QRB

Post by Prelude514 »

I see. Thanks for all the hard work so far. All of my machines run 24/7, so they definitely returned work between the 14th and the 20th.

When you say "WUs that were lost" do you mean they crashed on my machine and weren't completed? Or that they possibly got ignored by Stanford's servers? Guess it could be either, but I can't see anything in any of my logs that would indicate a client crashing and returning a WU early.

Have we exhausted all options, with regards to solving my problem? If so, how should I proceed? Continue folding as I'm set up and forget about the bonus points?

Thanks again,

Phil
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Problem with QRB

Post by bruce »

I can't clearly define "WUs that were lost". If a client requests a WU and it doesn't come back, that's what I call "lost".

Some WUs do run into errors and a report is made to the server.
Some WUs do run into errors and the client is unable to return anything to the server.
Some WUs are downloaded and then a client is uninstalled, discarding data files.
Some donors discard WUs they don't like. (I'm not making any insinuations about whether you've ever done that or not.)

All of those count against the 80% requirement, whether there was anything you could do about it or not. Anything you can do to reduce the number of lost WUs is a good investment of your time.

Also, all WUs that are simply returned after the timeout are counted against the 80% requirement. Presumably there IS something you can do about that but those are "late" WUs, not "lost" WUs.

I wouldn't call the first on the list (where a report IS made to the server) either "lost" or "late" but rather "errors", some of which might be caused randomly by "bad WUs" and others which are caused by other factors, including excessive overclocking, improper installation, hardware failures, etc. Bad hardware or improper client installation can contribute to any category and you're responsible for fixing that sort of error.
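
As a rough illustration of the 80% requirement described above, here is a minimal Python sketch. The thread does not give the exact formula Stanford's servers use, so everything here is an assumption: the `PasskeyHistory` name and its fields are hypothetical, the rate is taken as on-time returns divided by all assigned WUs, and the cutoff is treated as inclusive.

```python
from dataclasses import dataclass

BONUS_THRESHOLD = 0.80  # the "80% requirement" from the thread


@dataclass
class PasskeyHistory:
    """Hypothetical per-passkey tallies; field names are illustrative only."""
    on_time: int  # returned successfully before the timeout
    errors: int   # failed, but an error report reached the server
    lost: int     # assigned and never returned
    late: int     # returned after the timeout

    @property
    def assigned(self) -> int:
        return self.on_time + self.errors + self.lost + self.late

    def completion_rate(self) -> float:
        # Assumption: every assigned WU that was not returned on time
        # counts against the passkey, whatever the cause.
        if self.assigned == 0:
            return 0.0
        return self.on_time / self.assigned

    def qualifies_for_bonus(self) -> bool:
        # Assumption: the cutoff is inclusive (>= 80%).
        return self.completion_rate() >= BONUS_THRESHOLD


# Made-up numbers: 30 of 100 assigned WUs were errors, lost, or late.
history = PasskeyHistory(on_time=70, errors=5, lost=15, late=10)
print(f"{history.completion_rate():.0%}")  # 70%
print(history.qualifies_for_bonus())       # False
```

The point of the sketch is that a short burst of lost or late WUs lowers the ratio for the whole passkey, not just for the machine that dropped them.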
Joe_H
Site Admin
Posts: 7870
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Problem with QRB

Post by Joe_H »

Your problem seeming to start on the 15th rang a bell. That was the weekend the Project 8001, 8004 and 8011 WUs had their deadlines and k-factors adjusted because they were getting too high a bonus. Were your machines doing a lot of those then? There were some reports of persons having WUs dropped as they could no longer be finished within the shortened deadlines measured in hours. The deadlines were made longer within a couple of days. If you had a lot of that happening, that could have put your completion percentage low. I might be reaching a bit; that temporary loss should have been made up by now and got you back over 80%. But that might be something to look for in your logs.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Problem with QRB

Post by bruce »

Joe_H wrote:Your problem seeming to start on the 15th rang a bell. That was the weekend the Project 8001, 8004 and 8011 WUs had their deadlines and k-factors adjusted because they were getting too high a bonus. Were your machines doing a lot of those then? There were some reports of persons having WUs dropped as they could no longer be finished within the shortened deadlines measured in hours. The deadlines were made longer within a couple of days. If you had a lot of that happening, that could have put your completion percentage low. I might be reaching a bit; that temporary loss should have been made up by now and got you back over 80%. But that might be something to look for in your logs.
That's a useful guess, but there's some additional information that would also be useful.

Those projects had deadlines that were consistent with SMP but they were too short if the WU was assigned to a Uniprocessor client. The Uni client has every right to expect longer deadlines while the SMP client should be able to meet tighter deadlines. This is only a problem for WUs that can be assigned to EITHER the SMP or the Uni client, and the deadlines were extended to be consistent with that requirement.

In other words, the question revolves around the number of SMP or Uni clients that Prelude514 was running. My data shows which WUs were completed, but I can't tell the number of cores that the client reported to the assignment server.
Prelude514
Posts: 19
Joined: Sun Feb 05, 2012 5:19 am
Location: Montreal, Canada

Re: Problem with QRB

Post by Prelude514 »

bruce wrote:I can't clearly define "WUs that were lost". If a client requests a WU and it doesn't come back, that's what I call "lost".

Some WUs do run into errors and a report is made to the server.
Some WUs do run into errors and the client is unable to return anything to the server.
Some WUs are downloaded and then a client is uninstalled, discarding data files.
Some donors discard WUs they don't like. (I'm not making any insinuations about whether you've ever done that or not.)

All of those count against the 80% requirement, whether there was anything you could do about it or not. Anything you can do to reduce the number of lost WUs is a good investment of your time.

Also, all WUs that are simply returned after the timeout are counted against the 80% requirement. Presumably there IS something you can do about that but those are "late" WUs, not "lost" WUs.

I wouldn't call the first on the list (where a report IS made to the server) either "lost" or "late" but rather "errors", some of which might be caused randomly by "bad WUs" and others which are caused by other factors, including excessive overclocking, improper installation, hardware failures, etc. Bad hardware or improper client installation can contribute to any category and you're responsible for fixing that sort of error.
Thanks for the explanation. I'm confident that all is in order on my end, hardware and configuration wise.
Joe_H wrote:Your problem seeming to start on the 15th rang a bell. That was the weekend the Project 8001, 8004 and 8011 WUs had their deadlines and k-factors adjusted because they were getting too high a bonus. Were your machines doing a lot of those then? There were some reports of persons having WUs dropped as they could no longer be finished within the shortened deadlines measured in hours. The deadlines were made longer within a couple of days. If you had a lot of that happening, that could have put your completion percentage low. I might be reaching a bit; that temporary loss should have been made up by now and got you back over 80%. But that might be something to look for in your logs.
You're right. That's all that my machines were getting. My Q9650 was averaging 20k PPD before the change, and about 6k PPD after the change. Those projects were all I received for a good couple of days, much to my annoyance. I didn't notice any dropped WUs on them though, as my machines were still completing them way before deadlines.

This is pretty discouraging. I bought a 3930K that should be here any day now, mostly to run F@H on. Knowing that it's only going to crank out about 10,000 PPD because of this issue is disheartening, to say the least.

Thanks for all of your suggestions and help so far, hopefully we can get this resolved.
Prelude514
Posts: 19
Joined: Sun Feb 05, 2012 5:19 am
Location: Montreal, Canada

Re: Problem with QRB

Post by Prelude514 »

Can someone confirm whether or not bonus was credited for the following units?

Project: 11061 (Run 0, Clone 2200, Gen 14)
Project: 11040 (Run 0, Clone 1128, Gen 4)

Curious, as I manually reentered the pass key before these WUs finished.
ChelseaOilman
Posts: 1037
Joined: Sun Dec 02, 2007 3:47 pm
Location: Colorado @ 10,000 feet

Re: Problem with QRB

Post by ChelseaOilman »

Hi Fever (team 32),
Your WU (P11061 R0 C2200 G14) was added to the stats database on 2012-02-05 19:09:00 for 380 points of credit.

Hi Fever (team 32),
Your WU (P11040 R0 C1128 G4) was added to the stats database on 2012-02-05 18:09:23 for 521 points of credit.
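
For anyone tracking a lot of these credit messages, a small parser may help. This is only a sketch that assumes the message wording shown in this thread stays the same; the `parse_credit_line` function and the field names are my own, not part of any F@H tool.

```python
import re
from datetime import datetime

# Pattern built from the credit messages quoted in this thread.
WU_LINE = re.compile(
    r"Your WU \(P(?P<project>\d+) R(?P<run>\d+) C(?P<clone>\d+) G(?P<gen>\d+)\) "
    r"was added to the stats database on "
    r"(?P<when>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"for (?P<points>\d+(?:\.\d+)?) points of credit\."
)


def parse_credit_line(line):
    """Return the WU fields as a dict, or None if the line does not match."""
    match = WU_LINE.search(line)
    if match is None:
        return None
    return {
        "project": int(match["project"]),
        "run": int(match["run"]),
        "clone": int(match["clone"]),
        "gen": int(match["gen"]),
        "when": datetime.strptime(match["when"], "%Y-%m-%d %H:%M:%S"),
        "points": float(match["points"]),
    }


line = ("Your WU (P11061 R0 C2200 G14) was added to the stats database "
        "on 2012-02-05 19:09:00 for 380 points of credit.")
wu = parse_credit_line(line)
print(wu["project"], wu["points"])  # 11061 380.0
```

Comparing the parsed points against each project's published base credit would show which WUs received a bonus; those base values are not listed in this thread, so that step is left out.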
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Problem with QRB

Post by bruce »

Prelude514 wrote:Curious, as I manually reentered the pass key before these WUs finished.
If that actually changed anything, then you would NOT get a bonus. Every WU must be uploaded using the same setup as it was downloaded. If the passkey had been wrong, changing it to the right value would have disqualified that WU although the next WU would be downloaded using the (new) passkey.

On the other hand, if your passkey has dropped below the 80% cutoff, you would also get no bonus until whenever you again exceed the 80% minimum. There's no way to know which reason disqualified those WUs.
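
The conditions above can be put together in a short Python sketch. It is not the server's actual logic: the `WorkUnit` record, the `bonus_disqualifiers` function, and the example passkeys and dates are all hypothetical, and the completion rate is assumed to be supplied from elsewhere.

```python
from dataclasses import dataclass
from datetime import datetime

CUTOFF = 0.80  # the 80% minimum mentioned in the thread


@dataclass
class WorkUnit:
    """Hypothetical record of one WU; field names are illustrative only."""
    passkey_at_download: str
    passkey_at_upload: str
    returned_at: datetime
    timeout_at: datetime


def bonus_disqualifiers(wu: WorkUnit, completion_rate: float) -> list[str]:
    """Return every reason this WU would miss the bonus (empty list = eligible)."""
    reasons = []
    if wu.passkey_at_download != wu.passkey_at_upload:
        reasons.append("passkey changed between download and upload")
    if completion_rate < CUTOFF:
        reasons.append("passkey is below the 80% cutoff")
    if wu.returned_at > wu.timeout_at:
        reasons.append("returned after the timeout")
    return reasons


# Made-up example: a stray trailing space was removed from the passkey mid-WU.
wu = WorkUnit(
    passkey_at_download="example-passkey ",
    passkey_at_upload="example-passkey",
    returned_at=datetime(2012, 2, 5, 19, 0),
    timeout_at=datetime(2012, 2, 8, 0, 0),
)
print(bonus_disqualifiers(wu, completion_rate=0.90))
```

From the donor's side only the final credit is visible, so a WU with more than one reason in that list looks identical to a WU with just one.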
Prelude514
Posts: 19
Joined: Sun Feb 05, 2012 5:19 am
Location: Montreal, Canada

Re: Problem with QRB

Post by Prelude514 »

ChelseaOilman wrote:Hi Fever (team 32),
Your WU (P11061 R0 C2200 G14) was added to the stats database on 2012-02-05 19:09:00 for 380 points of credit.

Hi Fever (team 32),
Your WU (P11040 R0 C1128 G4) was added to the stats database on 2012-02-05 18:09:23 for 521 points of credit.
Thanks Chelsea.
bruce wrote:
Prelude514 wrote:Curious, as I manually reentered the pass key before these WUs finished.
If that actually changed anything, then you would NOT get a bonus. Every WU must be uploaded using the same setup as it was downloaded. If the passkey had been wrong, changing it to the right value would have disqualified that WU although the next WU would be downloaded using the (new) passkey.

On the other hand, if your passkey has dropped below the 80% cutoff, you would also get no bonus until whenever you again exceed the 80% minimum. There's no way to know which reason disqualified those WUs.

I see. Either way, it was properly entered before I reentered it manually again. I was hoping the client.cfg may have been corrupted, as that might have explained something. I'm really at my wits' end here; everything is properly configured. Perhaps I just need a new passkey. I'm sifting through all logs again, looking for EUEs. So far, nada.

If this ends up being a problem on Stanford's end, will I be credited the bonus points that I've lost since Jan. 15th?

Will report back once I finish examining logs.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Problem with QRB

Post by bruce »

I don't see how it could be a problem on Stanford's end. Wouldn't there be a long line of people complaining about the same problem? Still, with no data, I'm not sure how to know whether it's a Stanford issue or an issue on your end.