Stats files and unique id's

Bok
Posts: 5
Joined: Tue Apr 08, 2008 4:35 pm

Stats files and unique id's

Post by Bok »

Hi,

I've started looking at doing all stats for Folding@Home at the free-dc stats site I run. I can parse the data and load it into MySQL, adding the rankings and so on, easily enough, but I run into problems because the user file contains non-unique IDs: http://fah-web.stanford.edu/daily_user_summary.txt

Now, I presume that a unique ID is used internally, just as in the team file. Is there any reason this ID is not in the user file at all, or am I missing something?
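To make the problem concrete, here is a minimal Python sketch that flags repeated entries. It assumes, without confirmation from this thread, that each data row is tab-separated as name, credit, work units, team number; the sample rows and the helper names (parse_rows, duplicate_keys) are made up for illustration.

```python
from collections import Counter

# Made-up sample rows in an ASSUMED layout: name, credit, work units, team.
SAMPLE = (
    "John\t1200\t10\t0\n"
    "Bok\t5000\t40\t100\n"
    "John\t300\t2\t0\n"
    "John\t800\t6\t32\n"
)

def parse_rows(text):
    """Return (name, credit, wus, team) tuples, skipping lines that do not fit."""
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 4:
            continue  # header or malformed line
        name, credit, wus, team = parts
        try:
            rows.append((name, int(credit), int(wus), int(team)))
        except ValueError:
            continue  # e.g. a column-title line
    return rows

def duplicate_keys(rows):
    """Map each (name, team) pair seen on more than one row to its row count."""
    counts = Counter((name, team) for name, _credit, _wus, team in rows)
    return {key: n for key, n in counts.items() if n > 1}

rows = parse_rows(SAMPLE)
dupes = duplicate_keys(rows)
print(dupes)
```

With a file like this, name plus team number cannot serve as a primary key, which is why a separate 'usernumber' column would help.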

Thanks

Bok
7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Hardware configuration: Intel i7-4770K @ 4.5 GHz, 16 GB DDR3-2133 Corsair Vengeance (black/red), EVGA GTX 760 @ 1200 MHz, on an Asus Maximus VI Hero MB (black/red), in a blacked out Antec P280 Tower, with a Xigmatek Night Hawk (black) HSF, Seasonic 760w Platinum (black case, sleeves, wires), 4 SilenX 120mm Case fans with silicone fan gaskets and silicone mounts (all black), a 512GB Samsung SSD (black), and a 2TB Black Western Digital HD (silver/black).
Location: Arizona

Re: Stats files and unique id's

Post by 7im »

All I see on the Team stats are the team number, team name, total score, and total WU. I don't see an "internally unique ID" on the Team list. Which number is the unique ID in your reference?



P.S. Welcome to the forum.
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
Bok
Posts: 5
Joined: Tue Apr 08, 2008 4:35 pm

Re: Stats files and unique id's

Post by Bok »

Thanks for the welcome!

teamnumber is unique in the team file. That's what's missing in the user file: a 'usernumber'.

Bok
ChelseaOilman
Posts: 1037
Joined: Sun Dec 02, 2007 3:47 pm
Location: Colorado @ 10,000 feet

Re: Stats files and unique id's

Post by ChelseaOilman »

He's probably talking about what look like duplicate user names. They aren't actually duplicates. Many people use an email address as their user name, and Pande Group's policy is to post only the part before the @ sign:
"If you choose your email address as your username, we will NOT print your full email address. Instead, just the part before the @ sign will be used in any stats listing, etc."
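A small Python sketch of how that policy produces look-alike names. The usernames and the display_name helper are invented for illustration; this is not the project's actual code.

```python
def display_name(username):
    """Keep only the part before the first '@', as the quoted policy describes."""
    return username.split("@", 1)[0]

# Hypothetical usernames, invented for illustration.
names = ["john@example.com", "john@example.org", "john"]
shown = [display_name(n) for n in names]
print(shown)
```

Three different usernames end up with the same displayed name, so the stats listing appears to contain duplicates.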
ChelseaOilman
Posts: 1037
Joined: Sun Dec 02, 2007 3:47 pm
Location: Colorado @ 10,000 feet

Re: Stats files and unique id's

Post by ChelseaOilman »

Bok wrote: teamnumber is unique in the teamfile, that's what's missing in the user file, a 'usernumber'
Can you quote an example? I'm not sure what you mean.
7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: Stats files and unique id's

Post by 7im »

I don't think there is a unique "usernumber", because anyone can use "John" as a user name. I could configure my client to submit work units under the username Bok if I wanted. Looking only at the user name, there is no way to distinguish the points you submit to that account from the points I submit to it.

As a result, some stats sites arbitrarily assign a record number to each user name and then display each user and team # combo separately. Other sites combine all the Johns into one account for display purposes. There is no better or easier way to do it, AFAIK. Handle it as you see fit.
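The two approaches can be sketched roughly as follows in Python. The rows, their (name, team, credit) layout, and the function names are assumptions made up for illustration, not how any particular stats site does it.

```python
from collections import defaultdict

# Made-up rows: (name, team, credit). Layout and values are assumptions.
rows = [
    ("John", 0, 1200),
    ("John", 0, 300),
    ("John", 32, 800),
    ("Bok", 100, 5000),
]

def assign_record_numbers(rows):
    """Approach 1: one arbitrary record number per (name, team) combo."""
    ids = {}
    totals = defaultdict(int)
    for name, team, credit in rows:
        key = (name, team)
        if key not in ids:
            ids[key] = len(ids) + 1  # arbitrary, site-local number
        totals[ids[key]] += credit
    return ids, dict(totals)

def lump_by_name(rows):
    """Approach 2: combine every row that shares a user name."""
    totals = defaultdict(int)
    for name, _team, credit in rows:
        totals[name] += credit
    return dict(totals)

ids, per_record = assign_record_numbers(rows)
lumped = lump_by_name(rows)
print(per_record)
print(lumped)
```

Neither approach can separate two donors who share both a name and a team, and the record numbers in the first approach stay stable only if the site stores the mapping between updates.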

I haven't had to deal with this personally, so I'll let someone else with more Stats experience comment further. Sorry.
Bok
Posts: 5
Joined: Tue Apr 08, 2008 4:35 pm

Re: Stats files and unique id's

Post by Bok »

Yup, I agree. That's what I've done in the past, but it's not really optimal.

This post is more to see whether the Folding@home admins would perhaps modify the file to contain a unique ID, to get around these issues and allow the various stats sites to be more accurate :) They must hold an internal ID; otherwise, how would the system itself know where to post points to if you changed your username to 'Bok'?

Do the admins read the boards at all?

Bok

P.S. Re-reading your post: if there were no way to distinguish them, shouldn't there be NO non-unique entries in the output file? The folding backend software would not be able to distinguish them either and would therefore lump them together. But there are non-unique entries, which makes me think there is an internal identifier somehow.
Last edited by Bok on Tue Apr 08, 2008 6:29 pm, edited 1 time in total.
7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: Stats files and unique id's

Post by 7im »

Bok wrote:yup, I agree that's what I've done in the past but it's not really optimum.

This post is more to see if the folding@home admins would perhaps modify the file to contain a unique id to get around these issues and allow the various stats sites to be more accurate :)

Do the admins read the boards at all ?

Bok
Yes, Pande Group members do read and post. Check Vijay's post count.

Questions. Your computer and my computer both submit a work unit to the user name Bok. Should that username get two unique IDs, or just one? Is there a benefit from either choice? If two IDs, how does Stanford know which one of those two computers is yours, and which one is mine?

The problem is that Stanford can't tell them apart. That's the big flaw. Stanford has no way to distinguish unique users who all use the same username of "John". With no way to distinguish between them, there is no good way to assign unique IDs.

Hence the addition of a Passkey number (unique identifier) in the v6 client. However, those are confidential, so that doesn't help your problem, even in the future when we all start using v6 clients.
Bok
Posts: 5
Joined: Tue Apr 08, 2008 4:35 pm

Re: Stats files and unique id's

Post by Bok »

7im wrote:
Yes, Pande Group members do read and post. Check Vijay's post count.

Questions. Your computer and my computer both submit a work unit to the user name Bok. Should that username get two unique IDs, or just one? Is there a benefit from either choice? If two IDs, how does Stanford know which one of those two computers is yours, and which one is mine?
Yes, ideally they would get different IDs, much as most other projects would assign. Either that, or prevent a user from registering a name that is already taken. (Is this the case? It doesn't appear to be, as far as I can tell.)
7im wrote: The problem is that Stanford can't tell them apart. That's the big flaw. Stanford has no way to distinguish unique users who all use the same username of "John". With no way to distinguish between them, there is no good way to assign unique IDs.
If that's the case, then yes, it's the big flaw, but see the comments I added to my previous post and you'll see why I thought they were holding an internal ID somehow.
7im wrote: Hence the addition of a Passkey number (unique identifier) in the v6 client. However, those are confidential, so that doesn't help your problem, even in the future when we all start using v6 clients.
True.

So for now, I'll just lump them together. :mrgreen:

Bok
7im
Posts: 10189
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: Stats files and unique id's

Post by 7im »

There are no easy answers. Lumping is the least troublesome, and is what most other Stats sites do.


EDIT: I also sent you a PM.
Bok
Posts: 5
Joined: Tue Apr 08, 2008 4:35 pm

Re: Stats files and unique id's

Post by Bok »

ok,

preliminary stats are at http://stats4.free-dc.org/stats.php?page=teams&proj=fah

I initially ran it against some old files while I was tweaking the scripts, then ran it against the current data, so the 'last update' probably covers a few weeks' worth of data.

This needs to run for a few days before the data looks consistent.

Should I filter out the default team and the anonymous/PS3 user?

Bok
VijayPande
Pande Group Member
Posts: 2058
Joined: Fri Nov 30, 2007 6:25 am
Location: Stanford

Re: Stats files and unique id's

Post by VijayPande »

People have nailed down most of the issues. I'll just elaborate that while the passkeys are private info, we could expose a unique identifier (different from the passkey) for 3rd party stats to use to distinguish donors that have given us passkeys.
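As one illustration of how a public identifier could differ from the passkey while still being tied to it, here is a Python sketch that derives the identifier with a keyed hash (HMAC-SHA256). This is purely an assumption for illustration; nothing in this thread says how such an identifier would actually be generated, and the function name, secret, and passkey strings are all hypothetical.

```python
import hashlib
import hmac

def public_donor_id(passkey, server_secret):
    """Derive a publishable ID from a private passkey using HMAC-SHA256."""
    digest = hmac.new(server_secret, passkey.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated for readability

# Hypothetical values, invented for illustration.
secret = b"server-side-secret"
id_a = public_donor_id("passkey-of-donor-a", secret)
id_b = public_donor_id("passkey-of-donor-b", secret)
print(id_a, id_b)
```

The same passkey always maps to the same identifier, so a stats site could use it as a key, while the passkey itself never appears in the published file.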