Missing points

Post by Bastiaan_NL » Thu Sep 10, 2020 4:48 pm

The updates on EOC seem very low compared to the HFM.NET "work unit history viewer". EOC and the Stanford stats page show almost the same total points at the time of each update (a 300k difference at the 9am update and 120k at the 12pm update).
I know there is some delay between units being uploaded and the results being shown, but I see the same thing when looking at older updates.
I am running 3 systems, and the combined hardware is good for about 7 million PPD on average. There has been some downtime (a few hours on one client), but not enough to explain a 3 million point difference in the last 24 hours.

Link to the Stanford stats page: https://stats.foldingathome.org/donor/2347
Link to the EOC stats page: https://folding.extremeoverclocking.com ... =&u=466987

Looking at the 9am update on EOC it shows 184k with 8 units.
If I look at the HFM.NET history, there is no possible 3 hour (or 8 unit) window (starting 30 minutes before the time of the EOC update, moving the window down a few units) where I see around 180k points.
There are units in those "windows" that are between 180k and 350k points alone.
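The window check described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `credit_in_window`, the update time, and every unit in the list are made-up placeholders, not data exported from HFM.NET.

```python
from datetime import datetime, timedelta

def credit_in_window(units, end, window=timedelta(hours=3)):
    """Sum the credit of units completed in the interval (end - window, end]."""
    start = end - window
    picked = [credit for finished, credit in units if start < finished <= end]
    return sum(picked), len(picked)

# Hypothetical history as (completion time, credit) pairs.
# All values are placeholders chosen for illustration.
units = [
    (datetime(2020, 9, 10, 5, 10), 180000),
    (datetime(2020, 9, 10, 6, 20), 210000),
    (datetime(2020, 9, 10, 7, 45), 320000),
    (datetime(2020, 9, 10, 8, 50), 95000),
]

update_time = datetime(2020, 9, 10, 9, 0)  # assumed time of the stats update

# Slide the window back in 30-minute steps to allow for reporting delay.
for shift_minutes in (0, 30, 60):
    end = update_time - timedelta(minutes=shift_minutes)
    total, count = credit_in_window(units, end)
    print(f"window ending {end:%H:%M}: {total} points from {count} units")
```

If none of the shifted windows comes close to the total reported for that update, the gap is probably not just a timing offset.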

I checked every system, they all have the passkey and the credit shown on the Advanced Control is the same (give or take a few points) as HFM.NET.
Every system has its own HFM.NET running and they are all linked together showing the same total on every one of them.

Last part of log from the 2080ti as an example, the logs on the other systems look the same:
Code:
13:24:24:WU01:FS01:0x22:Completed 1980000 out of 2000000 steps (99%)
13:25:06:WU01:FS01:0x22:Completed 2000000 out of 2000000 steps (100%)
13:25:06:WU01:FS01:0x22:Average performance: 39.8157 ns/day
13:25:09:WU01:FS01:0x22:Saving result file ..\logfile_01.txt
13:25:09:WU01:FS01:0x22:Saving result file checkpointState.xml
13:25:10:WU01:FS01:0x22:Saving result file positions.xtc
13:25:10:WU01:FS01:0x22:Saving result file science.log
13:25:10:WU01:FS01:0x22:Folding@home Core Shutdown: FINISHED_UNIT
13:25:11:WU01:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
13:25:11:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:14906 run:62 clone:0 gen:30 core:0x22 unit:0x0000002081d59d695f4ec9d81405b5ad
13:25:11:WU01:FS01:Uploading 10.54MiB to 129.213.157.105
13:25:17:WU01:FS01:Upload 33.22%
13:25:23:WU01:FS01:Upload 67.03%
13:25:29:WU01:FS01:Upload complete
13:25:29:WU01:FS01:Server responded WORK_ACK (400)
13:25:29:WU01:FS01:Final credit estimate, 179828.00 points
13:25:29:WU01:FS01:Cleaning up
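For comparing many units at once, the project, run, clone and gen values plus the credit estimate can be pulled out of a log like the one above with a small script. This is a rough sketch that assumes every log uses exactly the line format shown here; the function name `returned_units` and the regular expressions are my own and are not part of the client or of HFM.NET.

```python
import re

# Patterns written against the log lines shown above (an assumption about the format).
SEND_RE = re.compile(
    r"^\d\d:\d\d:\d\d:WU(\d+):FS(\d+):Sending unit results: .*?"
    r"project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)"
)
CREDIT_RE = re.compile(
    r"^(\d\d:\d\d:\d\d):WU(\d+):FS(\d+):Final credit estimate, ([\d.]+) points"
)

def returned_units(lines):
    """Pair each 'Sending unit results' line with its later credit estimate."""
    pending = {}  # (WU id, slot id) -> (project, run, clone, gen)
    done = []
    for line in lines:
        m = SEND_RE.match(line)
        if m:
            wu, fs, project, run, clone, gen = m.groups()
            pending[(wu, fs)] = tuple(int(v) for v in (project, run, clone, gen))
            continue
        m = CREDIT_RE.match(line)
        if m:
            time, wu, fs, credit = m.groups()
            prcg = pending.pop((wu, fs), None)
            if prcg is not None:
                done.append({"time": time, "prcg": prcg, "credit": float(credit)})
    return done

sample = [
    "13:25:11:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR "
    "project:14906 run:62 clone:0 gen:30 core:0x22 "
    "unit:0x0000002081d59d695f4ec9d81405b5ad",
    "13:25:29:WU01:FS01:Upload complete",
    "13:25:29:WU01:FS01:Final credit estimate, 179828.00 points",
]
print(returned_units(sample))
```

Each entry then carries the four numbers that identify the unit and the credit the client expected for it.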


What is going wrong here?


Hardware:
System 1: 3950x running 27 threads + RTX 2080ti
System 2: i3 9100F running 2 threads + 2 RX5700XT
System 3: i7 7700k running 4 threads + GTX 1660ti
Last edited by Bastiaan_NL on Thu Sep 10, 2020 6:11 pm, edited 2 times in total.
Bastiaan_NL
 
Posts: 19
Joined: Wed May 13, 2020 5:34 am
Location: Netherlands

Re: Missing points

Post by Joe_H » Thu Sep 10, 2020 5:47 pm

Looking at both of those links, the current difference is less than a million points: EOC 338,571,181 and official 339,356,503. Please be aware that EOC downloads the flat files once every 3 hours, and the time on the site is listed as CDT. The official stats are current as of the last hourly update. I am not seeing a problem; you cannot expect the two to be completely in sync.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Site Admin
 
Posts: 6593
Joined: Tue Apr 21, 2009 5:41 pm
Location: W. MA

Re: Missing points

Post by Bastiaan_NL » Thu Sep 10, 2020 5:58 pm

The problem is not the difference between EOC and the Stanford page (about 300k at the time of the update), but that the stats show 4.2 million points for the last 24 hours where it should be close to 6 million, going by what my systems have sent.
That is hard to add up manually from the HFM.NET history, but I stopped at 5+ million points because it was already clear that there was a difference.
A 180k update is impossible if every possible 3 hour window contains at least 500k points.

Re: Missing points

Post by Joe_H » Thu Sep 10, 2020 6:09 pm

The one example of a WU being returned does not help in finding a problem; it has been entered into the database. If you want to check further, anyone can look up WUs to see if they are in the database: https://apps.foldingathome.org/wu. If you find specific WUs that are not in the database, report them along with the server they were uploaded to. There could be a configuration or other problem that keeps the reports from reaching the database, as in this topic: viewtopic.php?f=18&t=36092.

Re: Missing points

Post by Bastiaan_NL » Thu Sep 10, 2020 6:17 pm

Thank you for that link Joe_H!
I'll look into it and I will report back.

[edit]
I'll continue in the thread linked above; it seems to be the same server and the same 1343x units that are missing.
I found 10 "missing" units in 2 days so far.

Re: Missing points

Post by JohnChodera » Fri Sep 11, 2020 4:56 am

We had an issue with the new pllwskifah1.mskcc.org server not returning stats credits until this afternoon. We've managed to fix this! Can you check again and see if this has been remedied?

~ John Chodera // MSKCC
JohnChodera
Pande Group Member
 
Posts: 406
Joined: Fri Feb 22, 2013 10:59 pm

Re: Missing points

Post by Bastiaan_NL » Fri Sep 11, 2020 9:06 am

Hey John,

The 11 units that were missing are showing now, thanks for that.
2 more units are now stuck waiting to upload to the server, same as everyone reporting in the "140.163.4.200" topic.

Re: Missing points

Post by JohnChodera » Sat Sep 12, 2020 3:55 am

Thanks for the update! I believe we've narrowed down the issue to an underperforming NFS mount, and are just trying to get through the backlog right now while we investigate a fix.

Thanks for your patience!

~ John Chodera // MSKCC

