Re: Stats not updating?

Posted: Tue Sep 12, 2017 12:38 am
by drougnor
We got two hourly updates this time, the 2pm Eastern and 3pm Eastern, before it stopped again.

What I've been able to piece together about how this system works is that the flat file gets generated with the current timestamp in PST and is then populated with the current hour's worth of data (name, points total, WU total, and team number for the users; team number, team name, points total, and WU total for the teams). In this particular case the crash appears to happen after the blank files are generated with the timestamp, but before the data gets inserted. The system then sees a file, zips it up, and publishes both the empty file and the zipped archive.
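One common way to avoid publishing a timestamp-only file is to build it under a temporary name and only move it into place once every row has been written. The sketch below illustrates that idea; it is not the actual Folding@home stats code. The function name `publish_stats`, the tab-separated layout, and the file names are all assumptions made for illustration.

```python
import bz2
import os
import tempfile


def publish_stats(rows, timestamp, out_path):
    """Write the flat file and its .bz2 archive, publishing both only
    after every row has been written. Returns the number of data rows.

    The layout (timestamp line, then tab-separated rows) is assumed.
    """
    out_dir = os.path.dirname(os.path.abspath(out_path))
    # Build the file under a temporary name so readers never see a
    # timestamp-only file if the process dies partway through.
    fd, tmp_path = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    tmp_bz2 = tmp_path + ".bz2"
    try:
        count = 0
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(timestamp + "\n")
            for name, points, wus, team in rows:
                f.write(f"{name}\t{points}\t{wus}\t{team}\n")
                count += 1
        if count == 0:
            raise ValueError("no data rows; refusing to publish an empty file")
        # Compress the finished file, also under a temporary name.
        with open(tmp_path, "rb") as src, bz2.open(tmp_bz2, "wb") as dst:
            dst.write(src.read())
        # os.replace swaps each file into place in a single step.
        os.replace(tmp_bz2, out_path + ".bz2")
        os.replace(tmp_path, out_path)
        return count
    except BaseException:
        # On any failure, remove the partial files and keep the
        # previously published ones untouched.
        for leftover in (tmp_path, tmp_bz2):
            if os.path.exists(leftover):
                os.remove(leftover)
        raise
```

With this approach a crash between "create file" and "insert data" leaves the previous hour's files in place instead of replacing them with empty ones.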

However, since the process manages to complete at least a few times before the next error, this tells me that the database itself IS indeed whole and hale, especially since we can view the data through the interactive web portal on the official stats page.

Not that any of this is really relevant to most of you, but it was fun to piece together over time by examining error messages, figuring out what I was doing wrong, and working out what that meant for the backend system and how it was apparently constructed.

d

Re: Stats not updating?

Posted: Tue Sep 12, 2017 12:55 am
by ChristianVirtual
It is also relevant to me. How else can I send push notifications on stats changes when the stats file is empty ... (maybe I should at least push a note that stats are down and hope someone in PG has an Apple Watch ;-) )
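A consumer of the archive could guard against this by checking for data rows before pushing anything. This is a minimal sketch, assuming the archive is a bz2-compressed text file with a single timestamp line ahead of the data; the function names and the `header_lines` parameter are hypothetical.

```python
import bz2


def count_data_rows(archive_bytes, header_lines=1):
    """Count non-blank lines after the header in a bz2-compressed flat file."""
    text = bz2.decompress(archive_bytes).decode("utf-8", errors="replace")
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return max(0, len(lines) - header_lines)


def should_notify(archive_bytes, header_lines=1):
    """True when the update contains stats rows; False for an empty update."""
    return count_data_rows(archive_bytes, header_lines) > 0
```

When `should_notify` returns False, the app could send a "stats are down" note instead of a stats-change notification.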

Re: Stats not updating?

Posted: Tue Sep 12, 2017 4:00 am
by bruce
It's strange, but we're not seeing a single type of failure over the last few weeks.

In one case, http://fah-web.stanford.edu which contains /cgi-bin and other sub-files was off-line. Last time, the message was that the server was too busy. Now I can get to certain scripts that are inside of fah-web.stanford.edu/cgi-bin. I wonder how many DIFFERENT problems we're actually looking at.

Re: Stats not updating?

Posted: Tue Sep 12, 2017 4:28 am
by SombraGuerrero
That's the problem with HTTP 500s. Even when you can drill down to some level of error detail, they're still so generic that they can mean anything. It seems to me that there's almost a cascading failure in the file system: some sort of dirty process or cluster of files that's corrupting permissions, or page faulting, or locking down shared resources in an access-violation type of way. If it were a hardware failure, it wouldn't so consistently right itself with a simple reboot. That is, unless the storage devices are failing and there are literal read/write or file I/O problems happening. Do we know if the server has something like a RAID array set up? Something to at least provide file system redundancy if no other type of redundancy?

Re: Stats not updating?

Posted: Tue Sep 12, 2017 4:33 am
by QuintLeo
Hopefully they'll get the move to new hardware done SOON, since the old hardware has been showing WAY too much downtime the last couple weeks to be trusted.

Re: Stats not updating?

Posted: Tue Sep 12, 2017 8:43 am
by rwh202
And 'Trust' is what a lot of this is about. If donors can't trust that the project is being run and managed in a competent way, then what's the incentive of putting in their own effort?

Yes, I realise that the stats server isn't the same as the 'science', but it provides the most visible and understandable evidence of output to the majority of donors. Also, it is considered trivial to get right, so if something 'simple' can't be done right, what faith is there in the rest of the system?

To suggest that money or resources are the problem is not believable. A few days' effort and a couple of hundred dollars a month on Azure / AWS and I'm sure they could be up and running (plus, if they spoke nicely, I bet it wouldn't cost a penny).

Re: Stats not updating?

Posted: Tue Sep 12, 2017 1:26 pm
by drougnor
Another possible cause may be the open-ended nature of letting people pick names as long and as complicated as they want. I'm not sure what kind of error trapping is being done on name creation, but with the influx of names containing a long hash that are being used for the Folding-Coin mining operations, there is a slim chance that someone accidentally created a name that is crashing one of the processes.

I'm not saying it's absolutely LIKELY, but it is something I hope the Folding programmers have looked at and eliminated as a potential cause.
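For illustration, this is the kind of error trapping that would rule out a name breaking a delimited flat file. It is a sketch only: the `sanitize_name` function and the 100-character limit are hypothetical, and nothing here reflects how Folding@home actually validates names.

```python
import unicodedata

MAX_NAME_LENGTH = 100  # hypothetical limit, not a documented Folding@home value


def sanitize_name(raw, max_length=MAX_NAME_LENGTH):
    """Return a name that is safe to write as one field of a tab-delimited line."""
    # Replace control and other non-printing characters (tabs, newlines,
    # NULs, and so on) with spaces so they cannot split a field or a line.
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("C") else ch
        for ch in raw
    )
    # Collapse runs of whitespace, then cap the length.
    cleaned = " ".join(cleaned.split())
    return cleaned[:max_length].rstrip()
```

Ordinary names, including ones with a long hash in them, pass through unchanged as long as they fit within the limit.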

d

Re: Stats not updating?

Posted: Tue Sep 12, 2017 1:33 pm
by SombraGuerrero
That is feasible. I recently had one of our vendors at work have to change the data type of a field from like varchar to blob2 because users were simply trying to stuff too much in there.

Re: Stats not updating?

Posted: Tue Sep 12, 2017 2:31 pm
by Mike.Barr
There was an error accessing/using the database.
The Folding@home team is working to fix this issue.

At least 24 hours since this message posted...Team updates not being posted.

Mike

Re: Stats not updating?

Posted: Tue Sep 12, 2017 3:37 pm
by drougnor
Official page is back up. Waiting to see if my system picks up the flat file now.

**edit** So, as I said earlier, the official stats page is presenting, but now the .txt file is no longer being published. The .bz2, while still empty, is there, however.

Re: Stats not updating?

Posted: Tue Sep 12, 2017 4:54 pm
by EmperiuM
There is data in the .bz2 file, but mine, for instance, is the same as what I had at the Date of last work unit, 2017-09-08 09:06:40.
For me, it hasn't counted my points since that day :(

Re: Stats not updating?

Posted: Tue Sep 12, 2017 5:03 pm
by nsummy
On my stats page it says: Date of last work unit 2017-09-11 11:00:16

:(

Re: Stats not updating?

Posted: Tue Sep 12, 2017 5:51 pm
by drougnor
I can verify with A-B comparisons that the last 4 data sets/flat files the system produced contain identical data.
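An A-B comparison like this can be automated by hashing each file and flagging any snapshot that matches the one before it. The sketch below assumes each snapshot is available as raw bytes and that the first line may be a timestamp that should be ignored; the function names and the `skip_lines` parameter are hypothetical.

```python
import hashlib


def body_digest(data, skip_lines=0):
    """SHA-256 of the file content after skipping the first skip_lines lines."""
    parts = data.split(b"\n", skip_lines)
    body = parts[-1] if len(parts) > skip_lines else b""
    return hashlib.sha256(body).hexdigest()


def find_repeats(snapshots, skip_lines=0):
    """Given (label, bytes) pairs in publication order, return the labels
    of snapshots whose body is identical to the previous snapshot's."""
    repeats = []
    previous = None
    for label, data in snapshots:
        current = body_digest(data, skip_lines)
        if current == previous:
            repeats.append(label)
        previous = current
    return repeats
```

Setting `skip_lines=1` compares only the data rows, so two files with different timestamps but the same stats still count as identical.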

d

Re: Stats not updating?

Posted: Tue Sep 12, 2017 6:11 pm
by msultan
We are looking into the stats issue. Please give me a few hours to see what might be up. Our initial attempt to debug this didn't work.

Re: Stats not updating?

Posted: Tue Sep 12, 2017 6:20 pm
by drougnor
By all means take all the time you need. I'm much happier with a complete and correct solution that takes some time than a 'we think it's working, oooops, guess not' rushed patch job.