Project 7809 (7, 192, 16) sudden slow down, "about" error

Moderators: Site Moderators, FAHC Science Team

miranda822
Posts: 5
Joined: Tue Jan 08, 2013 6:02 am

Project 7809 (7, 192, 16) sudden slow down, "about" error

Post by miranda822 »

I have been folding the above WU since January 30th with only a few breaks due to a storm in my area, and the TPF has been less than 2 hours at worst. I was expecting it to be done tonight or tomorrow at the pace it had been on. But somewhere along the line while I was at work, the TPF jumped to 5 hours, 38 minutes.

Also, the "About Project" pane text changed to this:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator,
root@localhost and inform them of the time the error occurred,
and anything you might have done that may have
caused the error.</p>
<p>More information about this error may be available
in the server log.</p>
<hr>
<address>Apache/2.0.52 (CentOS) Server at fah-web.stanford.edu Port 80</address>
</body></html>
I have tried quitting the program and rebooting, but the problem returned once I started the program again post-reboot. It's still progressing, but I'd rather it go back to its speedier pace from before.

What went wrong? Please advise.
art_l_j_PlanetAMD64
Posts: 472
Joined: Sun May 30, 2010 2:28 pm

Re: P7809 (7, 192, 16) sudden slow down, "about" error

Post by art_l_j_PlanetAMD64 »

miranda822 wrote:I have been folding the above WU since January 30th with only a few breaks due to a storm in my area, and the TPF has been less than 2 hours at worst. I was expecting it to be done tonight or tomorrow at the pace it had been on. But somewhere along the line while I was at work, the TPF jumped to 5 hours, 38 minutes.

I have tried quitting the program and rebooting, but the problem returned once I started the program again post-reboot. It's still progressing, but I'd rather it go back to its speedier pace from before.

What went wrong? Please advise.
The FahCore_a4.exe process, which does the calculations for P7809, may be getting disrupted by another process using some of the CPU time. It takes only a few percent to disrupt the SMP FahCore calculations and slow them down.

To observe the %CPU Usage for each 'Image', in much finer detail than the 'Processes' tab in Task Manager, please do this:
  • Press Ctrl+Shift+Esc, or right-click on an empty area of the taskbar and left-click on 'Start Task Manager'.
  • Select the 'Performance' tab.
  • Click on 'Resource Monitor' (near the bottom left), and select the 'Overview' tab.
Now you can see the 'Average CPU %' usage for each Image. FahCore_a4.exe should be getting 98-100% of the CPU time. If it is less, you may need to use (for example) smp:6 instead of smp:8 on the SMP WUs, especially if the GPU slots are folding P807x WUs. Please see my tests that I ran here.
Last edited by art_l_j_PlanetAMD64 on Mon Feb 04, 2013 7:21 am, edited 2 times in total.
art_l_j_PlanetAMD64
Over 1.04 Billion Total Points
Over 185,000 Work Units
Over 3,800,000 PPD
Overall rank (if points are combined) 20 of 1721690
In memory of my Mother May 12th 1923 - February 10th 2012
art_l_j_PlanetAMD64
Posts: 472
Joined: Sun May 30, 2010 2:28 pm

Re: P7809 (7, 192, 16) sudden slow down, "about" error

Post by art_l_j_PlanetAMD64 »

Also, for the '500 Internal Server Error', please see this topic:
129.74.85.15 giving 500 server error for project description
miranda822
Posts: 5
Joined: Tue Jan 08, 2013 6:02 am

Re: Project 7809 (7, 192, 16) sudden slow down, "about" erro

Post by miranda822 »

Thanks for your help. I went through the steps, and it's not getting quite that much on average (mid 80s to early 90s), but it's using the majority of the CPU time. I had the SMP set to -1 by default, and the Resource Monitor says it's using 6 threads. I tried manually setting it to thus, but it wouldn't let me because the smp in the WU's description is smp:2.

I didn't know if the error was a related issue or not, let alone that there was a program upgrade available. Should I quit the program and upgrade, or will I lose my current WU?
miranda822
Posts: 5
Joined: Tue Jan 08, 2013 6:02 am

Re: Project 7809 (7, 192, 16) sudden slow down, "about" erro

Post by miranda822 »

UPDATE:

Now that I look at the progress bar again, I'm beginning to think that there was a glitch regarding the TPF/time left. It was at roughly 71% when I first noticed the problem, but now it's at 73.11% and the pace is up again. I had closed a few tabs in Firefox a few minutes ago, but they had been open prior to the problem.
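For a sense of scale, here is a rough sketch of the remaining-time arithmetic. It assumes one frame is 1% of the work unit, and the function name `time_remaining` is made up for illustration; the actual estimate shown by the client may be computed differently.

```python
from datetime import timedelta


def time_remaining(percent_done, tpf):
    """Estimate time left, assuming one frame equals 1% of the work unit."""
    if not 0 <= percent_done <= 100:
        raise ValueError("percent_done must be between 0 and 100")
    frames_left = 100.0 - percent_done
    return tpf * frames_left


# Figures from this thread: about 71% done, TPF of 5 hours 38 minutes.
slow_tpf = timedelta(hours=5, minutes=38)
print(time_remaining(71, slow_tpf))
```

At the slower TPF, the last 29 frames alone would take nearly a week, which is why a jump like this is worth checking even if it turns out to be a display glitch.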
art_l_j_PlanetAMD64
Posts: 472
Joined: Sun May 30, 2010 2:28 pm

Re: Project 7809 (7, 192, 16) sudden slow down, "about" erro

Post by art_l_j_PlanetAMD64 »

miranda822 wrote:Thanks for your help. I went through the steps, and it's not getting quite that much on average (mid 80s to early 90s), but it's using the majority of the CPU time. I had the SMP set to -1 by default, and the Resource Monitor says it's using 6 threads. I tried manually setting it to thus, but it wouldn't let me because the smp in the WU's description is smp:2.

I didn't know if the error was a related issue or not, let alone that there was a program upgrade available. Should I quit the program and upgrade, or will I lose my current WU?
OK, I would say:

1) Keep on folding, do not quit the program.

2) A CPU share in the mid 80s to low 90s is low enough to severely slow down your SMP folding slot. So, set your SMP slot to use two (2) fewer than the actual number of cores in your CPU. The current number of cores being used shows up in your log file, when a new 0xa4 WU is started:


07:18:14:WU01:FS01:0xa4:Project: 8028 (Run 2164, Clone 1, Gen 39)
07:18:14:WU01:FS01:0xa4:
07:18:14:WU01:FS01:0xa4:Assembly optimizations on if available.
07:18:14:WU01:FS01:0xa4:Entering M.D.
07:18:19:WU00:FS01:Upload complete
07:18:19:WU00:FS01:Server responded WORK_ACK (400)
07:18:19:WU00:FS01:Final credit estimate, 1311.00 points
07:18:19:WU00:FS01:Cleaning up
07:18:19:WU01:FS01:0xa4:Mapping NT from 4 to 4 
07:18:20:WU01:FS01:0xa4:Completed 0 out of 500000 steps  (0%)
07:20:52:WU01:FS01:0xa4:Completed 5000 out of 500000 steps  (1%)
The 'Mapping NT from 4 to 4' means 4 cores, so you would set your SMP slot to use 2 cores:
  • In FAHControl, you must be in either the 'Advanced' or 'Expert' mode, selected from the dropdown menu at the upper right
  • Click 'Pause' in FAHControl, and wait until all slots show Paused
  • Click on 'Configure', and select the 'Slots' tab
  • Click on the 'smp' slot to highlight it, then click on 'Edit'
  • In the 'SMP' part of the 'Configure folding slot' window, edit the 'CPUs' to be 2 cores
  • Click on 'OK', then click on 'Save'
  • Click on 'Fold', to get the folding started again
That should improve the folding performance for the SMP slot. Please let me know if this makes any improvement. Thanks!
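If you would rather not read through the log by eye, the lookup above can be sketched in a few lines of Python. This is only an illustration: it assumes the log line has the exact 'Mapping NT from X to Y' wording shown above, and `threads_in_use` and `suggested_smp` are hypothetical helper names, not part of the F@H client.

```python
import re

# Matches the FahCore_a4 log line that reports the thread mapping,
# e.g. "07:18:19:WU01:FS01:0xa4:Mapping NT from 4 to 4".
MAPPING_RE = re.compile(r"Mapping NT from (\d+) to (\d+)")


def threads_in_use(log_text):
    """Return the 'to' value of the last 'Mapping NT' line, or None if absent."""
    matches = MAPPING_RE.findall(log_text)
    if not matches:
        return None
    return int(matches[-1][1])


def suggested_smp(cores, reserve=2):
    """Hypothetical helper: leave `reserve` cores free, never going below 1."""
    return max(1, cores - reserve)


# A few lines copied from the log excerpt above.
sample_log = """\
07:18:19:WU00:FS01:Cleaning up
07:18:19:WU01:FS01:0xa4:Mapping NT from 4 to 4
07:18:20:WU01:FS01:0xa4:Completed 0 out of 500000 steps  (0%)
"""

cores = threads_in_use(sample_log)
print(cores, suggested_smp(cores))
```

With the sample above, the mapping line reports 4, so the suggested value for the 'CPUs' field is 2, which matches the steps in this post.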
Last edited by art_l_j_PlanetAMD64 on Mon Feb 04, 2013 1:13 pm, edited 1 time in total.
art_l_j_PlanetAMD64
Posts: 472
Joined: Sun May 30, 2010 2:28 pm

Re: Project 7809 (7, 192, 16) sudden slow down, "about" erro

Post by art_l_j_PlanetAMD64 »

miranda822 wrote:UPDATE:

Now that I look at the progress bar again, I'm beginning to think that there was a glitch regarding the TPF/time left. It was at roughly 71% when I first noticed the problem, but now it's at 73.11% and the pace is up again. I had closed a few tabs in Firefox a few minutes ago, but they had been open prior to the problem.
OK, but I would still say to try the change to the number of CPU cores being used. This should prevent the SMP folding slot from being slowed down by other tasks (like Firefox tabs or GPU P807x folding). Please try it; it's always easy to change it back if the change does not result in any improvement.

EDIT:
If you use 2 cores from a 4-core CPU, then the 'target' CPU % in the 'Resource Monitor' display will be 50%, not 100%. It is whatever the ratio is:
( ( # of cores used ) / ( total number of cores ) ) * 100%
So using 6 cores out of 8, like I am in one computer, results in a target of 75% in the 'Resource Monitor' display.
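The ratio above, written out as a small Python sketch (the function name `target_cpu_percent` is just for illustration):

```python
def target_cpu_percent(cores_used, total_cores):
    """Expected 'Average CPU %' for the FahCore when it uses only some cores."""
    if total_cores <= 0 or not 0 <= cores_used <= total_cores:
        raise ValueError("need 0 <= cores_used <= total_cores and total_cores > 0")
    return cores_used / total_cores * 100.0


print(target_cpu_percent(2, 4))  # 2 of 4 cores
print(target_cpu_percent(6, 8))  # 6 of 8 cores
```

These are the two cases mentioned above: 50% for 2 of 4 cores, and 75% for 6 of 8 cores.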
art_l_j_PlanetAMD64
Posts: 472
Joined: Sun May 30, 2010 2:28 pm

Re: Project 7809 (7, 192, 16) sudden slow down, "about" erro

Post by art_l_j_PlanetAMD64 »

Also, an odd number of cores (5, 7) usually does not work, but 3 cores is usually OK. So, if 2 cores works OK for you, then using 3 cores should also work OK. Plus, smp:3 should (of course) reduce the TPF compared to using smp:2.
art_l_j_PlanetAMD64
Posts: 472
Joined: Sun May 30, 2010 2:28 pm

Re: Project 7809 (7, 192, 16) sudden slow down, "about" erro

Post by art_l_j_PlanetAMD64 »

miranda822 wrote:I had the SMP set to -1 by default, and the Resource Monitor says it's using 6 threads. I tried manually setting it to thus, but it wouldn't let me because the smp in the WU's description is smp:2.
The number of threads displayed by Resource Monitor for the FahCore_a4.exe process is greater than the actual number of CPU cores. Please see my post above, to determine the actual number of CPU cores.
Last edited by art_l_j_PlanetAMD64 on Mon Feb 04, 2013 1:17 pm, edited 1 time in total.
art_l_j_PlanetAMD64
Posts: 472
Joined: Sun May 30, 2010 2:28 pm

Re: Project 7809 (7, 192, 16) sudden slow down, "about" erro

Post by art_l_j_PlanetAMD64 »

miranda822 wrote:UPDATE:

Now that I look at the progress bar again, I'm beginning to think that there was a glitch regarding the TPF/time left. It was at roughly 71% when I first noticed the problem, but now it's at 73.11% and the pace is up again. I had closed a few tabs in Firefox a few minutes ago, but they had been open prior to the problem.
Also, some WUs will 'jump' between two different TPF values. Please look here:
art_l_j_PlanetAMD64 wrote:I just got a P8049 WU on my #6 computer (AMD FX-8150 smp:8). I have watched it for about 10 minutes, and it will jump between these two values:
8049 (264, 14, 2), Estimated TPF 1:33, Estimated PPD 13227.48
8049 (264, 14, 2), Estimated TPF 2:15, Estimated PPD 8740.98

It will spend a minute or so at each value, then immediately jump to the other value.
PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
§
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud

Re: Project 7809 (7, 192, 16) sudden slow down, "about" erro

Post by PantherX »

art_l_j_PlanetAMD64 wrote:Also, an odd number of cores (5, 7) usually does not work, but 3 cores is usually OK. So, if 2 cores works OK for you, then using 3 cores should also work OK. Plus, smp:3 should (of course) reduce the TPF compared to using smp:2.
Odd/Prime numbers like 5 and 7 can normally be used for folding and rarely cause errors. The only time they do cause errors is when a very small project (one with very few atoms) is being folded. Other than that, 5 and 7 should be usable for the majority of projects. I recall having this issue on only 2 projects over 4 years, so it's safe to say that they are usable. Moreover, larger prime/odd numbers which are known to cause issues are handled automatically by the FahCore, which rounds down to the next lower good number.

miranda822 -> Can you please paste the initial section of your log, which contains the system configuration and your F@H configuration, as described here (viewtopic.php?f=61&t=16206), so we can better help you? Furthermore, I have marked Project: 7809 (Run 7, Clone 192, Gen 16) for a follow-up in case it is a bad WU.
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
art_l_j_PlanetAMD64
Posts: 472
Joined: Sun May 30, 2010 2:28 pm

Re: Project 7809 (7, 192, 16) sudden slow down, "about" erro

Post by art_l_j_PlanetAMD64 »

PantherX wrote:Odd/Prime numbers like 5 and 7 can normally be used for folding and rarely cause errors. The only time it does cause errors is when a very small project (one with very few atoms) is being folded. Other than that, 5 and 7 should be usable for the majority of projects.
OK, I thought I had read somewhere that odd numbers higher than 3 were problematic, but it's good if that is no longer true.

Is this no longer true?
Re: Radeon 7950 not folding. No WUs or problem?
Joe_H wrote:One other issue to go over is that the folding core for ATI GPU's uses up to a full core of your CPU to move data in and out of the GPU. So if you do keep folding on your 7950, you should adjust the SMP setting for that slot. Change the SMP setting from -1, the default which uses all cores available, to 3 or 2. 3 is one exception to the message in FAHControl of setting the number to an even number.
Jesse_V
Site Moderator
Posts: 2851
Joined: Mon Jul 18, 2011 4:44 am
Hardware configuration: OS: Windows 10, Kubuntu 19.04
CPU: i7-6700k
GPU: GTX 970, GTX 1080 TI
RAM: 24 GB DDR4
Location: Western Washington

Re: Project 7809 (7, 192, 16) sudden slow down, "about" erro

Post by Jesse_V »

art_l_j_PlanetAMD64 wrote:
PantherX wrote:Odd/Prime numbers like 5 and 7 can normally be used for folding and rarely cause errors. The only time it does cause errors is when a very small project (one with very few atoms) is being folded. Other than that, 5 and 7 should be usable for the majority of projects.
OK, I thought I had read somewhere that odd numbers higher than 3 were problematic, but it's good if that is no longer true.
See:
http://folding.stanford.edu/English/FAQ-SMP#ntoc5
viewtopic.php?f=16&t=23060&p=230790#p230790
viewtopic.php?f=19&t=21920&p=218801#p218801
art_l_j_PlanetAMD64
Posts: 472
Joined: Sun May 30, 2010 2:28 pm

Re: Project 7809 (7, 192, 16) sudden slow down, "about" erro

Post by art_l_j_PlanetAMD64 »

Thanks, Jesse, for those links. From the information in those links, the answer for 5 is "should usually be OK but not 100% guaranteed", and for 7 it is "there are quite a few problems reported, so this is not recommended".

Am I reading that correctly, or did I misunderstand what I read there?
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Project 7809 (7, 192, 16) sudden slow down, "about" erro

Post by bruce »

The original definition of poor choices of SMP values was to avoid "large primes" but "large" was never defined. A lot of WUs have been folded since then and the Pande Group has gradually established what works and what does not work. As PantherX has said, the FahCores will no longer let you run 31 cores on your 32-core bigadv machine or 23 on your 24-core machine. [Most certainly "large" primes.] They will let you run 3, 5, or 7, but they should prevent the assignment of proteins with very few atoms to those machines. In other words, the Pande Group has apparently established suitable criteria that can minimize those problems. It's still a statistical thing, though, based on the shape of those "few atoms" in 1/7th of the protein. If you could actually run with 11 of your 12 cores, you would have a higher failure rate than somebody with 7, but you would still have quite a few successes compared to 23 or 31.
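One way to see why 23 and 31 stand out while 24 and 32 do not is to look at the largest prime factor of each core count. The sketch below is purely illustrative: it is not the criterion the FahCores actually apply (that is not stated here, and "large" was never defined), and `largest_prime_factor` is a made-up helper name.

```python
def largest_prime_factor(n):
    """Return the largest prime factor of n (n >= 2) by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    largest = 1
    factor = 2
    while factor * factor <= n:
        while n % factor == 0:
            largest = factor
            n //= factor
        factor += 1
    if n > 1:
        # Whatever remains after dividing out the small factors is prime.
        largest = n
    return largest


for cores in (3, 5, 7, 11, 12, 23, 24, 31, 32):
    print(cores, largest_prime_factor(cores))
```

Counts such as 12, 24, and 32 break down into factors of 2 and 3, whereas 11, 23, and 31 are themselves prime.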

The Pande Group doesn't want failed WUs either. We do still see some "bad WUs" but the frequency is going down. Unfortunately, a WU can error out due to the number of cores together with the number of atoms, to overclocking, to hardware faults, and to several other factors. They continue to collect data, so if they need to make additional assignment tweaks, I'm sure they will.