New Blog Post by Dr. Pande


New Blog Post by Dr. Pande

Postby TheWolf » Fri Jul 12, 2013 12:03 am

x
Last edited by TheWolf on Fri Jul 12, 2013 4:24 am, edited 1 time in total.
TheWolf
 
Posts: 267
Joined: Thu Jan 24, 2008 10:34 am

Re: New Blog Post by Dr. Pande

Postby 7im » Fri Jul 12, 2013 1:07 am

The exponential nature of the bonus curve is what accounts for the larger shaded areas on the right. If I don't use my PC at all I get a really high bonus, but on the days I have to use it for RL stuff, the points drop off very quickly.
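
To put rough numbers on that: the published bonus formula is final_points = base_points * max(1, sqrt(k * deadline / elapsed_time)), and PPD is final_points divided by elapsed time. A small C sketch (the WU values below are made up purely for illustration) shows how the same 20% slowdown costs a fast machine far more PPD than a slow one:

Code: Select all
/* Illustration of the quick return bonus (QRB) curve.
 * Formula per the F@H points FAQ:
 *   final_points = base_points * max(1, sqrt(k * deadline / elapsed))
 * The base, k and deadline values are hypothetical.
 * Compile with: cc qrb.c -lm */
#include <math.h>
#include <stdio.h>

static double ppd(double base, double k, double deadline_days,
                  double elapsed_days)
{
    double bonus = sqrt(k * deadline_days / elapsed_days);
    if (bonus < 1.0)
        bonus = 1.0;                      /* no penalty below 1x */
    return base * bonus / elapsed_days;   /* points per day */
}

int main(void)
{
    double base = 5000.0, k = 0.75, deadline = 6.0;  /* made-up WU */

    /* Fast machine: 0.25 days vs 0.30 days (20% slower) */
    printf("fast: %.0f vs %.0f PPD\n",
           ppd(base, k, deadline, 0.25), ppd(base, k, deadline, 0.30));

    /* Slow machine: 2.0 days vs 2.4 days (same 20% slowdown) */
    printf("slow: %.0f vs %.0f PPD\n",
           ppd(base, k, deadline, 2.0), ppd(base, k, deadline, 2.4));
    return 0;
}

With those made-up numbers the fast machine loses roughly 20k PPD while the slow one loses under 1k, for the identical relative slowdown.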
How to provide enough information to get helpful support
Tell me and I forget. Teach me and I remember. Involve me and I learn.
7im
 
Posts: 14648
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: New Blog Post by Dr. Pande

Postby Napoleon » Fri Jul 12, 2013 3:32 am

TheWolf wrote:Many have overclocked CPUs, others don't. Since it uses a full CPU core (in most cases) per GPU, I would naturally think the difference would be caused by the CPU type in use and whether or not it is OCed.
Also not to leave out whether the GPU is overclocked or not.

Your thoughts?

"Uses a full core" and "needs a full core" are slightly different cases. Most of the time the "use" seems to be just spin waiting for the (NVidia) GPU. Ideally, a device (HDD/SDD for example) is interrupt driven. A thread sets up a task for it, sends the task to the OS which then suspends the thread. Once the OS/driver receives an interrupt from the device (request completed), it resumes the thread: "here's the stuff you asked for, now go do something with it."

It may or may not be that blissful with GPUs. Judging by the use of a full core, there's some form of polling going on with (NVidia) GPUs. There are a few slightly different ways of doing that. In crude pseudocode:
  1. Tight spin wait
    Code: Select all
    // a small subtask has been started
    while( GPU_is_busy ) {
      // Lean & Mean:
      // do absolutely nothing else but loop, core at 100%
    }
    // proceed with new small subtask
  2. Co-operative spin wait
    Code: Select all
    // a small subtask has been started
    while( GPU_is_busy ) {
      // subtask is expected to finish soon
      // but not quite soon enough to be rude
      // so let's give the OS a chance to schedule
      // something else, and *hopefully* it'll resume us
      // "just on time" if that happens
      yield();
    }
    // proceed with new small subtask
  3. Timed spin wait
    Code: Select all
    // a small subtask has been started
    while( GPU_is_busy ) {
      // the OS provides a timer with high resolution
      // compared to the expected subtask duration, so
      // we either let the CPU core sleep for a while or
      // let the OS do something else with it for a
      // fixed duration
      sleep( x );
    }
    // proceed with new small subtask

Option 3 is rather HW and OS specific, so it isn't a real option for core_17. But apparently proteneer wasn't rude enough to go for option 1. Could also be that some other component (driver?) is yielding and proteneer had no control over it. Mind you, this is just educated guesswork, but I accidentally stumbled upon core_17 behaviour which suggests use of option 2 somewhere: viewtopic.php?f=88&t=24469&start=60#p244976.
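
For what it's worth, the CUDA runtime does expose host-side wait policies that map closely onto the options above, so the choice could well live in the runtime/driver rather than in core_17's own code. A minimal sketch (whether core_17 actually sets any of these flags is pure assumption on my part):

Code: Select all
/* CUDA's host-side wait policies. The flag must be set before the
 * context is created; the default (cudaDeviceScheduleAuto) lets the
 * runtime pick a heuristic. */
#include <cuda_runtime.h>

void choose_wait_policy(void)
{
    /* Option 1: tight spin wait - lowest latency, one core at 100% */
    cudaSetDeviceFlags(cudaDeviceScheduleSpin);

    /* Option 2: spin, but yield the core to the OS between polls */
    /* cudaSetDeviceFlags(cudaDeviceScheduleYield); */

    /* The interrupt-driven ideal: block the thread on an OS sync
       primitive until the GPU signals completion - near 0% CPU */
    /* cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync); */
}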

Option 2 worked quite well in my case (slow GT430 GPU and slow Atom330 CPU). I don't know how well that applies to a combination of a faster GPU and a faster CPU, but in my case the GT430 actually only needs about 10% of a single Atom330 core (or logical CPU, since it is a HyperThreaded 2C/4T CPU). YMMV.

Ultimately, CPU speed does make some difference with option 2 even when a full core is free for core_17, because a more or less substantial amount of OS scheduling code gets executed between GPU_is_busy checks. So you may see a small fluctuation in GPU usage, say 95% - 100%. Whether or not that is the main reason for the variation PG is seeing in their analysis is anyone's guess.
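
If anyone wants to put a number on that scheduling overhead, the round-trip cost of one yield is easy to measure on their own machine (Linux/POSIX sketch; the iteration count is arbitrary):

Code: Select all
/* Rough measurement of what one GPU_is_busy/yield() iteration costs
 * in scheduler overhead alone (Linux/POSIX; compile with -O2). */
#include <sched.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
    enum { N = 1000000 };
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < N; i++)
        sched_yield();                 /* the yield() from option 2 */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = ((t1.tv_sec - t0.tv_sec) * 1e9
               + (t1.tv_nsec - t0.tv_nsec)) / N;
    printf("sched_yield: ~%.0f ns per call\n", ns);
    return 0;
}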

I'd go with 7im's explanation - the exponential nature of the bonus curve is the primary culprit for the larger inconsistencies at the high end. Since PPD effectively scales as elapsed_time^(-3/2) under the QRB, a 1% timing fluctuation already moves PPD by about 1.5%, and the absolute swing grows with the PPD itself. It has the potential to blow the tiniest fluctuation out of proportion; everything else is small potatoes compared to it. Good luck trying to stick with the max 10% variation policy...

Sometimes it is good to be the low end guy. Some 3k PPD fluctuation doesn't hurt one's feelings nearly as much as a 30k PPD fluctuation would. :twisted:
Win7 64bit, FAH v7, OC'd
2C/4T Atom330 3x667MHz - GT430 2x832.5MHz - ION iGPU 3x466.7MHz
NaCl - Core_15 - display
Napoleon
 
Posts: 1032
Joined: Wed May 26, 2010 2:31 pm
Location: Finland

