Re: Project 10496 (158,16,66)

PostPosted: Sat Jul 08, 2017 3:12 pm
by _r2w_ben
Can someone upload a screenshot of the Sensors tab of GPU-Z when running p10496 on a 1080 or 1080 Ti?

Re: Project 10496 (158,16,66)

PostPosted: Sat Jul 08, 2017 3:12 pm
by Nathan_P
I recall that, a long time ago, some GPU projects on cores 11 and 15 performed no better on the best GPUs of the time than on those a couple of models down. Could this be the same issue? The theory was that some WUs were too small to run optimally on some GPUs, given the number of shaders they had.

Re: Project 10496 (158,16,66)

PostPosted: Sat Jul 08, 2017 4:06 pm
by ComputerGenie
_r2w_ben wrote:Can someone upload a screenshot of the Sensors tab of GPU-Z when running p10496 on a 1080 or 1080 Ti?

What are you looking for?

Re: Project 10496 (158,16,66)

PostPosted: Sat Jul 08, 2017 4:50 pm
by Hypocritus
Leonardo wrote:
QuintLeo wrote:GTX 1080 ti also runs low PPD on this, commonly less than 800k PPD (vs 1 million PLUS for everything else my 1080 ti cards have seen to date).


Is there a possibility your GPU is throttling, due to heat buildup? I have three 1080Tis, all of which typically process a 10496 work unit at about 1M PPD. I have overclocked the GPUs, but only at a minimal 100MHz core boost. They stay relatively cool, typically at about 65 to 70C core temp.

Keep in mind also that a 1080Ti working a 10496 WU pumps a lot of data through the PCIe bus. If there are multiple video cards folding on the same motherboard, it can very easily saturate the bus.


Not to say that you don't have the most clinically-relevant folding setup Leonardo, but, is it by chance located in Alaska? :e?:

However, to your point: yes. When my temperatures are better controlled, 10496 gets on average about 90% of the PPD that other projects get. When I'm hitting the temperature ceiling, I usually get between 80 and 85%. This ratio is pretty consistent on both my 1070s and my 1080 Tis.

Re: Project 10496 (158,16,66)

PostPosted: Sat Jul 08, 2017 5:46 pm
by ComputerGenie
I've had some of the "bad" RCGs hit the 650k range on my 1080 rig that is in my miner room (kept between 60-75F), the same as they have on the rig in my office (kept between 75-90F*), so I'm inclined to say that temperature isn't a project-specific factor**.

*Not cooled on my days off, just "open air".
**Aside from "normal" electronic component cooling variations.

Re: Project 10496 (158,16,66)

PostPosted: Sat Jul 08, 2017 6:22 pm
by bruce
Nathan_P wrote:I recall that, a long time ago, some GPU projects on cores 11 and 15 performed no better on the best GPUs of the time than on those a couple of models down. Could this be the same issue? The theory was that some WUs were too small to run optimally on some GPUs, given the number of shaders they had.


This is still a viable theory. It's not that the FAHCore is inefficient, it's that the WU can't keep all the shaders busy for very long before some data has to move to/from main RAM.

The same logic works for multi-threaded CPU projects. Divide up the WU into N pieces that contain about the same number of atoms. Send each one to a different processor. For atoms near the center of a piece, you can mostly ignore those atoms which are in other pieces. Reprocess the forces on each atom that's near enough to a boundary to be influenced by forces from atoms in another piece.

Once that process is finished, you can establish a "NOW" shape which can again be broken up into N pieces.

If N is too large, there are too many boundaries, and so too many atoms that can't be computed easily: if the atoms in a neighboring piece have moved, the motion computed for your atom will no longer be correct. So for a project with a specific number of atoms, there's an optimum number of calculations that can be done in parallel. Having too many shaders will inevitably leave more of them idle more of the time.
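
As a rough illustration of the boundary argument above, here is a minimal sketch. It is not how any FAHCore actually divides work. It assumes a cubic, periodic box with evenly spread atoms, split into equal sub-cubes, and the box size and cutoff distance are made-up numbers. It only shows how quickly the share of "near a boundary" atoms grows as the pieces get smaller.

```python
def boundary_fraction(box_side, cutoff, pieces_per_side):
    """Fraction of evenly spread atoms lying within `cutoff` of a piece face.

    Toy model: a periodic cubic box cut into pieces_per_side**3 equal cubes,
    so every face of every piece borders another piece.
    """
    piece_side = box_side / pieces_per_side
    # Atoms deeper than `cutoff` from every face can ignore the other pieces.
    interior_side = max(0.0, piece_side - 2.0 * cutoff)
    return 1.0 - (interior_side / piece_side) ** 3


if __name__ == "__main__":
    BOX_SIDE = 10.0  # hypothetical box edge length (arbitrary units)
    CUTOFF = 1.0     # hypothetical interaction cutoff (same units)
    for n in (2, 3, 4, 5):
        share = boundary_fraction(BOX_SIDE, CUTOFF, n)
        print(f"{n**3:4d} pieces: {share:.1%} of atoms are near a boundary")
```

In this toy model, once a piece is no wider than twice the cutoff, every atom in it is a boundary atom, which is the point where adding more parallel workers stops helping.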

Re: Project 10496 (158,16,66)

PostPosted: Sat Jul 08, 2017 7:35 pm
by _r2w_ben
ComputerGenie wrote:
_r2w_ben wrote:Can someone upload a screenshot of the Sensors tab of GPU-Z when running p10496 on a 1080 or 1080 Ti?

What are you looking for?

I'm interested in seeing the shape of the load graphs. If the GPU load drops on a regular basis and another load graph spikes, it could give some indication of the performance bottleneck.

Re: Project 10496 (158,16,66)

PostPosted: Sat Jul 08, 2017 7:56 pm
by ComputerGenie
_r2w_ben wrote:
ComputerGenie wrote:
_r2w_ben wrote:Can someone upload a screenshot of the Sensors tab of GPU-Z when running p10496 on a 1080 or 1080 Ti?

What are you looking for?

I'm interested in seeing the shape of the load graphs. If the GPU load drops on a regular basis and another load graph spikes, it could give some indication of the performance bottleneck.

Image

Re: Project 10496 (158,16,66)

PostPosted: Tue Jul 11, 2017 1:00 am
by _r2w_ben
Thanks ComputerGenie. Unfortunately there doesn't appear to be an obvious correlation.

If anyone wants to explore this further:
  • Click the hamburger menu in GPU-Z
  • Go to Sensors and set the refresh rate to 0.1 seconds
  • Log the data to file for a few frames
  • Import the log file into a spreadsheet
  • Graph GPU load, memory controller load, and bus interface load for p10496
  • Create a similar graph for a project that runs well and compare
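
For anyone who would rather skip the spreadsheet, the comparison in the steps above could be sketched in Python. This assumes the GPU-Z log is a comma-separated text file whose header row holds space-padded column names. The three column names below are my assumption of what the Sensors tab writes, so check them against the first line of your own log. The file names in the usage comment are hypothetical.

```python
import csv
import io
from statistics import mean

# Assumed column names; check them against the header row of your own log.
LOAD_COLUMNS = (
    "GPU Load [%]",
    "Memory Controller Load [%]",
    "Bus Interface Load [%]",
)


def load_summary(log_text, columns=LOAD_COLUMNS):
    """Return mean/min/max for each requested column of a sensor log."""
    reader = csv.reader(io.StringIO(log_text))
    header = [name.strip() for name in next(reader)]
    index = {name: header.index(name) for name in columns}
    samples = {name: [] for name in columns}
    for row in reader:
        try:
            values = {name: float(row[i]) for name, i in index.items()}
        except (IndexError, ValueError):
            continue  # skip short, blank, or non-numeric rows
        for name, value in values.items():
            samples[name].append(value)
    return {
        name: {"mean": mean(vals), "min": min(vals), "max": max(vals)}
        for name, vals in samples.items()
        if vals
    }


# Hypothetical usage: one log per project, then compare the two summaries.
# with open("p10496_log.txt", errors="replace") as f:
#     print(load_summary(f.read()))
# with open("other_project_log.txt", errors="replace") as f:
#     print(load_summary(f.read()))
```

A lower mean GPU load on p10496 together with a higher bus interface or memory controller load would point toward the bottleneck. Similar numbers on both logs would suggest the cause lies elsewhere.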

Re: Project 10496 (158,16,66)

PostPosted: Wed Jul 12, 2017 11:38 am
by ComputerGenie
_r2w_ben wrote:Thanks ComputerGenie. Unfortunately there doesn't appear to be an obvious correlation.
If anyone wants to explore this further...
If I ever get something on my Win 7 (Ti) box that isn't 10496 (since your post, I've literally only had 1 WU on this box that was another project), I'll do that and post the sheet.

Edit: Here's what I've got (link good for 30 days), I hope it helps in what you're looking for.

Re: Project 10496 (158,16,66)

PostPosted: Wed Jul 12, 2017 9:28 pm
by ComputerGenie
QuintLeo wrote:...The performance is SO CRAZY BAD on my 1080ti cards that I'm seriously considering blocking the workserver on that machine - it's ridiculous to WASTE top-end folding cards on a work unit that performs so much BETTER on MUCH WORSE hardware.
After 2 days of almost nothing but this project on my Ti, I can feel your pain.

Re: Project 10496 (158,16,66)

PostPosted: Thu Jul 13, 2017 3:40 pm
by Nert
I've noticed the same performance difference on my brand new 1080 recently. Out of curiosity, does anyone keep ongoing statistics about work unit performance for various cards? Is there any way to scrape it out of the logs?

I'm folding on a CPU and two GPUs. Here's a description of my system in case it adds anything to the analysis:

Processor: Intel I5 4590
OS: Linux Mint
Motherboard: Asus Z97 Pro

GPU0: GTX 1080
PCIE Generation: Gen3
Maximum PCIe Link Width: x16
Maximum PCIe Link Speed: 8.0 GT/s

GPU1: GTX 970
PCIE Generation: Gen3
Maximum PCIe Link Width: x16
Maximum PCIe Link Speed: 8.0 GT/s

Here are some ad hoc captures that I did over the past couple of days.

1080:

WU      PPD       TPF              Date
9415    1044548   43.00 secs       07/12/17
9415    1043943   43.00 secs       07/12/17
10496   701872    1 mins 53 secs   07/12/17
10496   686456    1 mins 55 secs   07/12/17
10496   704749    1 mins 53 secs   07/12/17
10496   695594    1 mins 54 secs   07/13/17

970:

WU      PPD       TPF              Date
10496   307778    3 mins 17 secs   07/12/17
11431   288645    4 mins 24 secs   07/12/17
11407   317473    3 mins 09 secs   07/13/17
9431    307397    1 mins 54 secs   07/13/17
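
To put a number on the gap, the 1080 captures above can be averaged per project. This is a minimal sketch with the PPD values typed in by hand from the list above. It does not read any FAH log, and six samples is a very small basis for comparison.

```python
from collections import defaultdict
from statistics import mean

# (project, PPD) pairs copied by hand from the GTX 1080 captures above.
CAPTURES_1080 = [
    (9415, 1044548),
    (9415, 1043943),
    (10496, 701872),
    (10496, 686456),
    (10496, 704749),
    (10496, 695594),
]


def mean_ppd_by_project(captures):
    """Group PPD samples by project number and average each group."""
    grouped = defaultdict(list)
    for project, ppd in captures:
        grouped[project].append(ppd)
    return {project: mean(values) for project, values in grouped.items()}


averages = mean_ppd_by_project(CAPTURES_1080)
ratio = averages[10496] / averages[9415]

if __name__ == "__main__":
    for project, ppd in sorted(averages.items()):
        print(f"p{project}: {ppd:,.0f} PPD on average")
    print(f"p10496 runs at {ratio:.0%} of p9415 on this card")
```

On these samples the 1080 gets roughly two thirds of its p9415 PPD when running p10496, while the 970 figures for 10496 sit in the same range as its other projects.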

Re: Project 10496 (158,16,66)

PostPosted: Thu Jul 13, 2017 7:42 pm
by ComputerGenie
Now, isn't this fun :roll:

Image

3 cards running for the "normal" PPD value of 2 :evil:

Re: Project 10496 (158,16,66)

PostPosted: Fri Jul 14, 2017 3:10 am
by Leonardo
3 x GTX 1080: are you saturating the PCIe bus? Also, from your motherboard manual: "The PCIe x16_3 slot shares bandwidth with USB3_E12 and PCIe x1_4. The PCIe x16_3 is default at x1 mode."
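
For a sense of scale, here is a rough sketch of the theoretical per-direction bandwidth of a PCIe link at different widths. The line rates and encodings are the standard Gen2 and Gen3 figures. Which generation the x16_3 slot on that board actually runs at is not stated in this thread, so both are shown, and real throughput will be lower once protocol overhead is included.

```python
# Per-lane line rate in GT/s and line encoding (payload bits, total bits).
ENCODING = {
    2: (5.0, 8, 10),     # Gen2: 5 GT/s with 8b/10b encoding
    3: (8.0, 128, 130),  # Gen3: 8 GT/s with 128b/130b encoding
}


def link_bandwidth_gb_per_s(generation, lanes):
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    rate, payload_bits, total_bits = ENCODING[generation]
    return rate * payload_bits / total_bits / 8 * lanes


if __name__ == "__main__":
    for generation, lanes in ((3, 16), (3, 8), (3, 1), (2, 1)):
        gb_per_s = link_bandwidth_gb_per_s(generation, lanes)
        print(f"Gen{generation} x{lanes}: about {gb_per_s:.2f} GB/s")
```

A slot that defaults to x1 has one sixteenth of the bandwidth of a full x16 link of the same generation, so it is worth checking the negotiated link width for each card.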

Re: Project 10496 (158,16,66)

PostPosted: Fri Jul 14, 2017 3:43 am
by ComputerGenie
Leonardo wrote:3 x GTX 1080: are you saturating the PCIe bus? Also, from your motherboard manual: "The PCIe x16_3 slot shares bandwidth with USB3_E12 and PCIe x1_4. The PCIe x16_3 is default at x1 mode."

If you mean me, 3 slots doesn't come close to filling anything, and that's nowhere in my manual. :wink: