Mining Motherboard for FAH (with many pcie x1 slots)

A forum for discussing FAH-related hardware choices and info on actual products (not speculation).

v00d00
Posts: 396
Joined: Sun Dec 02, 2007 4:53 am
Hardware configuration: FX8320e (6 cores enabled) @ stock,
- 16GB DDR3,
- Zotac GTX 1050Ti @ Stock.
- Gigabyte GTX 970 @ Stock
Debian 9.

Running GPU since it came out, CPU since client version 3.
Folding since Folding began (~2000) and ran Genome@Home for a while too.
Ran Seti@Home prior to that.
Location: UK

Mining Motherboard for FAH (with many pcie x1 slots)

Post by v00d00 »

Slightly OT, but is anyone using one of these boards for folding? I am coming to the point where I will need to upgrade my AMD system, possibly to a Ryzen, and these boards caught my eye. While the whole PCIe x1 vs x16 question might be a big thing on 1080s, how does it scale on, say, a 1050 or 1030?

This is highly speculative btw, but the thought of running a primary card on x16 and some lesser cards, say 1050s, on x1 slots does interest me. If anyone is running one, what sort of numbers are you getting from the x1 slots on the cards you are using? As I understand it, most of these boards only give you 3-4 PCIe 3.0 slots, the rest being 2.0 slots. For me this would be fine, since I have no interest in deploying 8+ cards from one board initially, though given money and upgrades I may do so down the line.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Mining Motherboard for FAH (with many pcie x1 slots)

Post by bruce »

When people ask about PCIe speed, they're generally trying to get 100% GPU utilization with no measurable throughput reduction due to data movement. That's not what you're looking for. There's no doubt that a GPU in a 1x slot will spend a portion of its time waiting for data and a portion of its time waiting for the shaders to complete the calculations on that data ... but let's look at it from the big-picture perspective. The objective should be to get more total work done, whether the GPU is waiting for the slot or for the calculations. One GPU in a x16 slot will get less work done than 2 similar GPUs in x8 slots, even if they spend more time waiting for data. For a GPU that's very fast, the general rule of thumb is that a x4 slot will see some loss in performance, but not that much. As you've obviously figured out, a 1050 will see a smaller loss in performance than, say, a 1080.

I have successfully used "slower" GPUs in x1 slots, but I didn't explicitly compare a family of various "slower" GPUs and I don't remember any reports from others.

Also, the terms 1x/2x/4x/8x/16x are not really meaningful without a version number (PCI Express 1.0, 2.0, 3.0, or 4.0), so I'm being a bit sloppy in my earlier statements.

The actual results will also depend on the size of the WU being processed ... lots of atoms means more data needs to be moved than for a protein with fewer atoms.

Anyway, a 1050 connected at PCIe 1.0 at 1x is probably going to be severely throttled, so I don't recommend it. PCIe 3.0 1x is probably plenty to keep a 1030 quite busy. I wish I could give you more accurate data.
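
To put rough numbers on the "lane count means little without a version" point, here is a small Python sketch of theoretical one-direction link bandwidth. The transfer rates and line encodings are the published PCIe figures; the function name is made up for illustration, and real-world throughput will be lower because of protocol overhead, so treat the output as an upper bound only.

```python
# Raw transfer rate (GT/s per lane) and line-encoding efficiency per generation.
# Gen 1.0/2.0 use 8b/10b encoding; gen 3.0/4.0 use 128b/130b.
PCIE_GENERATIONS = {
    "1.0": (2.5, 8 / 10),
    "2.0": (5.0, 8 / 10),
    "3.0": (8.0, 128 / 130),
    "4.0": (16.0, 128 / 130),
}


def link_bandwidth_mb_s(generation: str, lanes: int) -> float:
    """Theoretical one-direction bandwidth of a PCIe link in MB/s."""
    gigatransfers, efficiency = PCIE_GENERATIONS[generation]
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    return gigatransfers * 1000 * efficiency / 8 * lanes


if __name__ == "__main__":
    for gen in PCIE_GENERATIONS:
        for lanes in (1, 4, 16):
            mb_s = link_bandwidth_mb_s(gen, lanes)
            print(f"PCIe {gen} x{lanes}: {mb_s:8.0f} MB/s")
```

On these figures a single 3.0 lane carries roughly what four 1.0 lanes do, which is why the slot width alone says so little.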

rwh202
Posts: 425
Joined: Mon Nov 15, 2010 8:51 pm
Hardware configuration: 8x GTX 1080
3x GTX 1080 Ti
3x GTX 1060
Various other bits and pieces
Location: South Coast, UK

Re: Mining Motherboard for FAH (with many pcie x1 slots)

Post by rwh202 »

Just to add to the above: Linux is a better option than Windows at the moment for PCIe bandwidth.
A 1050 won't be throttled to any meaningful extent, and a 1080Ti will still pull 1 million PPD (vs 1.2 million) on PCIe 2.0 1x.
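
As a quick back-of-the-envelope check on the "more total work" argument, the 1080Ti figures above can be plugged into a tiny Python sketch. The helper name is hypothetical, and the model assumes identical cards that do not compete with each other for CPU time or shared bandwidth, which may not hold on a real rig.

```python
def total_ppd(full_speed_ppd: float, retained_fraction: float, cards: int) -> float:
    """Combined PPD of identical cards that each keep `retained_fraction` of full speed."""
    return full_speed_ppd * retained_fraction * cards


# Figures quoted above for a 1080Ti: 1.2M PPD unrestricted, 1.0M on PCIe 2.0 1x.
retained = 1.0 / 1.2  # fraction of full speed kept on the 1x link

if __name__ == "__main__":
    print(f"retained on 1x: {retained:.0%}")
    print(f"one card, full bandwidth: {total_ppd(1_200_000, 1.0, 1):,.0f} PPD")
    print(f"two cards on 1x slots:    {total_ppd(1_200_000, retained, 2):,.0f} PPD")
```

Each card loses roughly a sixth of its output on the narrow link, but two of them together still outproduce a single unrestricted card by a wide margin.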
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: Mining Motherboard for FAH (with many pcie x1 slots)

Post by foldy »

Yes, you need to use Linux, as Windows bottlenecks much more on PCIe.
v00d00

Re: Mining Motherboard for FAH (with many pcie x1 slots)

Post by v00d00 »

It's fine, I haven't used Windows for anything serious in over 13 years. It's just an OS for gaming nowadays.

Would probably use RHEL or Devuan and customise it beyond that.
FldngForGrandparents
Posts: 70
Joined: Sun Feb 28, 2016 10:06 pm

Re: Mining Motherboard for FAH (with many pcie x1 slots)

Post by FldngForGrandparents »

You need a physical core per GPU to maintain performance. No hyperthreading, etc. That will limit you with lots of PCIe slots. The best I have done is 7 cards on 16x slots. I have moved to a max of 5 on mixed 16x to 4x slots for stability and performance.

Dedicated to my grandparents who have passed away from Alzheimer's

Dedicated folding rig on Linux Mint 19.1:
2 - GTX 980 OC +200
1 - GTX 980 Ti OC +20
4 - GTX 1070 FE OC +200
3 - GTX 1080 OC +140
1 - GTX 1080Ti OC +120
Nathan_P
Posts: 1180
Joined: Wed Apr 01, 2009 9:22 pm
Hardware configuration: Asus Z8NA D6C, 2 x5670@3.2 Ghz, 12gb Ram, GTX 980ti, AX650 PSU, win 10 (daily use)

Asus Z87 WS, Xeon E3-1230L v3, 8gb ram, KFA GTX 1080, EVGA 750ti , AX760 PSU, Mint 18.2 OS

Not currently folding
Asus Z9PE- D8 WS, 2 E5-2665@2.3 Ghz, 16Gb 1.35v Ram, Ubuntu (Fold only)
Asus Z9PA, 2 Ivy 12 core, 16gb Ram, H folding appliance (fold only)
Location: Jersey, Channel islands

Re: Mining Motherboard for FAH (with many pcie x1 slots)

Post by Nathan_P »

On those mining mobos it might be an x16 slot, but it's only running at x1.
foldy

Re: Mining Motherboard for FAH (with many pcie x1 slots)

Post by foldy »

@FldngForGrandparents: Do you use Windows or Linux? Which CPU do you use? How much was your gain from using physical cores instead of hyperthreading?
v00d00

Re: Mining Motherboard for FAH (with many pcie x1 slots)

Post by v00d00 »

On an AMD platform, the one core per instance isn't that big a deal if you load it with an FX 8-core or, since it's AM4, a Ryzen 8-core. On Intel it might be a bit more involved, requiring a Xeon CPU solution.

By my reckoning, if it's AMD-based, I don't see any issue with maybe 12 instances, i.e. 12x 1050 Tis or similar, based on a Ryzen 1700: bind 8 of them to the physical cores and throw 2 logical cores per process at the rest. If you work on the fact that a hyperthread or similar is worth 50% of a physical core, throwing 4 sets of two at 4 processes might suffice. But it's theoretical. From Linux I would simply unhook all the physical cores using isolcpus and then taskset each one manually for the first 8 instances, then taskset the rest while leaving them on the io_scheduler. It would be slightly complicated but fairly hardy once built and configured. Also completely headless, with the most streamlined base system and nothing that isn't needed.
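
A rough Python sketch of that 8-plus-4 layout, purely as an illustration: it uses `os.sched_setaffinity` (Linux only), which does the same job as running `taskset` by hand. The function names are invented, and the CPU numbering is an assumption: it supposes logical CPU i and i + 8 are SMT siblings on the same core, which varies between machines and should be checked in /sys/devices/system/cpu/cpu*/topology/thread_siblings_list. It does not cover the isolcpus part, which is a kernel boot parameter.

```python
import os


def build_affinity_plan(physical_cores: int, instances: int) -> list[set[int]]:
    """Assign CPU sets: first one thread per core, then pairs of leftover threads.

    ASSUMPTION: logical CPUs 0..physical_cores-1 are one thread of each core
    and physical_cores..2*physical_cores-1 are their SMT siblings.
    """
    plan = [{cpu} for cpu in range(min(instances, physical_cores))]
    leftovers = list(range(physical_cores, 2 * physical_cores))
    for start in range(0, len(leftovers) - 1, 2):
        if len(plan) == instances:
            break
        plan.append({leftovers[start], leftovers[start + 1]})
    if len(plan) < instances:
        raise ValueError("not enough logical CPUs for this many instances")
    return plan


def pin(pid: int, cpus: set[int]) -> None:
    """Linux only: restrict an already running process to the given CPUs."""
    os.sched_setaffinity(pid, cpus)


if __name__ == "__main__":
    # Ryzen 1700 (8 cores / 16 threads) with the 12 instances proposed above.
    for slot, cpus in enumerate(build_affinity_plan(8, 12)):
        print(f"instance {slot:2d} -> CPUs {sorted(cpus)}")
```

Note that with this numbering each leftover pair still shares its two cores with pinned instances, so whether it "suffices" would have to be measured.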

Also, I don't have a hyper/SMT-capable CPU, so I can't test dedicating 2 hyperthreads to a process and comparing it to one that uses a single physical core. I'm sure someone has done those numbers already.
bruce

Re: Mining Motherboard for FAH (with many pcie x1 slots)

Post by bruce »

v00d00 wrote:If you work on the fact that a hyperthread or similar is worth 50% of a physical core ...
That's not a very good assumption. It really depends on what workload is assigned to the threads.

A pair of threads share the same floating-point registers (& etc.) so you'll get slightly more than 50% out of each thread. That means that running a pair of threads from FAHCore_a4 or _a7 you'll get maybe 55-60% throughput on each one, compared to what you'd get if the other "half" was idle.
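
For what those percentages mean for a whole core, here is a trivial Python sketch. The helper name is made up and the fractions are just the estimates discussed in this thread, not measurements.

```python
def pair_throughput(per_thread_fraction: float) -> float:
    """Combined throughput of two threads sharing one core, measured in units
    of one thread running alone on that core."""
    return 2 * per_thread_fraction


if __name__ == "__main__":
    for fraction in (0.50, 0.55, 0.60):
        gain = pair_throughput(fraction) - 1.0
        print(f"{fraction:.0%} per thread -> {gain:+.0%} versus one thread alone")
```

So at exactly 50% per thread the second thread buys nothing, while 55-60% per thread would mean the core as a whole does about 10-20% more work.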

For code containing non-floating-point instructions, there are extra fixed-point registers. Each thread has (almost) dedicated hardware. Since FAHCore_a1's primary function is moving data to/from the PCIe bus, I'd expect that if you pair up two copies of the code that drives the GPU, you'd get very little degradation.

This is theory, though, and I've never measured it. How about you actually do some measurements and report your findings?
v00d00

Re: Mining Motherboard for FAH (with many pcie x1 slots)

Post by v00d00 »

It's a good idea. At present, though, the only machine I have that is capable of utilising hyperthreads is an i3, which runs Windows. Try as I will, I haven't been able to get it to fold on the GPU. The drivers are all installed correctly and other OpenCL apps work fine, but every time I assign the slot to GPU, the client deletes the slot. And yes, it is installed to a directory not within Program Files or any other UAC-controlled directory. The CPU client, on the other hand, works well.

Also yes, I know the whole 50% thing is a hack value and I haven't tested it fully. I just know from running other programs and games that it doesn't quite measure up to a regular core. Sometimes I will watch CPU utilisation in Open Hardware Monitor while I'm gaming or rendering, and there are two cores that are always at 100% and two that fluctuate. So I ran 7zip and bound it to each core while creating an archive to find out which ones were the hyperthreads. The ones that topped out at 100% while gaming were the hyperthreads, while the other two that ran at around 70-80% were the physical cores.

Anyways I will try and find some way to test it.
bruce

Re: Mining Motherboard for FAH (with many pcie x1 slots)

Post by bruce »

Well, let's see if we can diagnose "the client deletes the slot" problem.

Post the first ~100 lines of your log per the instructions in the signature block of my first post (above) and the segment showing you adding the slot and the slot being removed.

Does FAHBench run on the GPU?