PCIe Slot Speed - x1 x2 x4 x8 x16

Sparkly
Posts: 73
Joined: Sun Apr 19, 2020 11:01 am

PCIe Slot Speed - x1 x2 x4 x8 x16

Post by Sparkly »

The “What PCIe slot speed is needed” discussions should be put to rest. If some people with one of the newer motherboards that can set PCIe slot speed in BIOS could take the time to test this and post the numbers here, that would be great.

Preferably multiple people with different GPUs, since there are indications that slot speed matters more/less for different GPUs.

Something like:

Project - GPU - PPD - Atoms - Slot Speed - PCIe Version - OS
P14201 - RX 580 - 600K - 453 348 - x1 - 2.0 - Win 10 Pro (64bit)
P14253 - RX 580 - 550K - 438 651 - x1 - 2.0 - Win 10 Pro (64bit)
P14417 - RX 580 - 450K - 274 297 - x1 - 2.0 - Win 10 Pro (64bit)
P11744 - RX 580 - 400K - 109 578 - x1 - 2.0 - Win 10 Pro (64bit)
P11761 - RX 580 - 350K - 62 180 - x1 - 2.0 - Win 10 Pro (64bit)
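
A minimal Python sketch of how report lines in this format could be parsed so that results from different people can be compared. The Report class and the function names are hypothetical and not part of any FAH tool. It assumes the fields are separated by " - " exactly as in the lines above, so a GPU or OS name that itself contains " - " would break it.

```python
from dataclasses import dataclass


@dataclass
class Report:
    project: str
    gpu: str
    ppd: int
    atoms: int
    lanes: int
    pcie_version: str
    os: str


def parse_ppd(text: str) -> int:
    """Convert a value such as '600K' or '1.2M' into an integer PPD."""
    text = text.strip().upper()
    multipliers = {"K": 1_000, "M": 1_000_000}
    if text and text[-1] in multipliers:
        return round(float(text[:-1]) * multipliers[text[-1]])
    return int(text)


def parse_report(line: str) -> Report:
    """Parse one 'Project - GPU - PPD - Atoms - Slot Speed - PCIe Version - OS' line."""
    fields = [field.strip() for field in line.split(" - ")]
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    project, gpu, ppd, atoms, slot, version, os_name = fields
    return Report(
        project=project,
        gpu=gpu,
        ppd=parse_ppd(ppd),
        atoms=int(atoms.replace(" ", "")),   # '453 348' -> 453348
        lanes=int(slot.lower().lstrip("x")),  # 'x1' -> 1
        pcie_version=version,
        os=os_name,
    )


print(parse_report("P14201 - RX 580 - 600K - 453 348 - x1 - 2.0 - Win 10 Pro (64bit)"))
```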

You can verify your actual running speed and PCIe version with GPU-Z from TechPowerUp: select your GPU in the drop-down list at the bottom, then hover the mouse pointer over the “Bus Interface” box.

https://www.techpowerup.com/download/techpowerup-gpu-z/

This can also be useful to do, if you just want to verify your own configuration, since it might be that you think you are running x16, but have accidentally put your GPU in a physical x16 slot that only has x4/x8 connected to it.


Edit:
- Added OS
- Added Atoms and actual test data
- GPU-Z
Last edited by Sparkly on Tue Jun 23, 2020 5:37 pm, edited 3 times in total.
JimboPalmer
Posts: 2573
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by JimboPalmer »

Operating System will be important, too. Windows has the reputation of needing more PCI-E lanes.
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by MeeLee »

I've already replied multiple times to this question.
It's up to you to find and accept those settings, or just go against it.
In most cases:
PCIE 1.0 x4 / 2.0 x2 / 3.0 x1 is not recommended for folding. You could fold on it, but will experience serious PPD penalties on budget to mid range GPUs.
PCIE 1.1 x16 / 2.0 x8 / 3.0 x4, good enough for GTX 1600 series GPUs under Linux (an RTX 2060 gets 975k PPD in Linux, which is slower than a Core 21 WU), or up to a GTX 1060 in Windows.
PCIE 2.0 x16 / 3.0 x8 is good enough for up to a 2080Ti under both Linux and Windows. Estimated upcoming 5000+ shader/core gpus might be limited in Windows, but should fold fine in Linux.
PCIE 3.0 x16, there's currently no GPU that would exceed this PCIE bandwidth, but if there were, it would be a GPU with 8000 cores or more.
markdotgooley
Posts: 101
Joined: Tue Apr 21, 2020 11:46 am

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by markdotgooley »

MeeLee wrote:I've already replied multiple times to this question.
It's up to you to find and accept those settings, or just go against it.
In most cases:
PCIE 1.0 x4 / 2.0 x2 / 3.0 x1 is not recommended for folding. You could fold on it, but will experience serious PPD penalties on budget to mid range GPUs.
PCIE 1.1 x16 / 2.0 x8 / 3.0 x4, good enough for GTX 1600 series GPUs under Linux (an RTX 2060 gets 975k PPD in Linux, which is slower than a Core 21 WU), or up to a GTX 1060 in Windows.
PCIE 2.0 x16 / 3.0 x8 is good enough for up to a 2080Ti under both Linux and Windows. Estimated upcoming 5000+ shader/core gpus might be limited in Windows, but should fold fine in Linux.
PCIE 3.0 x16, there's currently no GPU that would exceed this PCIE bandwidth, but if there were, it would be a GPU with 8000 cores or more.
So I probably don't need to buy an old dual-Xeon motherboard complete with Xeons and heatsinks and some memory (for under US$200; at least it would be cheap) just because it has four PCIe 3.0 x16 slots...
Sparkly
Posts: 73
Joined: Sun Apr 19, 2020 11:01 am

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by Sparkly »

MeeLee wrote:I've already replied multiple times to this question.
The point is to see the actual difference in real-life use on FAH, as opposed to theoretical values based on shader count, possible GPU bandwidth or whatever. There will clearly be differences between the WU sizes being handled on different GPUs, and the result most likely also depends on how the FAH client communicates with the GPU via drivers and on which implementation of Gromacs, Tinker, Amber or whatnot is being used. You might end up showing a 10% average PPD penalty between PCIe 3.0 x16 and PCIe 2.0 x1, if any, and real-life numbers would show this.
HaloJones
Posts: 920
Joined: Thu Jul 24, 2008 10:16 am

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by HaloJones »

I don't know that anyone has a controlled environment in which to test this across multiple units and multiple cards
single 1070

Sparkly
Posts: 73
Joined: Sun Apr 19, 2020 11:01 am

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by Sparkly »

HaloJones wrote:I don't know that anyone has a controlled environment in which to test this across multiple units and multiple cards
Multiple people doing multiple speed change tests with different systems is fine, since you can see the difference within each system.
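
One way to compare results across different systems is to express each run as a fraction of the same system's widest-link run. The sketch below is only an illustration of that idea: the function name is hypothetical and the numbers are made up, not measurements from this thread. Since PPD also varies by project, the comparison is most meaningful when the runs on one system come from the same project.

```python
from collections import defaultdict


def relative_ppd(results):
    """Express each run as a fraction of the widest-link run on the same system.

    results: iterable of (system, lanes, ppd) tuples.
    Returns {system: {lanes: ppd / ppd_at_widest_link}}.
    """
    by_system = defaultdict(dict)
    for system, lanes, ppd in results:
        by_system[system][lanes] = ppd

    relative = {}
    for system, runs in by_system.items():
        baseline = runs[max(runs)]  # PPD at the widest link tested on this system
        relative[system] = {lanes: ppd / baseline for lanes, ppd in runs.items()}
    return relative


# Made-up numbers for two hypothetical systems, for illustration only.
example = [
    ("system_a", 16, 1_000_000),
    ("system_a", 4, 900_000),
    ("system_b", 16, 500_000),
    ("system_b", 1, 450_000),
]
for system, runs in relative_ppd(example).items():
    print(system, runs)
```

Both hypothetical systems come out at 90% of their own baseline even though their absolute PPD differs by a factor of two, which is the kind of within-system comparison described above.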
HugoNotte
Posts: 70
Joined: Tue Apr 07, 2020 7:09 pm

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by HugoNotte »

MeeLee wrote:I've already replied multiple times to this question.
It's up to you to find and accept those settings, or just go against it.
In most cases:
PCIE 1.0 x4 / 2.0 x2 / 3.0 x1 is not recommended for folding. You could fold on it, but will experience serious PPD penalties on budget to mid range GPUs.
PCIE 1.1 x16 / 2.0 x8 / 3.0 x4, good enough for GTX 1600 series GPUs under Linux (an RTX 2060 gets 975k PPD in Linux, which is slower than a Core 21 WU), or up to a GTX 1060 in Windows.
PCIE 2.0 x16 / 3.0 x8 is good enough for up to a 2080Ti under both Linux and Windows. Estimated upcoming 5000+ shader/core gpus might be limited in Windows, but should fold fine in Linux.
PCIE 3.0 x16, there's currently no GPU that would exceed this PCIE bandwidth, but if there were, it would be a GPU with 8000 cores or more.
Thank you for the summary!
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by MeeLee »

Sparkly wrote:
MeeLee wrote:I've already replied multiple times to this question.
The point is to see the actual difference in real-life use on FAH, as opposed to theoretical values based on shader count, possible GPU bandwidth or whatever. There will clearly be differences between the WU sizes being handled on different GPUs, and the result most likely also depends on how the FAH client communicates with the GPU via drivers and on which implementation of Gromacs, Tinker, Amber or whatnot is being used. You might end up showing a 10% average PPD penalty between PCIe 3.0 x16 and PCIe 2.0 x1, if any, and real-life numbers would show this.
PCIE bandwidth is directly related to core count, not atom count.
If the GPU has few cores, it will process the data much slower, allowing more time for data to pass through the PCIE interface.

The only time when atom count makes a difference is if there are fewer atoms than the maximum the GPU can process.
Then PCIE bandwidth will become 'less restricted'.
But we're really looking at the maximum GPU you should be running on a set PCIE bandwidth.


@Mark, you buy what you like.
A $200 Xeon system most likely won't be the most efficient system to fold on (as the most efficient is GPU based), and you could shave a few dollars off your electric bill by running 2 budget PCs with 2x PCIE 3.0 x8 slots. They might consume less energy than Xeons.
But yea, if it's a $200 system, it's tough to even find a budget system (Celeron, or Pentium system) for that price.
foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by foldy »

I tested on Windows with gtx 1080ti on pcie 2.0 x8 = pcie 3.0 x4 and I saw 10% PPD loss compared to pcie 2.0 x16 = pcie 3.0 x8.
I even tried pcie 2.0 x4 = pcie 3.0 x2 and had 50% PPD loss with gtx 1080 ti on Windows.

On Linux I can run gtx 1080ti on pcie 3.0 x1 with only 10% PPD loss compared to pcie 3.0 x16

So my rule of thumb is Windows minimum pcie 3.0 x4 and Linux minimum pcie 3.0 x1. But if we get faster GPUs like RTX 2080ti or next gen by end of the year then these numbers may increase again...
Sparkly
Posts: 73
Joined: Sun Apr 19, 2020 11:01 am

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by Sparkly »

MeeLee wrote:PCIE bandwidth is directly related to core count, not atom count.
If the GPU has few cores, it will process the data much slower, allowing more time for data to pass through the PCIE interface.

The only time when atom count makes a difference is if there are fewer atoms than the maximum the GPU can process.
Then PCIE bandwidth will become 'less restricted'.
But we're really looking at the maximum GPU you should be running on a set PCIE bandwidth.
The PCIe bandwidth is only used when the CPU is communicating with the GPU. The whole point of sending work to the GPU is to avoid using the CPU, so after receiving the work the GPU carries on internally without using the PCIe bandwidth at all. As a result, the CPU/GPU link is accessed less frequently at a higher atom count than at a lower one, since the GPU takes longer to calculate the larger request before sending a reply over the available PCIe bandwidth.

This could be compared to a CPU asking the GPU to calculate Pi to 10 digits and return the result, as opposed to asking for 100 digits. The request going over the PCIe link is about the same size in both cases, and so is the result, since we are talking about a difference of a few bytes. But the 100-digit request takes longer to calculate, so the PCIe link sits idle until the answer comes back.

The general size of the downloaded WU for GPU is around 100kB per 1000 atoms. The upload size will vary depending on what is calculated, but let’s say it is 50MB on average. That puts a limit on how much PCIe bandwidth this can actually use over the often several hours it takes to produce the upload file.

PCIe 1.0 x1 is around 250MB per second
PCIe 2.0 x1 is around 500MB per second
PCIe 3.0 x1 is around 1000MB per second
PCIe 3.0 x16 is around 15750MB per second
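
The figures above follow from the per-lane transfer rate and the line encoding of each PCIe generation. The sketch below reproduces them under those assumptions; the function name is hypothetical, and the results are theoretical per-direction rates before any protocol overhead, not measured throughput.

```python
# (transfer rate in GT/s, encoding efficiency) per PCIe generation
PCIE_GENERATIONS = {
    "1.0": (2.5, 8 / 10),     # 8b/10b encoding
    "2.0": (5.0, 8 / 10),     # 8b/10b encoding
    "3.0": (8.0, 128 / 130),  # 128b/130b encoding
}


def bandwidth_mb_per_s(version: str, lanes: int) -> float:
    """Theoretical one-direction bandwidth in MB/s for a PCIe version and lane count."""
    gt_per_s, efficiency = PCIE_GENERATIONS[version]
    # GT/s -> usable Gbit/s -> MB/s (8 bits per byte, 1000 MB per GB)
    return gt_per_s * efficiency * 1000 / 8 * lanes


for version, lanes in [("1.0", 1), ("2.0", 1), ("3.0", 1), ("3.0", 16)]:
    print(f"PCIe {version} x{lanes}: {bandwidth_mb_per_s(version, lanes):.0f} MB/s")
```

The same function shows why PCIe 2.0 x8 and PCIe 3.0 x4 are treated as roughly equivalent in this thread: they come out at about 4000 and 3940 MB/s respectively.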

If you double the PCIe bandwidth, that doesn’t mean you will cut your WU calculation time in half. Most of the time isn’t spent on the PCIe bus; it is spent on the GPU and in the CPU and RAM, so that is where your bottlenecks will be, whether it is the CPU model and frequency, single/dual channel RAM and frequency, OS drivers or whatnot. The 50MB upload file is generated over several hours, so it is not really flooding the PCIe bus.
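
The arithmetic behind this argument can be sketched as follows. The 50MB figure and the 500MB per second for PCIe 2.0 x1 come from the post; taking "several hours" as 3 hours is an assumption made only for illustration, and the function name is hypothetical. The sketch only averages the result file over the run time and does not measure any traffic between the core and the GPU during the calculation itself.

```python
def average_rate_mb_per_s(file_mb: float, hours: float) -> float:
    """Average data rate if file_mb were produced evenly over the given hours."""
    return file_mb / (hours * 3600)


UPLOAD_MB = 50            # average upload size suggested in the post
HOURS = 3                 # assumption: "several hours" taken as 3
PCIE_2_X1_MB_PER_S = 500  # PCIe 2.0 x1 figure from the list above

rate = average_rate_mb_per_s(UPLOAD_MB, HOURS)
share = rate / PCIE_2_X1_MB_PER_S
print(f"{rate:.4f} MB/s on average, {share:.6%} of a PCIe 2.0 x1 link")
```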

And while I was writing this, foldy basically proved the point with his/her tests, showing that other factors are generally more important than the hardware PCIe speed, namely which OS is used:
foldy wrote:I tested on Windows with gtx 1080ti on pcie 2.0 x8 = pcie 3.0 x4 and I saw 10% PPD loss compared to pcie 2.0 x16 = pcie 3.0 x8.
I even tried pcie 2.0 x4 = pcie 3.0 x2 and had 50% PPD loss with gtx 1080 ti on Windows.

On Linux I can run gtx 1080ti on pcie 3.0 x1 with only 10% PPD loss compared to pcie 3.0 x16

So my rule of thumb is Windows minimum pcie 3.0 x4 and Linux minimum pcie 3.0 x1. But if we get faster GPUs like RTX 2080ti or next gen by end of the year then these numbers may increase again...
markdotgooley
Posts: 101
Joined: Tue Apr 21, 2020 11:46 am

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by markdotgooley »

MeeLee wrote:@Mark, you buy what you like.
A $200 Xeon system most likely won't be the most efficient system to fold on (as the most efficient is GPU based), and you could shave a few dollars off your electric bill by running 2 budget PCs with 2x PCIE 3.0 x8 slots. They might consume less energy than Xeons.
But yea, if it's a $200 system, it's tough to even find a budget system (Celeron, or Pentium system) for that price.
I was thinking more of four relatively fast GPUs (RTX 2060 or faster) plugged into an old motherboard with four PCIe 3.0 x16 slots, with the Xeons tending the GPUs and nothing else.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by bruce »

Although all four slots accommodate devices with an x16 length, it's unlikely that all four operate at that speed. Get GPU-Z and check the speeds.
Sparkly
Posts: 73
Joined: Sun Apr 19, 2020 11:01 am

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by Sparkly »

bruce wrote:Although all four slots accommodate devices with an x16 length, it's unlikely that all four operate at that speed. Get GPU-Z and check the speeds.
The Xeon generally has more of everything than its consumer counterpart. Every time datacenters upgrade parts of their hardware, the secondary market gets flooded with cheap Xeon gear via recycling brokers, so you can easily get a motherboard with dual Xeon E5-2670, heat sinks and some RAM for around $200.

The E5-2670 has 40 PCIe lanes, so with 2 of them on a board you get 80.
https://ark.intel.com/content/www/us/en ... l-qpi.html

And since the Xeon is server grade, the motherboard supporting it will most likely also have all 16 PCIe lanes connected to each x16 slot. On a lot of consumer-grade motherboards you will instead often find an x16/x8 configuration, meaning a slot that is physically x16 in size but has only 8 PCIe lanes connected to it.
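
The lane budget in this post can be checked with a few lines of arithmetic. The sketch below uses hypothetical function names and makes one simplifying assumption: it ignores lanes that a real board spends on other onboard devices. Whether a particular board actually wires every slot with 16 lanes is up to the board design, so GPU-Z remains the way to confirm it.

```python
def lanes_available(lanes_per_cpu: int, cpu_count: int) -> int:
    """Total CPU-provided PCIe lanes on a multi-socket board."""
    return lanes_per_cpu * cpu_count


def lanes_requested(slot_widths) -> int:
    """Lanes needed to run every slot at its full electrical width."""
    return sum(slot_widths)


cpu_lanes = lanes_available(lanes_per_cpu=40, cpu_count=2)  # dual E5-2670
slot_lanes = lanes_requested([16, 16, 16, 16])              # four full x16 slots
print(cpu_lanes, slot_lanes, slot_lanes <= cpu_lanes)       # 80 64 True
```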
MeeLee
Posts: 1375
Joined: Tue Feb 19, 2019 10:16 pm

Re: PCIe Slot Speed - x1 x2 x4 x8 x16

Post by MeeLee »

foldy wrote:I tested on Windows with gtx 1080ti on pcie 2.0 x8 = pcie 3.0 x4 and I saw 10% PPD loss compared to pcie 2.0 x16 = pcie 3.0 x8.
I even tried pcie 2.0 x4 = pcie 3.0 x2 and had 50% PPD loss with gtx 1080 ti on Windows.

On Linux I can run gtx 1080ti on pcie 3.0 x1 with only 10% PPD loss compared to pcie 3.0 x16

So my rule of thumb is Windows minimum pcie 3.0 x4 and Linux minimum pcie 3.0 x1. But if we get faster GPUs like RTX 2080ti or next gen by end of the year then these numbers may increase again...
Odd that you get such good results from a 1080Ti (which is slightly faster than an RTX 2060) on a Linux PCIe 3.0 x1 slot.
My 2060 had 975k PPD on Core 22, vs 1.04M PPD on Core 21, in Linux, on a PCIE 3.0 x1 slot.
On an x4 slot, it would get around 1.2-1.25M PPD on Core 22.
Despite my GPU being slower than yours, it still saw a near 20% drop in performance.

Not sure why my numbers are so much different than yours.
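
For reference, the drop quoted above can be worked out directly from the posted PPD figures. This is just the arithmetic as a small sketch with a hypothetical function name; it works out to roughly 19% to 22% between the x1 and x4 slot on Core 22.

```python
def ppd_drop(slow_ppd: float, fast_ppd: float) -> float:
    """Fractional PPD loss of the slower configuration relative to the faster one."""
    return 1 - slow_ppd / fast_ppd


X1_CORE22 = 975_000        # RTX 2060, Linux, PCIe 3.0 x1, Core 22
X4_CORE22_LOW = 1_200_000  # same GPU on an x4 slot, lower estimate
X4_CORE22_HIGH = 1_250_000 # same GPU on an x4 slot, upper estimate

low = ppd_drop(X1_CORE22, X4_CORE22_LOW)
high = ppd_drop(X1_CORE22, X4_CORE22_HIGH)
print(f"x1 vs x4 on Core 22: {low:.2%} to {high:.2%} lower PPD")
```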