One WU, 2 GPU

Moderators: Site Moderators, FAHC Science Team

Post Reply
Ricky
Posts: 483
Joined: Sat Aug 01, 2015 1:34 am
Hardware configuration: 1. 2 each E5-2630 V3 processors, 64 GB RAM, GTX980SC GPU, and GTX980 GPU running on the Windows 8.1 operating system.
2. i7-6950X processor, 32 GB RAM, 1 GTX980tiFTW, and 2 each GTX1080FTW GPUs running on the Windows 8.1 operating system.
Location: New Mexico

One WU, 2 GPU

Post by Ricky »

As I understand it, using as many CPU cores as possible for a WU is best for the science, rather than breaking the CPU into many slots. Has anyone thought about using more than one GPU for a WU? Perhaps sync problems would prevent this.
Joe_H
Site Admin
Posts: 7870
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: One WU, 2 GPU

Post by Joe_H »

Yes, it has been thought about. Syncing was the problem, as well as data transfer speeds between GPUs. Perhaps someday these issues will be resolved, but it is not something I would expect anytime in the near future.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
JimboPalmer
Posts: 2573
Joined: Mon Feb 16, 2009 4:12 am
Location: Greenwood MS USA

Re: One WU, 2 GPU

Post by JimboPalmer »

In the CPU case, memory used by other CPU cores is as easy to access as memory used by this CPU. For most GPUs, the other card's memory is very distant: the memory on your own card is right there, but reaching the other GPU requires pushing data over the slow PCI-E bus to main memory, then pushing it over the PCI-E bus again to the memory of the other GPU. This can easily be 1000 times slower than in the multiple-CPU example.
A GTX Titan Z can access its own memory at 672 gigabytes a second: http://www.geforce.com/hardware/desktop ... ifications

PCI-E 1.0 is theoretically 250 megabytes a second per lane (and in practice can approach 80% of that); PCI-E 2.0 doubles that to 500 megabytes per second per lane (still 20% overhead); and PCI-E 3.0 can do 980 megabytes per second per lane (overhead drops to about 1.5%). Even with the latest, greatest version, a 16-lane slot is theoretically limited to about 15 GB/sec, compared to 672 GB/sec on the card itself. http://www.techpowerup.com/reviews/Inte ... ing/1.html
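A rough sanity check on those numbers (a sketch using only the figures quoted in this post, not measured values):

```python
# Back-of-envelope comparison: on-card memory bandwidth vs. a
# GPU-to-GPU copy routed over PCI-E, using the figures quoted above.

ONCARD_GBPS = 672.0        # GTX Titan Z on-card bandwidth, GB/s (quoted above)
PCIE3_LANE_MBPS = 980.0    # PCI-E 3.0 per-lane throughput, MB/s (quoted above)
LANES = 16                 # a typical x16 GPU slot

# One-way bus throughput for a full x16 slot.
pcie_gbps = PCIE3_LANE_MBPS * LANES / 1000.0

# A GPU-to-GPU copy crosses the bus twice (GPU -> host memory -> GPU),
# halving the effective rate.
gpu_to_gpu_gbps = pcie_gbps / 2.0

print(f"PCI-E 3.0 x16, one way:  {pcie_gbps:.1f} GB/s")
print(f"GPU-to-GPU via host:     {gpu_to_gpu_gbps:.1f} GB/s")
print(f"On-card memory is about {ONCARD_GBPS / gpu_to_gpu_gbps:.0f}x faster")
```

Even before counting latency, the bandwidth gap alone shows why splitting one WU across two cards stalls on data movement.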

Latency is even worse than bandwidth: we have to pass through the PCI-E bus twice and coordinate the CPU with both GPUs. (Many GPU WUs spend several minutes of CPU time passing data through the PCI-E bus before the GPU becomes busy.)
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
Post Reply