Hardware Accelerated GPU Scheduling (Win 10)

Postby ajm » Sat Jul 04, 2020 12:50 pm

This new feature of Windows 10 2004 is described on the MS devblog: https://devblogs.microsoft.com/directx/ ... cheduling/

It may bring a GPU performance boost down the road, but for now the reports mostly show performance snags, if anything: https://youtu.be/wlrWDb1pKXg

It would probably be wise to keep an eye on it, as it may affect FAH as well.

I just enabled it on one machine (2070S with latest drivers). It was smooth, no problem. I don't see any effect yet.

EDIT: That's how it looks when enabled with FAH on High performance:

Image

EDIT2: A new WU [13414 (5128, 33, 1)] is now stable (over 10%) at PPD 1997353, which is really high for a 2070S limited to 75% power by Afterburner.

EDIT3: Next WU is smaller [16441 (477, 1, 122)] but still delivers PPD 1928293. I'm going to try that on another machine with AMD and Nvidia GPUs...

EDIT4: Done (1080 ti and 5700XT). No change yet, like the first time. But the overall gain of some 250-300K PPD on the 2070S has persisted for hours now. The new Hardware Accelerated GPU Scheduling seems to confuse Adrenalin's power tuning, which is stuck at 345W, whereas GPU-Z sees around 100W.

EDIT5: An hour later, the 1080ti has started a new WU, but without any performance gain. The 5700XT is still finishing its "old" WU. But I'm wondering whether FahCore_22 needs to be stopped entirely in order to pick up the new setting. That's what was necessary for the 2070S. In order to check that, I'll have to finish both GPU WUs and restart FAH. That will take some two hours from now.
The 2070S is now crunching 13416 (381, 21, 0) at PPD 1989081.

EDIT6: The 2070S at 75% power is now delivering 2M+ (2018834) with 13414 (6310, 34, 1). But the 1080ti and the 5700XT (both also power-limited) are "only" at 1604441 and 1154245, respectively, which is excellent, but not as much of an improvement over their usual numbers as for the 2070S. Could be a coincidence so far. We'll see tomorrow.

EDIT7: This morning, the 1080ti/5700XT machine was a complete mess! The AMD card was in a failed state and unable to fold anything, the machine was lagging terribly while doing almost nothing, and the 1080ti was struggling. I could save only the current CPU job... Back on track now. The 2070S is folding smaller WUs (11752s) at 1.6M.

EDIT8: A couple hours later, it appears that the good results of last night were just a coincidence: all PPDs are now stable at the previous level.
Last edited by ajm on Sun Jul 05, 2020 7:10 am, edited 8 times in total.
ajm
 
Posts: 495
Joined: Sat Mar 21, 2020 6:22 am
Location: Lucerne, Switzerland

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby JimboPalmer » Sat Jul 04, 2020 4:20 pm

Using Nvidia, this should work with Pascal and later generation GPUs. (Volta, Turing, Ampere)

In the GPU-Z sensors, the Bus Interface Load seemed lower, but I did not do a 'before and after' comparison. If you try this, look at how it affects the GPU-Z sensors.

I set both Core_22 and Core_21 to High Performance.
Last edited by JimboPalmer on Sat Jul 04, 2020 6:18 pm, edited 1 time in total.
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
JimboPalmer
 
Posts: 1954
Joined: Mon Feb 16, 2009 5:12 am
Location: Greenwood MS USA

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby MeeLee » Sat Jul 04, 2020 5:15 pm

I read the main advantage is gaming.
Anand did some tests and only got a few percent at most (2-3fps out of 60fps) gain.
Though no one seems to know what exactly happens, and what about the GPU is accelerated. And it seems to only be supported by Nvidia GPUs.
More than likely it's a more direct interface between the CPU PCIE lanes, and the GPU, as the GPU drivers remain the same.
Perhaps a better way of locking the GPU to a single core, rather than shuffling it around, or perhaps shuffling it around more?
Nvidia drivers are supposed to lock the GPU to a single thread on the CPU, but that doesn't happen.
It stays longer on a core, but it still jumps around (without the feature enabled).

Since MS is heavily investing in Linux lately, I presume some optimizations may have come from Linux!
MeeLee
 
Posts: 924
Joined: Tue Feb 19, 2019 11:16 pm

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby ajm » Sat Jul 04, 2020 5:34 pm

JimboPalmer wrote:In the GPU-Z sensors, the Bus Interface Load seemed lower, but I did not do a 'before and after' comparison. If you try this, look at how it affects the GPU-Z sensors.


Well, I haven't, too bad. But others may. I just checked whether the frequencies and the temps were the same. And they are.
ajm
 
Posts: 495
Joined: Sat Mar 21, 2020 6:22 am
Location: Lucerne, Switzerland

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby JimboPalmer » Sat Jul 04, 2020 6:15 pm

MeeLee wrote:I read the main advantage is gaming.

F@H is not a mainstream app, there is no hint it was tested before and after.
MeeLee wrote:And it seems to only be supported by Nvidia GPUs.

Pascal and later for Nvidia, Navi and later for AMD.
MeeLee wrote:the GPU drivers remain the same.

No, the latest drivers are needed on both platforms to take advantage of Hardware Accelerated GPU Scheduling: 451.48 on Nvidia, 20.5.1 Beta on AMD.
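The requirements quoted above can be turned into a quick check. This is only a sketch: the function and constant names are made up for illustration, the driver minimums are the ones given in this post, and build 19041 is the build number of Windows 10 version 2004. It does not check the GPU generation (Pascal or Navi and later), which you still have to verify yourself.

```python
# Minimum versions as quoted in this thread; all names here are illustrative.
MIN_WINDOWS_BUILD = 19041  # Windows 10 version 2004
MIN_DRIVER = {
    "nvidia": (451, 48),   # 451.48
    "amd": (20, 5, 1),     # Adrenalin 20.5.1 Beta
}


def parse_version(text):
    """Turn '451.48' into (451, 48) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_hags_requirements(windows_build, vendor, driver_version):
    """Return True if the OS build and driver meet the thread's minimums."""
    minimum = MIN_DRIVER.get(vendor.lower())
    if minimum is None:
        return False  # vendor not mentioned in the thread
    return (windows_build >= MIN_WINDOWS_BUILD
            and parse_version(driver_version) >= minimum)


print(meets_hags_requirements(19041, "Nvidia", "451.48"))  # True
print(meets_hags_requirements(18362, "Nvidia", "451.48"))  # False (Win 10 1903)
print(meets_hags_requirements(19041, "AMD", "20.4.2"))     # False (driver too old)
```

Comparing versions as tuples of integers avoids the trap of comparing them as strings, where "20.10.1" would sort before "20.5.1".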
JimboPalmer
 
Posts: 1954
Joined: Mon Feb 16, 2009 5:12 am
Location: Greenwood MS USA

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby ajm » Sun Jul 05, 2020 10:36 am

Side note: with Win 10 2004, the Task Manager now detects FAH's activity under "3D", but seemingly only for Nvidia cards (AMD visible under Compute 1). And there's no CUDA anymore.
Weird: the graph of Copy stays at zero for the 2070S, but is showing quite intense activity for the 1080ti, often reaching 100%.

Image
ajm
 
Posts: 495
Joined: Sat Mar 21, 2020 6:22 am
Location: Lucerne, Switzerland

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby bruce » Mon Jul 06, 2020 7:52 pm

Microsoft has not done a good job of recognizing that GPUs are legitimate computing devices. When Stream Computing was in its infancy, their developers were convinced that the only reason anybody would have a GPU is to display the desktop or play a game. They still don't do a good job of accounting for the activity.
bruce
 
Posts: 19653
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby aetch » Mon Jul 06, 2020 9:51 pm

I've been having a play about with it as well.
I can't say if it's any better or not. I had a couple of units for 13416 go through a pair of my systems and they fluctuated in speed quite a lot throughout the run (one had GPU scheduling enabled, the other was still on Win 10 1903). I also saw that without GPU scheduling my Copy graph was steady in the 30-45% range, while with GPU scheduling it was all over the map.
I've found that it's not always the first "copy" that sees activity.
I've also had to restart task manager a few times to see that activity.
It's early days, there's a few bugs to iron out.
aetch
 
Posts: 32
Joined: Thu Jun 25, 2020 4:04 pm

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby Breach » Mon Jul 06, 2020 10:10 pm

The CUDA option is there in Task Manager provided that you have Hardware-accelerated GPU scheduling set to off. Go figure.
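If you want to confirm which way the switch is currently set without opening the Settings app, something like the sketch below may help. Big caveat: the registry location and the meaning of the values (HwSchMode under GraphicsDrivers, 2 = on, 1 = off) are what is commonly reported, not something confirmed in this thread, so treat them as assumptions. The function names are made up for illustration.

```python
import sys

# Assumption: commonly reported location of the setting, not confirmed here.
KEY_PATH = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
VALUE_NAME = "HwSchMode"


def describe_hwsch_mode(value):
    """Map the raw value to a label (assumed: 2 = on, 1 = off)."""
    if value == 2:
        return "enabled"
    if value == 1:
        return "disabled"
    return "unknown"


def read_hwsch_mode():
    """Return the raw HwSchMode value, or None if it cannot be read."""
    if sys.platform != "win32":
        return None  # the registry only exists on Windows
    import winreg
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except OSError:
        return None  # key or value missing, or access denied


print(describe_hwsch_mode(read_hwsch_mode()))
```

The script only reads the value. Changing the setting still requires a restart, as noted elsewhere in this thread.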

Secondly, I am currently observing a performance loss if that's turned on (nVidia 1070, 2 GHz clocks):

PRCG 16913, 15,9,2

eTPF: 3 mins 37 secs
ePPD: 532,245

Turned it off, and I'm back to 750k PPD after 3-4 frames (I normally get 800k).

Edit: Confirmed with a second WU. Methodology:

Started with a new WU, PRCG: 16911 (4, 8, 5)

Off
eTPF 2 mins 13 secs
ePPD 908582

Switched on GPU scheduling, restarted (OK, some minor PPD loss during the restart), waited 20 mins for a few frames to process:

On
eTPF 2 mins 50 secs
ePPD 675245

So, maybe it's Pascal-specific, or because I may be using a beta core, or because of project-specific WUs. But I'm keeping GPU scheduling off for the time being. I'll report this on nVidia's forums.
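For anyone who wants to put a number on the Off/On comparison above, here is a small sketch using the eTPF and ePPD figures from the second WU (16911). The helper names are made up for illustration.

```python
def tpf_seconds(minutes, seconds):
    """Convert a 'M mins S secs' time per frame into seconds."""
    return minutes * 60 + seconds


def percent_change(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


# Figures reported for PRCG 16911 (4, 8, 5)
ppd_off, ppd_on = 908582, 675245
tpf_off = tpf_seconds(2, 13)  # 133 s with GPU scheduling off
tpf_on = tpf_seconds(2, 50)   # 170 s with GPU scheduling on

print(f"PPD change: {percent_change(ppd_off, ppd_on):+.1f}%")  # -25.7%
print(f"TPF change: {percent_change(tpf_off, tpf_on):+.1f}%")  # +27.8%
```

So on this one WU the frames took roughly 28% longer and the estimated PPD dropped by roughly a quarter. It is a single measurement on a single card, so the usual caution applies.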
Windows 10 x64 / i6700k @4.6Ghz / ASUS Sabertooth Z170 / 16GB DDR4 2400 CL10 / MSI 1070 @2000MHz / Creative Titanium HD / Tube amp, Sennheiser 650 / PSU Corsair AX1200i / Samsung 840 Pro, OCZ Vertex 3 SSD, HDDs in RAID
Breach
 
Posts: 199
Joined: Sat Mar 09, 2013 9:07 pm
Location: Brussels, Belgium

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby JimboPalmer » Thu Jul 09, 2020 1:51 am

I have been running hardware scheduling for 5 days on 3 GPUs, and I am seeing a slight decline in points.

While it functions fine, it does not seem to be a desirable feature for F@H
JimboPalmer
 
Posts: 1954
Joined: Mon Feb 16, 2009 5:12 am
Location: Greenwood MS USA

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby bruce » Thu Jul 09, 2020 2:39 am

OK, from reading what you guys wrote, we simply don't understand what this provides for games or for FAH, if anything. I can't offer any more information than you already have EXCEPT the words "Pascal or better" did catch my eye. I do remember reading a Pascal tuning guide. (If anybody is familiar with the Navi white-papers, maybe they talk about a similar feature.)
JimboPalmer wrote:
MeeLee wrote:And it seems to only be supported by Nvidia GPUs.

Pascal and later for Nvidia, Navi and later for AMD.


One of the GPU internal functions is performed by a "warp scheduler" which initiates internal GPU processes. Let's start with an ASSUMPTION that you can increase your game's frame rate by telling the warp scheduler to concentrate on finishing the processing of a screen update rather than delaying it behind a pending FAH kernel. Wouldn't you want to do that? (Imagine you're a GAMER rather than an admirer of FAH.) This may improve the performance of GPU memory access but since almost none of FAH's performance depends on GPU memory, we don't really care.

Second, there's a new feature beginning with Pascal called Compute Preemption. Compute Preemption allows compute tasks running on the GPU to be interrupted at instruction-level granularity. The execution context (registers, shared memory, etc.) is swapped to GPU DRAM so that another application can be swapped in and run. Compute preemption offers an advantage for developers:

> Long-running kernels no longer need to be broken up into small timeslices to avoid an unresponsive graphical user interface or kernel timeouts when a GPU is used simultaneously for compute and graphics.

I think that's probably what we're talking about. I expect that FAH would be guilty of dispatching long-running kernels but OpenMM would already have broken its work up into small timeslices to minimize screen lag on pre-Pascal GPUs. Maybe this new feature would minimize screen lag on Pascal, but I don't think anybody with Pascal-or-better GPUs ever reported a screen-lag problem.

Again, in an ideal world, maybe it's a nice feature but not useful for FAH.

This involves a lot of speculation on my part, so I could easily be wrong. Feel free to comment.
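To make the timeslice argument concrete, here is a toy model in Python. It is not how OpenMM, the driver, or the GPU actually schedules work. It only assumes that, without preemption, a screen update has to wait until the compute kernel (or slice) currently running has finished, and that with ideal preemption it does not wait at all. The function name and the millisecond figures are hypothetical.

```python
def worst_case_wait_ms(kernel_ms, slice_ms=None, preemptible=False):
    """Longest time a screen update could wait behind compute work (toy model)."""
    if preemptible:
        return 0.0  # idealized: compute is interrupted immediately
    if slice_ms is None or slice_ms >= kernel_ms:
        return float(kernel_ms)  # one long kernel blocks until it ends
    return float(slice_ms)  # at most one slice has to finish first


# Hypothetical numbers, chosen only to show the effect
print(worst_case_wait_ms(500))                    # 500.0, one long kernel
print(worst_case_wait_ms(500, slice_ms=10))       # 10.0, work cut into slices
print(worst_case_wait_ms(500, preemptible=True))  # 0.0, ideal preemption
```

In this model, slicing the work already brings the worst-case wait down to one slice, which fits the point above: if the work is already cut into small pieces, preemption has little left to fix for FAH.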
bruce
 
Posts: 19653
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby LazyDev » Tue Jul 14, 2020 10:39 am

I've noticed that the bus interface load has lowered since enabling the option. I'm running a GTX 1070 on PCIe X1.

GPU-z and HWMonitor:
https://prnt.sc/thgajm
Image
LazyDev
 
Posts: 10
Joined: Tue Aug 30, 2016 8:28 pm

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby bruce » Tue Jul 14, 2020 4:21 pm

LazyDev wrote:I've noticed that the bus interface load has lowered ...
So is the WU running slower?
bruce
 
Posts: 19653
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby LazyDev » Mon Jul 20, 2020 8:06 am

bruce wrote:
LazyDev wrote:I've noticed that the bus interface load has lowered ...
So is the WU running slower?

In terms of GPU performance, I didn't observe a difference. However, as the bus load seems to have dropped, it has allowed me to push boundaries and, as such, has opened the opportunity to run higher-performing GPUs on just 1 lane. I need to test this in the future, though.
LazyDev
 
Posts: 10
Joined: Tue Aug 30, 2016 8:28 pm

Re: Hardware Accelerated GPU Scheduling (Win 10)

Postby MeeLee » Mon Jul 20, 2020 9:18 pm

bruce wrote:> Long-running kernels no longer need to be broken up into small timeslices to avoid an unresponsive graphical user interface or kernel timeouts when a GPU is used simultaneously for compute and graphics.

I think that's probably what we're talking about. I expect that FAH would be guilty of dispatching long-running kernels but OpenMM would already have broken its work up into small timeslices to minimize screen lag on pre-Pascal GPUs. Maybe this new feature would minimize screen lag on Pascal, but I don't think anybody with Pascal-or-better GPUs ever reported a screen-lag problem.

Again, in an ideal world, maybe it's a nice feature but not useful for FAH.

This involves a lot of speculation on my part, so I could easily be wrong. Feel free to comment.

There's no screen lag on my GPUs (like you say, GTX with 384 cores or more).
In Boinc there are some situations in which screen lag can occur, but it's usually when tripling/quadrupling small WUs on a single GPU.
MeeLee
 
Posts: 924
Joined: Tue Feb 19, 2019 11:16 pm
