
Re: WU 13416 low ppd long run time

PostPosted: Wed Jul 08, 2020 8:08 pm
by HaloJones
just for some balance

P13416 (R376, C170, G1) running around 35% above average
P13416 (R791, C148, G1) ditto

Re: WU 13416 low ppd long run time

PostPosted: Fri Jul 10, 2020 6:24 pm
by bumbel123
Breach wrote:Same observations, 13416 (1297, 133, 1) for example - PPD is lower than usual by about 25%.

That may be the worst case for a single job/WU and GPU ... I have also seen a drop in average PPD across my bunch of Turing GTX GPUs, since they have mostly been getting 13416 over the last few days. But in longer-term monitoring it is more like 15% overall.

Many of these jobs earn quite normal average credit, but many are lower. It doesn't hurt my setup that much, because I run a bunch of GPUs and CPUs.

From what I can see, my GTX 1660 Ti doesn't really suffer compared to the (s)lower models. The GTX 1650/1650S consistently gets much lower credit, while the 1660/1660S ride a roller coaster in terms of PPD.

Anyone with just one GPU cannot really compensate and suffers a bit ... I (can) compensate with my Ryzens; my R5 3600 in particular does a steady 130,000+ PPD, since the 0xa7 WUs obviously favour clock speed over thread count.

Re: WU 13416 low ppd long run time

PostPosted: Fri Jul 10, 2020 8:38 pm
by kiore
HaloJones wrote:just for some balance

P13416 (R376, C170, G1) running around 35% above average
P13416 (R791, C148, G1) ditto



Agreed, there are significant variations in run length and PPD for these units on my RTX 2080 Ti in Windows. Although many have very low PPD, some are significantly higher; I have observed a range from 1.1 to 3.6 M PPD between different runs.
I understand that this can be frustrating, but these seem to be essential units, so I encourage everyone to bear with this minor blip in the machinery and, if possible, tweak CPU core allowances for the GPU to give them some extra headroom.
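For anyone wanting to try that, CPU thread reservation is normally done in the client's config.xml. A minimal sketch, assuming a 12-thread CPU with one GPU; the slot ids are illustrative, and the cpus value is the knob to experiment with:

```xml
<config>
  <!-- Fold on only 10 of 12 threads, leaving ~2 threads free
       to feed the GPU slot. -->
  <slot id='0' type='CPU'>
    <cpus v='10'/>
  </slot>
  <slot id='1' type='GPU'/>
</config>
```

Lowering the CPU slot's thread count by one or two is usually enough; the client picks up the change on restart.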

Re: WU 13416 low ppd long run time

PostPosted: Sat Jul 11, 2020 8:51 am
by Sparkly
These 13416 units are now taking so much CPU and system resource away from the GPU that the GPU doesn't really do much because of all the waiting, so maybe it is time to take a closer look at the general CPU-to-GPU handling of WUs in the software. (R1166 C7 G2)


Re: WU 13416 low ppd long run time

PostPosted: Sat Jul 11, 2020 11:21 am
by Bastiaan_NL
Sparkly wrote:These 13416 things are now spending so much CPU and System resources away from the GPU that the GPU doesn't really do much, because of all the waiting, so maybe it is time to take a closer look at the general CPU to GPU handling of WUs in software. (R1166 C7 G2)


Are you still running a CPU client?
I paused the CPU client and the load was back on the GPUs, even though the CPU was only at 80%.
So for now I'm running without the CPU client: a 20k PPD sacrifice for at least a tenfold gain.

Re: WU 13416 low ppd long run time

PostPosted: Sat Jul 11, 2020 11:28 am
by Sparkly
Bastiaan_NL wrote:Are you still running a CPU client?

I have no CPU slots on these systems; they are GPU-only and dedicated, with no other activity on them. These WUs just use an insane amount of CPU and system resources to handle the GPU.

Re: WU 13416 low ppd long run time

PostPosted: Sat Jul 11, 2020 3:29 pm
by Joe_H
You keep bringing this up; the projects would not complete on CPU folding within the timeframe necessary to get the results to other researchers.

There are also known issues with the AMD drivers.

Future developments may make it possible to change the assignment mix, but these WUs and their results are needed immediately.

Re: WU 13416 low ppd long run time

PostPosted: Sat Jul 11, 2020 5:13 pm
by Sparkly
Joe_H wrote:You keep bringing this up, the projects would not complete on CPU folding within the timeframe necessary to get the results to other researchers.

Who said anything about running these projects on CPU? What I am saying is that the CPU-to-GPU communication is creating an insane amount of overhead that shouldn't be there in the first place; this is about programming and nothing else.

Re: WU 13416 low ppd long run time

PostPosted: Sat Jul 11, 2020 6:03 pm
by Joe_H
Sparkly wrote:
Joe_H wrote:You keep bringing this up, the projects would not complete on CPU folding within the timeframe necessary to get the results to other researchers.

Who said anything about running these projects on CPU, what I am saying is that the CPU to GPU communication is creating an insane amount of overhead that shouldn’t be there in the first place, this is about programming and nothing else.


This is what you wrote:
Sparkly wrote:...so maybe it is time to take a closer look at the general CPU to GPU handling of WUs in software

On top of all of your other complaints, how else is it to be taken? Programming changes will take time, perhaps a lot of time, but it is only now that you bring up "programming". There is also the apparent assumption on your part that the people programming OpenMM and creating the folding core from that code base can make major changes in how the GPU is used.

All that would be in the future; for now they can just use the data from these projects to improve the configuration of future projects. Maybe they will identify why some seemingly similar runs end up processing so differently. That might show up as an improvement in a future revision of the core or the projects, or not until an entirely new folding core.

Re: WU 13416 low ppd long run time

PostPosted: Sat Jul 11, 2020 7:08 pm
by Sparkly
Joe_H wrote:On top of all of your other complaints, how else is it to be taken?

What I wrote has nothing to do with sending these projects to the CPU, if you bother to read what it says; it talks about handling the CPU-to-GPU communication in the software, something you are correct that I have also commented on elsewhere. Based on hardware comparisons and running the numbers for CPU, GPU and system, it is perfectly clear that the overhead created is not taken into account when handling the WUs on the GPU, since a good part of it comes from activating the PCIe bus etc.

Re: WU 13416 low ppd long run time

PostPosted: Sat Jul 11, 2020 11:28 pm
by astrorob
so not to pile on, but i do feel something is weird with 13416. i've got 5 GPUs across 2 machines. 13416 happens to be running on 2 GPUs on a windows 10 machine and 1 GPU on a linux machine. on the windows 10 machine, the two GPUs in question are a GTX 1060 with 8GB of memory and a RTX 2060 with 6GB of memory. on those two GPUs the TPF is 3m39s and 2m12s respectively.

on the linux box, 13416 is running on an ATI RX5500, and the TPF is 25m30s.

my understanding has always been that the nvidia drivers are subpar on all platforms and don't use DMA to move data between the GPU and CPU, leading to higher CPU utilization for the CPU thread associated with the GPU thread. in contrast i understood the AMD drivers to use DMA and generally be more efficient at data transfer. on the linux box i do see the nvidia CPU thread pegged at 100%, but the ATI cpu thread is at ~70% of a single core, which is higher than i've ever seen.

13416 is taking 2+ days to complete on the ATI GPU. because i have to shut down during the day due to high electricity prices (TOU plan), i can't actually complete 13416 on the RX5500 before it times out. therefore all the power and cost of the video card are being wasted.

hopefully this is a transient situation... if this is happening to me then it is certainly happening to 1000s of other people and lots of compute power is going to waste.

if this turns out to be a permanent situation, is there a mechanism for researchers to blacklist certain types of GPU for a given WU?

and an update - it does seem like 13416 is CPU-bound on the RX5500 - the machine in question had 6 threads (out of 8) running rosetta@home, with 2 threads reserved for the GPUs. i've lowered R@H to 5 threads and now i'm seeing 16m55s TPF on the RX5500.
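The "pegged at 100%" symptom described in this post is the classic signature of a driver thread spin-polling for GPU completion rather than sleeping until an interrupt. A toy Python sketch (nothing FAH- or driver-specific) of why the two waiting styles look so different in a CPU monitor:

```python
import threading
import time

def cpu_seconds(fn):
    """Run fn and return the CPU time it consumed."""
    t0 = time.process_time()
    fn()
    return time.process_time() - t0

def spin_wait(seconds=0.2):
    # Spin-polling: keep asking "is the work done yet?" in a tight loop.
    # This is what a polling driver thread looks like -- ~100% of one core.
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        pass

def blocking_wait(seconds=0.2):
    # Blocking wait: sleep in the kernel until signalled (or timeout).
    # Near-zero CPU use for the same wall-clock wait.
    threading.Event().wait(seconds)

busy_cpu = cpu_seconds(spin_wait)      # roughly 0.2 s of CPU time
idle_cpu = cpu_seconds(blocking_wait)  # close to 0 s of CPU time
```

Both waits take the same 0.2 s of wall-clock time, but only the spinning one shows up as a busy core, which is why reserving a thread per GPU slot helps when the driver polls.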

Re: WU 13416 low ppd long run time

PostPosted: Sun Jul 12, 2020 10:59 am
by HaloJones
P13416 has huge variations so you're not comparing apples to apples.

e.g. two 1070s in the same machine, same PCIe speed, same overclock, same drivers, same OS.
P13416 (R577, C150, G1) - TPF 02:20 for 1307292 ppd
P13416 (R910, C67, G0) - TPF 03:14 for 799240 ppd

Don't get hung up on this. Some P13416 are super quick and give great points, some are the opposite. Overall, it balances out.
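The points gap between those two 1070 examples is roughly what the quick-return bonus predicts: once the bonus applies, PPD scales with TPF to the power -1.5, so a modestly slower frame time costs disproportionately many points. A sketch of the standard F@H bonus formula; the base credit, k-factor and deadline below are made-up illustration values, not P13416's real constants:

```python
import math

def ppd(base_credit, tpf_seconds, deadline_days, k_factor, frames=100):
    """Estimate points per day from time-per-frame using the
    quick-return-bonus formula: points = base * sqrt(k * deadline / wu_time)."""
    wu_days = tpf_seconds * frames / 86400              # days to finish one WU
    bonus = max(1.0, math.sqrt(k_factor * deadline_days / wu_days))
    return base_credit * bonus / wu_days                # points per day

# Hypothetical constants -- NOT the real P13416 values.
fast = ppd(base_credit=30000, tpf_seconds=140, deadline_days=2, k_factor=0.75)  # TPF 2:20
slow = ppd(base_credit=30000, tpf_seconds=194, deadline_days=2, k_factor=0.75)  # TPF 3:14
# fast/slow == (194/140) ** 1.5, i.e. ~1.63x the PPD for ~1.39x the speed
```

With the real per-project constants the absolute numbers change, but the superlinear scaling is the same, which is also why a WU that stalls the GPU (long TPF) loses points twice over.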

Re: WU 13416 low ppd long run time

PostPosted: Sun Jul 12, 2020 12:32 pm
by psaam0001
For now, I am not running any more GPU jobs on my Ryzen 3's integrated GPU. All GPU tasks I get are relegated to the GTX 1650 in the same machine.

Paul

Re: WU 13416 low ppd long run time

PostPosted: Sun Jul 12, 2020 3:14 pm
by astrorob
HaloJones wrote:P13416 has huge variations so you're not comparing apples to apples.

e.g. two 1070s in the same machine, same PCIE speed, same overclock, same drivers, same OS.
P13416 (R577, C150, G1) - TPF 02:20 for 1307292 ppd
P13416 (R910, C67, G0) - TPF 03:14 for 799240 ppd

Don't get hung up on this. Some P13416 are super quick and give great points, some are the opposite. Overall, it balances out.


i get it, but there is evidence that something is wrong with these WUs on AMD GPUs running under linux. i'm getting CORE_RESTART and BAD_WORK_UNIT after running for 10+ hours, with Potential Energy errors.

this is only happening on the AMD GPU in my linux machine.

Re: WU 13416 low ppd long run time

PostPosted: Sun Jul 12, 2020 3:31 pm
by bruce
So when Rosetta is running and the GPU is NOT running, how many CPU threads remain idle?

Yes, the NVIDIA drivers require one or more CPU threads to move data to/from the GPU. AMD's advertising makes an excellent point that their drivers' access to main RAM is a good feature.