INT 8 and 16 for precise calculations?

Moderators: Site Moderators, FAHC Science Team

INT 8 and 16 for precise calculations?

Postby MeeLee » Tue Sep 22, 2020 5:33 pm

Would it in theory be possible to use INT8 and INT16 units (like RT cores and others) to calculate FAH projects with equal precision, either by looping the data 2 to 4 times through that unit, or by using multiple units to perform the duty of a full 32-bit FP (CUDA) core?

And if it is, can it be used to either enable GPUs for certain workloads, or even speed up GPU workloads?
MeeLee
 
Posts: 1103
Joined: Tue Feb 19, 2019 11:16 pm

Re: INT 8 and 16 for precise calculations?

Postby Joe_H » Tue Sep 22, 2020 6:16 pm

In theory you can do all kinds of calculations on integer registers in place of floating point. In practice it takes many more cycles to do what can be done on floating-point registers, and it adds more levels of complexity to the code and debugging, so it is rarely done these days. It's something that might have been done 30+ years ago, when floating point often wasn't supported in hardware.
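To illustrate the kind of thing being described, here is a minimal sketch (my own illustration, not FAH code) of doing fractional math on integers with 16.16 fixed point. Every multiply needs an extra shift, and range and rounding checks fall on the programmer rather than the hardware:

```python
SCALE = 1 << 16  # 16 fractional bits

def to_fixed(x: float) -> int:
    """Convert a float to 16.16 fixed point."""
    return int(round(x * SCALE))

def fixed_mul(a: int, b: int) -> int:
    # The product of two 16.16 values has 32 fractional bits;
    # shift back down and accept the rounding error.
    return (a * b) >> 16

def to_float(x: int) -> float:
    return x / SCALE

a = to_fixed(1.5)
b = to_fixed(2.25)
print(to_float(fixed_mul(a, b)))  # 3.375
```

One hardware FP multiply becomes a multiply plus a shift here, and that is before handling overflow, negative rounding, or anything like division or square roots.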

I see absolutely no benefit for F@h in that approach.

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
Joe_H
Site Admin
 
Posts: 6613
Joined: Tue Apr 21, 2009 5:41 pm
Location: W. MA

Re: INT 8 and 16 for precise calculations?

Postby JimboPalmer » Tue Sep 22, 2020 7:43 pm

FP32 has a 24 bit mantissa, so both INT16 and INT8 are less precise than FP32. If you use 2 INT16 or 3 INT8 registers, and a whole lot of slow code, you could achieve the same precision as is built into the CPU.
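As a sketch of what "use 2 INT16 registers" means in practice (my own illustration, not FAH code): multiplying two 32-bit values using only 16-bit partial products turns one hardware multiply into four multiplies plus shifts and adds, the schoolbook multiword way:

```python
MASK16 = 0xFFFF

def mul32_via_16(a: int, b: int) -> int:
    """Multiply two 32-bit integers using only 16x16-bit products."""
    a_lo, a_hi = a & MASK16, a >> 16
    b_lo, b_hi = b & MASK16, b >> 16
    # Four 16x16 partial products, recombined with shifts and adds.
    return (((a_hi * b_hi) << 32)
            + ((a_hi * b_lo + a_lo * b_hi) << 16)
            + a_lo * b_lo)

x, y = 123456789, 987654321
assert mul32_via_16(x, y) == x * y
```

And that is just the integer multiply; emulating the FP32 exponent handling, normalization, and rounding on top of it is where "a whole lot of slow code" comes in.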

So for a 4 times slower WU, you could write that code. I find that I write more correct code when I take advantage of the computer's built-in abilities (which is why I wrote in PL/SQL for the last decade I was a programmer).

https://en.wikipedia.org/wiki/PL/SQL

You mention using INT code to make up for missing FP capability, which is an issue on old, slow GPUs. Slowing old, slow GPUs down even further does not seem ideal.
Last edited by JimboPalmer on Wed Sep 23, 2020 3:28 am, edited 1 time in total.
Tsar of all the Rushers
I tried to remain childlike, all I achieved was childish.
A friend to those who want no friends
JimboPalmer
 
Posts: 2041
Joined: Mon Feb 16, 2009 5:12 am
Location: Greenwood MS USA

Re: INT 8 and 16 for precise calculations?

Postby MeeLee » Wed Sep 23, 2020 1:12 am

But despite the slowdown, looking at a 3090: it can calculate up to ~36 TFLOPS of FP32, 142 TFLOPS of FP16, and/or 285 tensor TFLOPS.
That's 36 TFLOPS at 32 bit,
potentially ~70 TFLOPS at 16 bit,
and/or the same ~70 TFLOPS at 8 bit.

Not sure if those ray tracing cores can be added to the regular cores.
If optimized, it seems like they could outdo the 32-bit cores!
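A quick back-of-envelope check of those figures (my arithmetic, using the rough 4x emulation overhead mentioned earlier in the thread; the per-op cost is an assumption, not a measurement):

```python
# Quoted peak figures from the post above, in TFLOPS / TOPS.
fp32_native_tflops = 36.0
int16_peak_tops = 70.0

# Rough assumption: ~4 integer instructions per emulated FP32 op.
emulation_overhead = 4

# Effective FP32-equivalent rate when emulating on the INT16 path.
effective = int16_peak_tops / emulation_overhead
print(effective)  # 17.5 -- well below the 36 native FP32 TFLOPS
```

So even if the INT16 path offers roughly double the raw op rate, a 4x instruction overhead would leave it behind the native FP32 path under these assumptions.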
MeeLee
 
Posts: 1103
Joined: Tue Feb 19, 2019 11:16 pm

Re: INT 8 and 16 for precise calculations?

Postby PantherX » Wed Sep 23, 2020 8:50 am

I personally think that rather than "going backwards" we should think "forwards": instead of using those Tensor cores for FP32 processing, could we use them for something new and exciting, like AI, ML, or unique algorithms? We are already using FP32 for folding, so why not see how the "additional" hardware could be used to supplement F@H? I have no idea if F@H can even do those things, but dreams are free :)
ETA:
Now ↞ Very Soon ↔ Soon ↔ Soon-ish ↔ Not Soon ↠ End Of Time

Welcome To The F@H Support Forum Ӂ Troubleshooting Bad WUs Ӂ Troubleshooting Server Connectivity Issues
PantherX
Site Moderator
 
Posts: 6765
Joined: Wed Dec 23, 2009 10:33 am
Location: Land Of The Long White Cloud

Re: INT 8 and 16 for precise calculations?

Postby bruce » Wed Sep 23, 2020 6:22 pm

MeeLee wrote:But despite the slowdown, looking at a 3090: it can calculate up to ~36 TFLOPS of FP32, 142 TFLOPS of FP16, and/or 285 tensor TFLOPS.
That's 36 TFLOPS at 32 bit,
potentially ~70 TFLOPS at 16 bit,
and/or the same ~70 TFLOPS at 8 bit.

Not sure if those ray tracing cores can be added to the regular cores.
If optimized, it seems like they could outdo the 32bit cores!


Yes, in theory it could work, but it would be EXPENSIVE in programming time and debugging time, plus the issues of validating entirely new code. It's easier to wait for the next generation of hardware, which will add hardware that enhances FP performance.

I think it's better to use the tensor cores for problems that will benefit from tensor mathematics. (I'm sure there are FAH scientists already considering such things.) In the meantime, you can temporarily add another GPU that can produce 36 Tflops at 32 bit plus ?? Tflops at 64 bit, and donate it to some needy FAH donor whenever you do your next upgrade.
bruce
 
Posts: 20019
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.

