Windows vs Linux

foldy
Posts: 2061
Joined: Sat Dec 01, 2012 3:43 pm
Hardware configuration: Folding@Home Client 7.6.13 (1 GPU slots)
Windows 7 64bit
Intel Core i5 2500k@4Ghz
Nvidia gtx 1080ti driver 441

Re: Windows vs Linux

Post by foldy »

Would it be possible to run Windows on a PC, start a Linux VM inside it, and fold in the VM on the GPU? And would that be faster than folding on the GPU in Windows?
Kuno
Posts: 31
Joined: Sat Sep 23, 2017 4:59 pm

Post by Kuno »

Only on Windows Server 2016. Without GRID-compatible cards, it's the only option that gives the VM full hardware access, other than ESXi or other super expensive VM software. Trust me, I've tried, and Windows just plain sucks when it comes to granting a VM access to the hardware.
Aurum
Posts: 296
Joined: Sat Oct 03, 2015 3:15 pm
Location: The Great Basin

Post by Aurum »

I decided to try Linux and built a new PC to install Linux Mint 18.3 by itself and run F@H. What a nightmare. After 3 days (mostly spent trying to get Mint to run from the HDD instead of the USB drive) I have FAHControl on the screen with a GPU and a WU downloaded, but it won't fold; it just says Ready. Any help getting this running would be greatly appreciated.
Now I've got FAHClient running in a terminal window, but it keeps giving me an error telling me to try setting opencl-index manually. But when I launch FAHControl it shows a different WU... uh oh...
I set the single GPU to 0,0 and after a moment FAHControl seemed to be on the same page as FAHClient. I have no idea what I'm doing or why, but it's finally running. Much faster too, with my 1080 Ti at an estimated PPD of 1017050.
In Science We Trust

Post by Kuno »

You could follow this guide; it literally takes 15 minutes: https://forums.evga.com/Guide-to-BuildI ... 82398.aspx

Post by Aurum »

Kuno wrote: https://forums.evga.com/Guide-to-BuildI ... 82398.aspx you could follow this guide. Literally takes 15 minutes.
There you Linux smart alecks go again. One guy said, "I've installed Linux over a hundred times without a hitch."
Thanks, I'll try it tomorrow when I convert another rig.
Nert
Posts: 162
Joined: Wed Mar 26, 2014 7:46 pm

Post by Nert »

Aurum - your experience sounds similar to mine when I installed my Mint system more than a year ago. It took a lot of time and effort and I feel your pain. Unfortunately, I'm not able to really offer any help. I basically kept hacking away until I got it to work. All I can say is that once you get it up and running you'll be pleased with the results. When I did my installation, I found a lot of outdated and incomplete information here and in other locations. Let us know if the installation procedure provided in the previous post works for you.
v00d00
Posts: 396
Joined: Sun Dec 02, 2007 4:53 am
Hardware configuration: FX8320e (6 cores enabled) @ stock,
- 16GB DDR3,
- Zotac GTX 1050Ti @ Stock.
- Gigabyte GTX 970 @ Stock
Debian 9.

Running GPU since it came out, CPU since client version 3.
Folding since Folding began (~2000) and ran Genome@Home for a while too.
Ran Seti@Home prior to that.
Location: UK

Post by v00d00 »

I never could understand people's fascination with Mint, but each to their own.

Linux + FAH is as simple as: install a distro, blacklist nouveau, install the Nvidia drivers, install OpenCL (if you need to), download and untar FAH to a directory, generate a config, and run. The config can be edited however you like. For extras you could add FAH as a service or start it from crontab via a script. But at the absolute basic level, you can run it manually.

I currently like Slackware (but it isn't supported by FAH, so don't use it; power users only). My suggestion would be Debian or CentOS. Do a minimal install. You don't need X to run FAH, and unless you are taking a particular interest in learning how to use Linux, don't bother. Your interest is in how to fold on Linux. Do what is required to achieve that goal.

Installing Linux and FAH is as simple as copying and pasting commands from a guide. People tend to talk of it as rocket science, but in reality it isn't any harder than installing a modern copy of Windows. Guides exist for installing distros; just use Google to find them. Then find a guide for installing the Nvidia drivers and OpenCL. The final bit is installing FAH. One thing I have noticed is that people always have issues with FAHControl, generally due to dependencies. So don't use it. FAH doesn't require FAHControl to work. FAHControl is a nice little GUI for people who like to configure things using mice and other HID. But you want to fold. If you set up FAHClient by hand, you learn about building a config.xml, maybe making a startup script, and you don't have to worry about the dependencies needed for FAHControl. Besides, FAHControl requires X, and if you are following the minimal approach you won't have X installed.
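
For anyone building a config.xml by hand, here is a minimal Python sketch of generating one. It assumes the FAHClient v7 layout in which each option is an element with a `v` attribute and per-slot options such as opencl-index are nested inside the slot element. The function name build_config and the user/team values are only illustrative, so compare the output against a config.xml written by your own client before relying on it.

```python
import xml.etree.ElementTree as ET


def build_config(user, team, gpu_slots):
    """Return config.xml text with one GPU slot per entry in gpu_slots.

    Each entry is an OpenCL device index, or None to leave the index unset.
    Element and attribute names are assumptions based on FAHClient v7.
    """
    config = ET.Element("config")
    ET.SubElement(config, "user", v=user)
    ET.SubElement(config, "team", v=str(team))
    for slot_id, opencl_index in enumerate(gpu_slots):
        slot = ET.SubElement(config, "slot", id=str(slot_id), type="GPU")
        if opencl_index is not None:
            # Pin the slot to a specific OpenCL device.
            ET.SubElement(slot, "opencl-index", v=str(opencl_index))
    return ET.tostring(config, encoding="unicode")


if __name__ == "__main__":
    # One GPU slot pinned to OpenCL device 0.
    print(build_config("Anonymous", 0, [0]))
```

Passing None for a slot leaves opencl-index out, so the client is free to choose the device itself.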

As ever if anyone wants a hand, shoot me a pm and I will do my best.
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Post by bruce »

v00d00 wrote: Do a minimal install. You don't need X to run FAH and unless you are taking a particular interest in learning how to use Linux, don't bother. Your interest is in how to fold on Linux. Do what is required to achieve that goal.
Running X doesn't teach you Linux; it teaches you to do things using a GUI without ever learning how to do the same things in a text terminal.
v00d00 wrote: Besides FAHControl requires X and if you are following the minimal approach you won't have X installed.
Has anyone written a guide (or a script?) for a minimal-install system on which X can be started manually but does not start automatically? I'd be happy with a system that opens only a terminal screen but on which I can start X manually when I want it. My objective would be to permit drivers to be updated without the need to re-blacklist nouveau to install the NVIDIA drivers. I suspect that this isn't easy, since the drivers get linked into the kernel. NVIDIA's Windows installer incorporates all those steps into a single command ... assuming you're running its GUI, and then ends by restarting its GUI -- which in Linux would be unnecessary if it could be done manually.

I depend on a single copy of FAHControl in some GUI (easiest in one copy of Windows) to control all of the clients within the reach of my LAN. This prevents the errors caused by my inevitable sloppy typing when I change something.

Post by Aurum »

1. I must've made some mistake on the first PC, but I got it running FAHClient and FAHControl with the latest Nvidia driver. I can see what that headless rig is doing via FAHControl on this Win7 PC, at least for F@H.
2. Tons of obsolete info comes up on searches for anything Linux, and that's why I hate Linux with a burning passion.
3. Installed Linux Mint 18.3 Cinnamon (maybe I should've used the lighter Xfce) from a USB drive, ran updates a few times, and then clicked Menu/Driver Manager. A popup showed it had installed the nouveau driver. I clicked the radio button for Nvidia's recommended driver 384.90. It also recommended updating my Intel CPU 0000 microcode: check. Installed & rebooted.
4. Went to http://folding.stanford.edu/beta and downloaded fahclient_7.4.16_amd64.deb, fahcontrol_7.4.16-1_all.deb and fahviewer_7.4.16_amd64.deb (even though I never use the viewer).
5. I went to TennesseeTony's thread, post #6, https://forums.anandtech.com/threads/gu ... s.2528920/ and did these steps:
wget http://launchpadlibrarian.net/109052632 ... 15_all.deb
(Note: The guide's command has a stray apostrophe after wget; leave it out.)
sudo dpkg -i python-support_1.0.15_all.deb
sudo add-apt-repository ppa:graphics-drivers/ppa (I just added this to Updates)
sudo apt-get update
sudo apt-get upgrade
(Note: Since the Nvidia driver was already installed, I did not do this: sudo apt-get install nvidia-)
6. sudo apt-get -y install python-gnome2 mesa-common-dev freeglut3-dev gedit
(Note: I deleted nvidia-settings from this command since it's already installed and appears in the Menu as NVIDIA X Server Settings.)
7. cd /home/aurum/Downloads
8. sudo dpkg -i fahclient_7.4.16_amd64.deb
9. sudo dpkg -i fahcontrol_7.4.16-1_all.deb
10. sudo dpkg -i fahviewer_7.4.16_amd64.deb
11. Restarted from the Menu.
12. cd /var/lib/fahclient
13. ls (GPU.txt was already there, so I did nothing.)
14. Clicked Menu/FAHControl and configured as usual. Exited FAH and restarted it, and then I could see & run Rig-20 F@H from FAHControl on my main Win7 PC. Works great.
Last edited by Aurum on Fri Jan 12, 2018 8:38 pm, edited 8 times in total.

Post by Aurum »

Then the real problem begins. My folding rigs are headless and I use TightVNC to run them. But I cannot get TightVNC to run even though it looks like it installed. I'm looking for a guide to help me with that.
I also like to have Piriform Speccy, CPUID CPU-Z, GPU-Z and the Windows Task Manager Performance tab so I can see the load on all CPU threads and balance them right. I've yet to find anything comparable that runs under Linux. If anyone knows of good utilities for maintaining headless PCs, please post them. TIA
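
On watching the load on all CPU threads of a headless Linux box: the kernel exposes the raw per-thread counters in /proc/stat, and a minimal Python sketch of turning two snapshots into per-thread load follows. The helper names parse_proc_stat, load_percent and sample are made up for illustration, and counting idle plus iowait as "not busy" is one common convention rather than the only one. It is a sketch, not a replacement for a proper monitoring tool.

```python
import time


def parse_proc_stat(text):
    """Map 'cpuN' -> (idle_ticks, total_ticks) from /proc/stat text."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep per-thread lines ('cpu0', 'cpu1', ...); skip the aggregate 'cpu'.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        ticks = [int(x) for x in fields[1:]]
        idle = sum(ticks[3:5])   # idle + iowait
        total = sum(ticks[:8])   # user .. steal (guest time is already in user)
        counters[fields[0]] = (idle, total)
    return counters


def load_percent(before, after):
    """Return busy percentage per thread between two parsed snapshots."""
    loads = {}
    for name, (idle1, total1) in after.items():
        idle0, total0 = before.get(name, (0, 0))
        elapsed = total1 - total0
        busy = elapsed - (idle1 - idle0)
        loads[name] = 100.0 * busy / elapsed if elapsed > 0 else 0.0
    return loads


def sample(interval=1.0, path="/proc/stat"):
    """Read the counters twice, `interval` seconds apart (Linux only)."""
    with open(path) as f:
        before = parse_proc_stat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_proc_stat(f.read())
    return load_percent(before, after)
```

Calling sample() from an SSH session returns a dict with one entry per thread, which is enough to see whether the threads are evenly loaded.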
BOINC installed and ran fine using this guide: https://boinc.berkeley.edu/wiki/Install ... _on_Ubuntu. The only difference is that when connecting to a new project, one has to use the username registered for that project, as opposed to the registered email address used on the Windows version.
I tried installing Storj but haven't gotten it to work. Storj is a breeze under Win7.
This was a new build and I had an HDD and an M.2 SSD plugged into the new MSI X99A Raider MB. Oddly, the BIOS did not see them. I went ahead and tried installing Linux Mint 18.3; it installed on the M.2 SSD but doesn't see the HDD. Still much to learn about using Linux.
I can disconnect the monitor, mouse & keyboard and move it from my desk to the garage rack and it boots and runs fine.

Post by Aurum »

Nvidia released new drivers in response to Meltdown and Spectre.

NVIDIA 384.111 - If you are using the NVIDIA proprietary drivers, upgrade them to version 384.111. In Linux Mint 17.x and 18.x, this update is available in the Update Manager.
In LMDE, it is available on the NVIDIA website (as is 390.65 for Windows): https://www.geforce.com/drivers

https://forums.geforce.com/default/topi ... 2017-5754/

Post by foldy »

Nvidia 390.xx folding performance on Windows 7 with a GTX 1080 Ti is the same as with previous drivers.

Post by Aurum »

Can anyone tell me what's causing all 3 GPUs on this Linux rig to enter the Failed state and stop? After a reboot they start off fine.

Code:

06:35:26:WU02:FS01:0x21:Completed 6187500 out of 6250000 steps (99%)
06:35:42:WARNING:FS01:Size of positions 405 does not match topology 393
06:36:18:WU02:FS01:0x21:Completed 6250000 out of 6250000 steps (100%)
06:36:19:WU00:FS01:Connecting to 171.67.108.45:80
06:36:19:WU00:FS01:Assigned to work server 171.67.108.157
06:36:19:WU00:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 171.67.108.157
06:36:19:WU00:FS01:Connecting to 171.67.108.157:8080
06:36:20:WU02:FS01:0x21:Saving result file logfile_01.txt
06:36:20:WU02:FS01:0x21:Saving result file checkpointState.xml
06:36:20:WU02:FS01:0x21:Saving result file checkpt.crc
06:36:20:WU02:FS01:0x21:Saving result file log.txt
06:36:20:WU02:FS01:0x21:Saving result file positions.xtc
06:36:20:WU02:FS01:0x21:Folding@home Core Shutdown: FINISHED_UNIT
06:36:20:WU00:FS01:Downloading 5.14MiB
06:36:20:WU02:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
06:36:20:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:9431 run:269 clone:4 gen:164 core:0x21 unit:0x000000ceab436c9d586fdd3577fcecbe
06:36:20:WU02:FS01:Uploading 13.63MiB to 171.67.108.157
06:36:20:WU02:FS01:Connecting to 171.67.108.157:8080
06:36:21:WU00:FS01:Download complete
06:36:21:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9415 run:1326 clone:1 gen:662 core:0x21 unit:0x00000309ab436c9d585e06d42db59086
06:36:21:WU00:FS01:Starting
06:36:21:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1868 -checkpoint 30 -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
06:36:21:WU00:FS01:Started FahCore on PID 4495
06:36:21:WU00:FS01:Core PID:4499
06:36:21:WU00:FS01:FahCore 0x21 started
06:36:21:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:36:21:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9415 run:1326 clone:1 gen:662 core:0x21 unit:0x00000309ab436c9d585e06d42db59086
06:36:21:WU00:FS01:Uploading 6.00KiB to 171.67.108.157
06:36:21:WU00:FS01:Connecting to 171.67.108.157:8080
06:36:21:WU00:FS01:Upload complete
06:36:21:WU00:FS01:Server responded WORK_ACK (400)
06:36:21:WU00:FS01:Cleaning up
06:36:22:WU03:FS01:Connecting to 171.67.108.45:80
06:36:22:WU03:FS01:Assigned to work server 171.67.108.157
06:36:22:WU03:FS01:Requesting new work unit for slot 01: READY gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 171.67.108.157
06:36:22:WU03:FS01:Connecting to 171.67.108.157:8080
06:36:22:WU03:FS01:Downloading 5.15MiB
06:36:25:WU03:FS01:Download complete
06:36:25:WU03:FS01:Received Unit: id:03 state:DOWNLOAD error:NO_ERROR project:9414 run:1985 clone:2 gen:285 core:0x21 unit:0x00000160ab436c9d585e069f4140f57c
06:36:25:WU03:FS01:Starting
06:36:25:WU03:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 03 -suffix 01 -version 704 -lifeline 1868 -checkpoint 30 -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
06:36:25:WU03:FS01:Started FahCore on PID 4507
06:36:25:WU03:FS01:Core PID:4511
06:36:25:WU03:FS01:FahCore 0x21 started
06:36:26:WU02:FS01:Upload 24.29%
06:36:26:WARNING:WU03:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:36:26:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:9414 run:1985 clone:2 gen:285 core:0x21 unit:0x00000160ab436c9d585e069f4140f57c
06:36:26:WU03:FS01:Uploading 6.00KiB to 171.67.108.157
06:36:26:WU03:FS01:Connecting to 171.67.108.157:8080
06:36:26:WU03:FS01:Upload complete
06:36:26:WU03:FS01:Server responded WORK_ACK (400)
06:36:26:WU03:FS01:Cleaning up
06:36:26:WU00:FS01:Connecting to 171.67.108.45:80
06:36:27:WU00:FS01:Assigned to work server 171.67.108.157
06:36:27:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 171.67.108.157
06:36:27:WU00:FS01:Connecting to 171.67.108.157:8080
06:36:27:WU00:FS01:Downloading 5.15MiB
06:36:32:WU00:FS01:Download complete
06:36:32:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9415 run:518 clone:0 gen:748 core:0x21 unit:0x0000036cab436c9d585e06ccd0d032d7
06:36:32:WU00:FS01:Starting
06:36:32:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1868 -checkpoint 30 -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
06:36:32:WU00:FS01:Started FahCore on PID 4514
06:36:32:WU00:FS01:Core PID:4518
06:36:32:WU00:FS01:FahCore 0x21 started
06:36:32:WU02:FS01:Upload 55.92%
06:36:32:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:36:32:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9415 run:518 clone:0 gen:748 core:0x21 unit:0x0000036cab436c9d585e06ccd0d032d7
06:36:32:WU00:FS01:Uploading 6.00KiB to 171.67.108.157
06:36:32:WU00:FS01:Connecting to 171.67.108.157:8080
06:36:33:WU00:FS01:Upload complete
06:36:33:WU00:FS01:Server responded WORK_ACK (400)
06:36:33:WU03:FS01:Connecting to 171.67.108.45:80
06:36:33:WU00:FS01:Cleaning up
06:36:33:WU03:FS01:Assigned to work server 171.67.108.157
06:36:33:WU03:FS01:Requesting new work unit for slot 01: READY gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 171.67.108.157
06:36:33:WU03:FS01:Connecting to 171.67.108.157:8080
06:36:34:WU03:FS01:Downloading 8.86MiB
06:36:38:WU02:FS01:Upload 86.18%
06:36:40:WU03:FS01:Download 57.84%
06:36:41:WU03:FS01:Download complete
06:36:41:WU03:FS01:Received Unit: id:03 state:DOWNLOAD error:NO_ERROR project:9431 run:1591 clone:2 gen:61 core:0x21 unit:0x0000004fab436c9d586fdd406a3a8e71
06:36:41:WU03:FS01:Starting
06:36:41:WU03:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 03 -suffix 01 -version 704 -lifeline 1868 -checkpoint 30 -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
06:36:41:WU03:FS01:Started FahCore on PID 4521
06:36:41:WU03:FS01:Core PID:4525
06:36:41:WU03:FS01:FahCore 0x21 started
06:36:42:WARNING:WU03:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:36:42:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:9431 run:1591 clone:2 gen:61 core:0x21 unit:0x0000004fab436c9d586fdd406a3a8e71
06:36:42:WU03:FS01:Uploading 6.00KiB to 171.67.108.157
06:36:42:WU03:FS01:Connecting to 171.67.108.157:8080
06:36:42:WU03:FS01:Upload complete
06:36:42:WU03:FS01:Server responded WORK_ACK (400)
06:36:42:WU03:FS01:Cleaning up
06:36:42:WU00:FS01:Connecting to 171.67.108.45:80
06:36:43:WU02:FS01:Upload complete
06:36:43:WU00:FS01:Assigned to work server 171.67.108.157
06:36:43:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 171.67.108.157
06:36:43:WU00:FS01:Connecting to 171.67.108.157:8080
06:36:43:WU02:FS01:Server responded WORK_ACK (400)
06:36:43:WU02:FS01:Final credit estimate, 58047.00 points
06:36:43:WU02:FS01:Cleaning up
06:36:44:WU00:FS01:Downloading 5.15MiB
06:36:45:WU00:FS01:Download complete
06:36:45:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9414 run:2006 clone:1 gen:158 core:0x21 unit:0x000000c5ab436c9d585e069f6b89bf58
06:36:45:WU00:FS01:Starting
06:36:45:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1868 -checkpoint 30 -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
06:36:45:WU00:FS01:Started FahCore on PID 4528
06:36:45:WU00:FS01:Core PID:4532
06:36:45:WU00:FS01:FahCore 0x21 started
06:36:45:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:36:45:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9414 run:2006 clone:1 gen:158 core:0x21 unit:0x000000c5ab436c9d585e069f6b89bf58
06:36:45:WU00:FS01:Uploading 6.00KiB to 171.67.108.157
06:36:45:WU00:FS01:Connecting to 171.67.108.157:8080
06:36:46:WU02:FS01:Connecting to 171.67.108.45:80
06:36:46:WU00:FS01:Upload complete
06:36:46:WU00:FS01:Server responded WORK_ACK (400)
06:36:46:WU00:FS01:Cleaning up
06:36:46:WU02:FS01:Assigned to work server 171.67.108.157
06:36:46:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 171.67.108.157
06:36:46:WU02:FS01:Connecting to 171.67.108.157:8080
06:36:46:WU02:FS01:Downloading 5.14MiB
06:36:47:WU02:FS01:Download complete
06:36:47:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:9415 run:2615 clone:1 gen:21 core:0x21 unit:0x0000001eab436c9d585e06e172f4e9a7
06:36:47:WU02:FS01:Starting
06:36:47:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 02 -suffix 01 -version 704 -lifeline 1868 -checkpoint 30 -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
06:36:47:WU02:FS01:Started FahCore on PID 4540
06:36:47:WU02:FS01:Core PID:4544
06:36:47:WU02:FS01:FahCore 0x21 started
06:36:48:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:36:48:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:9415 run:2615 clone:1 gen:21 core:0x21 unit:0x0000001eab436c9d585e06e172f4e9a7
06:36:48:WU02:FS01:Uploading 6.00KiB to 171.67.108.157
06:36:48:WU02:FS01:Connecting to 171.67.108.157:8080
06:36:48:WU02:FS01:Upload complete
06:36:48:WU02:FS01:Server responded WORK_ACK (400)
06:36:48:WU02:FS01:Cleaning up
06:36:48:WU00:FS01:Connecting to 171.67.108.45:80
06:36:48:WU00:FS01:Assigned to work server 171.67.108.157
06:36:48:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 171.67.108.157
06:36:48:WU00:FS01:Connecting to 171.67.108.157:8080
06:36:49:WU00:FS01:Downloading 5.13MiB
06:36:50:WU00:FS01:Download complete
06:36:50:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9414 run:2076 clone:1 gen:261 core:0x21 unit:0x00000143ab436c9d585e06a025b360d5
06:36:50:WU00:FS01:Starting
06:36:50:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1868 -checkpoint 30 -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
06:36:50:WU00:FS01:Started FahCore on PID 4547
06:36:50:WU00:FS01:Core PID:4551
06:36:50:WU00:FS01:FahCore 0x21 started
06:36:51:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:36:51:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9414 run:2076 clone:1 gen:261 core:0x21 unit:0x00000143ab436c9d585e06a025b360d5
06:36:51:WU00:FS01:Uploading 6.00KiB to 171.67.108.157
06:36:51:WU00:FS01:Connecting to 171.67.108.157:8080
06:36:51:WU00:FS01:Upload complete
06:36:51:WU00:FS01:Server responded WORK_ACK (400)
06:36:51:WU00:FS01:Cleaning up
06:36:51:WU02:FS01:Connecting to 171.67.108.45:80
06:36:51:WU02:FS01:Assigned to work server 171.67.108.157
06:36:51:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 171.67.108.157
06:36:51:WU02:FS01:Connecting to 171.67.108.157:8080
06:36:52:WU02:FS01:Downloading 5.13MiB
06:36:53:WU02:FS01:Download complete
06:36:53:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:9415 run:2033 clone:2 gen:233 core:0x21 unit:0x0000011aab436c9d585e06dba409c2a3
06:36:53:WU02:FS01:Starting
06:36:53:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 02 -suffix 01 -version 704 -lifeline 1868 -checkpoint 30 -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
06:36:53:WU02:FS01:Started FahCore on PID 4559
06:36:53:WU02:FS01:Core PID:4563
06:36:53:WU02:FS01:FahCore 0x21 started
06:36:53:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:36:53:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:9415 run:2033 clone:2 gen:233 core:0x21 unit:0x0000011aab436c9d585e06dba409c2a3
06:36:53:WU02:FS01:Uploading 6.00KiB to 171.67.108.157
06:36:53:WU02:FS01:Connecting to 171.67.108.157:8080
06:36:53:WU02:FS01:Upload complete
06:36:53:WU02:FS01:Server responded WORK_ACK (400)
06:36:53:WU02:FS01:Cleaning up
06:36:54:WU00:FS01:Connecting to 171.67.108.45:80
06:36:54:WU00:FS01:Assigned to work server 171.67.108.157
06:36:54:WU00:FS01:Requesting new work unit for slot 01: READY gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 171.67.108.157
06:36:54:WU00:FS01:Connecting to 171.67.108.157:8080
06:36:54:WU00:FS01:Downloading 5.13MiB
06:36:55:WU00:FS01:Download complete
06:36:55:WU00:FS01:Received Unit: id:00 state:DOWNLOAD error:NO_ERROR project:9414 run:1008 clone:3 gen:186 core:0x21 unit:0x000000dfab436c9d585e0697dc046f81
06:36:55:WU00:FS01:Starting
06:36:55:WU00:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 00 -suffix 01 -version 704 -lifeline 1868 -checkpoint 30 -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
06:36:55:WU00:FS01:Started FahCore on PID 4566
06:36:55:WU00:FS01:Core PID:4570
06:36:55:WU00:FS01:FahCore 0x21 started
06:36:56:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:36:56:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:9414 run:1008 clone:3 gen:186 core:0x21 unit:0x000000dfab436c9d585e0697dc046f81
06:36:56:WU00:FS01:Uploading 6.00KiB to 171.67.108.157
06:36:56:WU00:FS01:Connecting to 171.67.108.157:8080
06:36:56:WU00:FS01:Upload complete
06:36:56:WU00:FS01:Server responded WORK_ACK (400)
06:36:56:WU00:FS01:Cleaning up
06:36:56:WU02:FS01:Connecting to 171.67.108.45:80
06:36:56:WU02:FS01:Assigned to work server 171.67.108.157
06:36:56:WU02:FS01:Requesting new work unit for slot 01: READY gpu:0:GP102 [GeForce GTX 1080 Ti] 11380 from 171.67.108.157
06:36:56:WU02:FS01:Connecting to 171.67.108.157:8080
06:36:57:WU02:FS01:Downloading 5.16MiB
06:36:58:WU02:FS01:Download complete
06:36:58:WU02:FS01:Received Unit: id:02 state:DOWNLOAD error:NO_ERROR project:9414 run:1265 clone:0 gen:506 core:0x21 unit:0x0000026aab436c9d585e06995a51df7f
06:36:58:WU02:FS01:Starting
06:36:58:WU02:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/lib/fahclient/cores/fahwebx.stanford.edu/cores/Linux/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21 -dir 02 -suffix 01 -version 704 -lifeline 1868 -checkpoint 30 -gpu-vendor nvidia -opencl-device 0 -cuda-device 0 -gpu 0
06:36:58:WU02:FS01:Started FahCore on PID 4573
06:36:58:WU02:FS01:Core PID:4577
06:36:58:WU02:FS01:FahCore 0x21 started
06:36:59:WARNING:WU02:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)
06:36:59:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:9414 run:1265 clone:0 gen:506 core:0x21 unit:0x0000026aab436c9d585e06995a51df7f
06:36:59:WU02:FS01:Uploading 6.00KiB to 171.67.108.157
06:36:59:WU02:FS01:Connecting to 171.67.108.157:8080
06:36:59:WU02:FS01:Upload complete
06:36:59:WU02:FS01:Server responded WORK_ACK (400)
06:36:59:WU02:FS01:Cleaning up
******************************* Date: 2018-01-10 *******************************
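
A log like the one above is easier to read when the core results are tallied per slot. The following is a minimal Python sketch under the assumption that every result line has the `HH:MM:SS:[WARNING:]WUnn:FSnn:FahCore returned: NAME` shape seen here. count_results and max_failure_streak are hypothetical helper names, not part of FAH.

```python
import re
from collections import Counter

# Matches e.g. "06:36:21:WARNING:WU00:FS01:FahCore returned: BAD_WORK_UNIT (114 = 0x72)"
RESULT_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):(?:WARNING:|ERROR:)?"
    r"WU(?P<wu>\d+):FS(?P<slot>\d+):FahCore returned: (?P<result>\w+)"
)


def count_results(log_lines):
    """Count FahCore return codes, keyed by (slot, result name)."""
    counts = Counter()
    for line in log_lines:
        m = RESULT_RE.match(line.strip())
        if m:
            counts[(m.group("slot"), m.group("result"))] += 1
    return counts


def max_failure_streak(log_lines, slot):
    """Longest run of results other than FINISHED_UNIT on one slot."""
    streak = longest = 0
    for line in log_lines:
        m = RESULT_RE.match(line.strip())
        if not m or m.group("slot") != slot:
            continue
        if m.group("result") == "FINISHED_UNIT":
            streak = 0
        else:
            streak += 1
            longest = max(longest, streak)
    return longest
```

Both helpers take a list of lines, for example the result of reading the client's log file with readlines(). A long failure streak right after a FINISHED_UNIT, as in this log, points at the machine's state rather than at any single WU.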

Post by Nert »

The problem description and resolution sound similar to this one posted by SteveWillis:

viewtopic.php?f=16&t=30552

Superficially the two instances seem to be the same. He has several monitoring scripts that will detect this problem [and others] and reboot.

Post by bruce »

Unfortunately, rebooting is not a cure for an unstable system.

FAH is designed to protect itself from repeated errors, which are generally a result of unstable hardware/software. FAH cannot diagnose the cause of the instability, only the frequency of failures. Common causes of instability are overclocking, overheating, and other sources of hardware failure. Repeated failures cause expensive delays to the scientific research.

You will note that you've had a series of WUs that failed, were then reassigned to other machines which DID complete them successfully, so there's nothing wrong with the WUs.

You need to clean up your hardware/software so that it can complete the assignments given to it.

Hi Aurum_ALL_1KJd683KCGdt9eV9DNUXRDxs4ZgQCEdURJ (team 224497),
Your WU (P9415 R1326 C1 G662) was added to the stats database on 2018-01-09 23:15:23 for 0 points of credit.
Hi xxxx (team xxx),
Your WU (P9415 R1326 C1 G662) was added to the stats database on 2018-01-10 00:15:19 for 55278 points of credit.

Hi Aurum_ALL_1KJd683KCGdt9eV9DNUXRDxs4ZgQCEdURJ (team 224497),
Your WU (P9414 R1985 C2 G285) was added to the stats database on 2018-01-09 23:15:23 for 0 points of credit.
Hi xxx (team xxxx),
Your WU (P9414 R1985 C2 G285) was added to the stats database on 2018-01-10 00:15:19 for 49046.7 points of credit.

Hi Aurum_ALL_1KJd683KCGdt9eV9DNUXRDxs4ZgQCEdURJ (team 224497),
Your WU (P9415 R518 C0 G748) was added to the stats database on 2018-01-09 23:15:23 for 0 points of credit.
Hi xxxxx (team xxxxxxx),
Your WU (P9415 R518 C0 G748) was added to the stats database on 2018-01-10 00:15:19 for 52332 points of credit.

Hi Aurum_ALL_1KJd683KCGdt9eV9DNUXRDxs4ZgQCEdURJ (team 224497),
Your WU (P9431 R1591 C2 G61) was added to the stats database on 2018-01-09 23:15:23 for 0 points of credit.
Hi xxx (team 0),
Your WU (P9431 R1591 C2 G61) was added to the stats database on 2018-01-10 02:16:24 for 41602.4 points of credit.
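
The comparison above can be automated. Here is a minimal Python sketch that groups stats messages by project/run/clone/gen and lists the WUs that first earned 0 points and were later credited. It assumes every message has exactly the "Your WU (P... R... C... G...) was added to the stats database on ... for ... points" wording shown here; credits_by_wu and failed_then_completed are hypothetical names.

```python
import re
from collections import defaultdict

STATS_RE = re.compile(
    r"Your WU \(P(\d+) R(\d+) C(\d+) G(\d+)\) was added to the stats "
    r"database on (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) for ([\d.]+) points"
)


def credits_by_wu(text):
    """Map (project, run, clone, gen) -> list of (timestamp, points)."""
    credits = defaultdict(list)
    for m in STATS_RE.finditer(text):
        wu = tuple(int(g) for g in m.groups()[:4])
        credits[wu].append((m.group(5), float(m.group(6))))
    return dict(credits)


def failed_then_completed(credits):
    """Return WUs with a 0-point return followed by a credited return."""
    reissued = []
    for wu, entries in credits.items():
        entries = sorted(entries)  # these timestamps sort chronologically as text
        zero_times = [t for t, pts in entries if pts == 0]
        if zero_times and any(pts > 0 and t > zero_times[0] for t, pts in entries):
            reissued.append(wu)
    return reissued
```

A WU that shows up in the result was returned as faulty by one machine and then finished by another, which is the evidence that the WU itself was fine.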