Project 13421

Moderators: Site Moderators, FAHC Science Team

Postby Nuitari » Thu Jul 30, 2020 7:48 pm

I think this is more of an issue with the client than with the project itself. Sometimes one of my GPUs receives a long string of faulty units from Project 13421 and the client marks the GPU as failed. That means I once again have to babysit 9 different computers that do folding...

The error is always "ERROR:Discrepancy: Forces are blowing up! 23 2", and it always occurs before any steps appear to complete.

This is the latest one I found, but I can dig up past ones.
Code:
18:45:28:WU02:FS01:0x22:Project: 13421 (Run 6192, Clone 19, Gen 0)
18:45:28:WU02:FS01:0x22:Unit: 0x0000000012bc7d9a5f224a095867c495
18:45:28:WU02:FS01:0x22:Reading tar file core.xml
18:45:28:WU02:FS01:0x22:Reading tar file integrator.xml
18:45:28:WU02:FS01:0x22:Reading tar file state.xml.bz2
18:45:28:WU02:FS01:0x22:Reading tar file system.xml.bz2
18:45:28:WU02:FS01:0x22:Digital signatures verified
18:45:28:WU02:FS01:0x22:Folding@home GPU Core22 Folding@home Core
18:45:28:WU02:FS01:0x22:Version 0.0.11
18:45:28:WU02:FS01:0x22:  Checkpoint write interval: 50000 steps (5%) [20 total]
18:45:28:WU02:FS01:0x22:  JSON viewer frame write interval: 10000 steps (1%) [100 total]
18:45:28:WU02:FS01:0x22:  XTC frame write interval: 250000 steps (25%) [4 total]
18:45:28:WU02:FS01:0x22:  Global context and integrator variables write interval: 25000 steps (2.5%) [40 total]
18:45:31:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 23 2
18:45:31:WU02:FS01:0x22:Saving result file ../logfile_01.txt
18:45:31:WU02:FS01:0x22:Saving result file science.log
18:45:31:WU02:FS01:0x22:Saving result file state.xml.bz2
18:45:31:WU02:FS01:0x22:Folding@home Core Shutdown: BAD_WORK_UNIT


I think the F@H client should have a way to know when a project is more at risk of faulty units and manage it appropriately.
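
One way to picture that request: instead of counting every faulty unit against the GPU slot, the client could keep a per-project tally and only suspect the hardware when failures span several projects. The sketch below is purely hypothetical. `ProjectFaultTracker`, its thresholds, and the verdict strings are invented for illustration and do not describe how FAHClient actually works.

```python
from collections import defaultdict


class ProjectFaultTracker:
    """Hypothetical per-slot bookkeeping; not actual FAHClient behavior."""

    def __init__(self, min_units=5, faulty_ratio=0.5):
        self.min_units = min_units        # units needed before a project is judged
        self.faulty_ratio = faulty_ratio  # faulty share that marks a project as risky
        self.counts = defaultdict(lambda: {"ok": 0, "faulty": 0})

    def record(self, project, faulty):
        """Record one returned work unit for the given project number."""
        self.counts[project]["faulty" if faulty else "ok"] += 1

    def judged_projects(self):
        """Projects with enough returned units to say anything about them."""
        return {p for p, c in self.counts.items()
                if c["ok"] + c["faulty"] >= self.min_units}

    def risky_projects(self):
        """Judged projects whose faulty share reaches the threshold."""
        risky = set()
        for project in self.judged_projects():
            c = self.counts[project]
            if c["faulty"] / (c["ok"] + c["faulty"]) >= self.faulty_ratio:
                risky.add(project)
        return risky

    def verdict(self):
        judged = self.judged_projects()
        risky = self.risky_projects()
        if not risky:
            return "healthy"
        if judged - risky:
            # At least one other project runs fine on this slot.
            return "suspect_project"
        if len(risky) >= 2:
            # Everything that was judged is failing, across several projects.
            return "suspect_hardware"
        return "inconclusive"


tracker = ProjectFaultTracker()
for faulty in (True, True, True, False, True, True, False, True):
    tracker.record(13421, faulty)
for _ in range(5):
    tracker.record(16600, False)
print(tracker.risky_projects(), tracker.verdict())
```

With a rule like this, a burst of bad units from one project would pause that project on the slot rather than disable the GPU.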
Nuitari
 
Posts: 76
Joined: Sun Jun 09, 2019 5:03 am

Re: Project 13421

Postby bruce » Thu Jul 30, 2020 8:00 pm

The chances of a NaN error are higher for the MoonShot projects (p13400 series) than for other projects. I don't think there's any way to prevent them other than to work on other projects and I don't think there's an easy way to exclude specific projects. Because of the "sprint" concept, JohnChodra is doing whatever he can to maximize the distribution of those projects.

@johnChodra: How about excluding distribution to Advanced so donors like Nuitari can select Advanced and (hopefully) get a preponderance of other projects? (If there are no Advanced projects, the assignment will roll over to Full FAH so he'll likely get one there.) :( Even removing the COVID preference would not guarantee a non-MoonShot assignment.
bruce
 
Posts: 19676
Joined: Thu Nov 29, 2007 11:13 pm
Location: So. Cal.

Re: Project 13421

Postby Nuitari » Fri Jul 31, 2020 4:15 am

I don't have any problem doing the p13400 projects; in fact, I would prefer to focus on those. However, the client itself needs to handle errors much more gracefully than it does now.

Re: Project 13421

Postby toTOW » Fri Jul 31, 2020 1:08 pm

What GPU is it?
toTOW
Site Moderator
 
Posts: 5619
Joined: Sun Dec 02, 2007 11:38 am
Location: Bordeaux, France

Re: Project 13421

Postby Nuitari » Sat Aug 01, 2020 7:31 pm

One was a GeForce GTX 660, the other an APU (A8-9600).

And this morning, two different RX570s and an RX560.
I checked the 226 WUs from project 13421 that the last rig received and found the following:

54 succeeded for me, with no failures from anyone else
86 were faulty where I was the only one to report (these will probably be reassigned)
19 where I was faulty but someone else was OK
35 where I was OK but someone else reported faulty
32 where everyone reported faulty

Except for maybe 3 cases, all failures occurred before the first "Completed 0 out of 1000000 steps" report.
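
To put those five counts in proportion, here is a small worked calculation. The dictionary keys are my own labels for the five categories listed above; only the numbers come from the post.

```python
# Counts copied from the list above; the key names are illustrative labels.
tally = {
    "ok_me_no_other_failures": 54,
    "faulty_me_only_report": 86,
    "faulty_me_ok_elsewhere": 19,
    "ok_me_faulty_elsewhere": 35,
    "faulty_everyone": 32,
}

total = sum(tally.values())

# Units that this rig returned as faulty, regardless of what others saw.
my_faulty = (tally["faulty_me_only_report"]
             + tally["faulty_me_ok_elsewhere"]
             + tally["faulty_everyone"])
my_ok = total - my_faulty

# Among units this rig failed AND someone else also reported on,
# how many failed for everyone (pointing at the unit, not the rig)?
cross_checked = tally["faulty_me_ok_elsewhere"] + tally["faulty_everyone"]
failed_for_all_share = tally["faulty_everyone"] / cross_checked

print(f"total units:             {total}")
print(f"faulty on this rig:      {my_faulty} ({my_faulty / total:.1%})")
print(f"ok on this rig:          {my_ok} ({my_ok / total:.1%})")
print(f"cross-checked failures:  {cross_checked}")
print(f"  of which failed for all: {failed_for_all_share:.1%}")
```

The categories add up to the 226 units mentioned. Roughly three in five units failed on this rig, and of the failures that another donor also reported on, most failed for everyone, which suggests that both the work units and the rig play a part.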

Re: Project 13421

Postby Nuitari » Tue Aug 11, 2020 3:09 pm

12 GPUs out of 16 were marked faulty this morning because of project 13421.
All of the WUs fail before even reaching step 0.

Re: Project 13421

Postby PantherX » Tue Aug 11, 2020 10:56 pm

Generally speaking, Nvidia GPUs do work quite well with FahCore_22 and Project 13421.

Your GPU is fully supported as it has OpenCL 1.2 and Double Precision support: https://www.techpowerup.com/gpu-specs/g ... x-660.c895

Can you please tell us:
What driver version are you running?
Where did you download the driver from (automatically via Windows Update or manually from Nvidia's site)?
What OS are you running?
Is there anything unique/special about the systems where the failures occur?
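
For a farm of several machines, the answers to these questions can be collected with a short script. This is a minimal sketch that assumes a Linux host. It only calls tools that may or may not be installed (`lsb_release`, `nvidia-smi`, `clinfo`) and reports `None` for any that are missing; the helper names `run_if_available` and `collect_report` are made up for this example.

```python
import platform
import shutil
import subprocess


def run_if_available(cmd):
    """Return the command's stdout, or None if the tool is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        result = subprocess.run(cmd, capture_output=True, text=True,
                                timeout=30, check=False)
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


def collect_report():
    """Gather OS, driver, and OpenCL details useful for a support post."""
    return {
        "os": platform.platform(),
        # Distribution description, e.g. the Ubuntu release string.
        "distro": run_if_available(["lsb_release", "-ds"]),
        # GPU name and driver version as reported by the NVIDIA driver.
        "nvidia": run_if_available(
            ["nvidia-smi", "--query-gpu=name,driver_version",
             "--format=csv,noheader"]),
        # OpenCL platforms and devices visible to applications.
        "opencl": run_if_available(["clinfo", "-l"]),
    }


if __name__ == "__main__":
    for key, value in collect_report().items():
        print(f"{key}: {value}")
```

It does not answer where the driver was installed from; that still has to be stated by hand.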
PantherX
Site Moderator
 
Posts: 6342
Joined: Wed Dec 23, 2009 10:33 am
Location: Land Of The Long White Cloud

Re: Project 13421

Postby Nuitari » Wed Aug 12, 2020 2:30 am

The more powerful NVIDIA cards I have don't get project 13421; the only one that does is a GeForce GTX 660.

In fact, it is the only one of my 4 NVIDIA cards that gets that project.

Driver 384.130
Ubuntu 16.04
Drivers are the binaries from Ubuntu
Nothing special otherwise; the CPU is an older i5-2320, and it's also doing CPU folding.

On the AMD side:
2 systems have the following:
Ubuntu 16.04, a mix of RX570 and RX560. The Carrizo core (APU) is also affected by the failure issue.
AMDGPU 18.30-641594 from AMD's website

1 system is Ubuntu 18.04
AMDGPU 20.20-1089974 from AMD's website


Interesting log snippets from the GeForce GTX 660:
Code:
$ grep ERROR log.txt |grep -v NO_ERROR
05:56:48:WU01:FS01:0x22:ERROR:NaNs detected in forces. 8 0
05:56:55:WU02:FS01:0x22:ERROR:NaNs detected in forces. 8 0
05:57:01:WU01:FS01:0x22:ERROR:Force RMSE error of 415.735 with threshold of 5
05:57:05:WU02:FS01:0x22:ERROR:NaNs detected in forces. 8 0
08:33:40:ERROR:WU02:FS00:Exception: Server did not assign work unit
08:51:06:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
08:51:12:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
12:29:02:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 9 0
12:29:58:ERROR:WU00:FS00:Exception: Could not get an assignment
12:29:59:ERROR:WU00:FS00:Exception: Could not get an assignment
12:35:13:ERROR:WU00:FS00:Exception: Failed to connect to 3.21.157.11:80: Connection timed out
14:17:18:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 6 0
14:17:25:WU01:FS01:0x22:ERROR:Force RMSE error of 445.004 with threshold of 5
14:17:31:WU02:FS01:0x22:ERROR:Force RMSE error of 272.417 with threshold of 5
14:17:35:WU01:FS01:0x22:ERROR:Force RMSE error of 381.459 with threshold of 5
14:17:41:WU02:FS01:0x22:ERROR:Force RMSE error of 267.462 with threshold of 5
14:17:47:WU01:FS01:0x22:ERROR:Force RMSE error of 524.41 with threshold of 5
16:07:45:ERROR:WU01:FS00:Exception: Failed to connect to 3.21.157.11:80: Connection timed out
19:59:36:ERROR:WU00:FS00:Exception: Could not get an assignment
13:51:25:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 12 0
00:14:07:WU01:FS01:0x22:ERROR:Force RMSE error of 165.891 with threshold of 5
00:14:12:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 5 0
00:14:18:WU01:FS01:0x22:ERROR:Force RMSE error of 143.528 with threshold of 5
00:14:24:WU00:FS01:0x22:ERROR:Force RMSE error of 1416.69 with threshold of 5
13:29:23:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 12 0
15:05:35:WU01:FS01:0x22:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
01:14:20:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 12 0
01:14:24:WU03:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
01:14:27:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
01:14:31:WU03:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 10 2
01:14:36:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 25 0
01:14:40:WU03:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
01:14:44:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 25 0
01:23:40:WU03:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 0 0
09:12:20:ERROR:WU01:FS00:Exception: Failed to connect to 3.21.157.11:80: Connection timed out
20:41:57:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 0 0
20:42:04:WU03:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 15 2
20:42:07:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 21 0
06:19:54:ERROR:WU01:FS01:Exception: Transfer failed
06:21:45:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 21 0
09:26:20:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:27:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:31:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:35:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:39:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:44:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:48:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:53:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:26:57:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
09:27:02:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
12:44:31:ERROR:WU00:FS00:Exception: Failed to connect to 3.21.157.11:80: Connection timed out
13:57:49:WU01:FS01:0x22:ERROR:Force RMSE error of 38.2303 with threshold of 5
13:57:53:WU02:FS01:0x22:ERROR:Force RMSE error of 17.6938 with threshold of 5
13:57:57:WU01:FS01:0x22:ERROR:Force RMSE error of 15.6476 with threshold of 5
13:58:02:WU02:FS01:0x22:ERROR:Force RMSE error of 65.3317 with threshold of 5
13:58:07:WU01:FS01:0x22:ERROR:Force RMSE error of 69.6833 with threshold of 5
13:58:11:WU02:FS01:0x22:ERROR:Force RMSE error of 69.8878 with threshold of 5
13:58:19:WU01:FS01:0x22:ERROR:Force RMSE error of 21.2261 with threshold of 5
13:58:22:WU02:FS01:0x22:ERROR:Force RMSE error of 20.6909 with threshold of 5
14:06:58:ERROR:WU01:FS01:Exception: Transfer failed
14:07:01:WU01:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 15 0
14:07:04:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 15 0
14:07:37:WU01:FS01:0x22:ERROR:Force RMSE error of 596.999 with threshold of 5
14:07:41:WU02:FS01:0x22:ERROR:Force RMSE error of 639.224 with threshold of 5
00:29:32:WU02:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 6 1
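
The grep output above mixes errors raised by the core (lines containing 0x22) with client-side network and assignment errors. A rough way to separate and count them is sketched below. The regular expressions are inferred only from the lines in this post, so other log formats may not match, and the category labels are my own.

```python
import re
from collections import Counter

# Core errors look like  HH:MM:SS:WUnn:FSnn:0x22:ERROR:<message>
CORE_ERROR = re.compile(
    r"^\d\d:\d\d:\d\d:WU\d+:FS\d+:0x[0-9a-fA-F]+:ERROR:(?P<msg>.*)$")
# Client errors look like  HH:MM:SS:ERROR:WUnn:FSnn:<message>
CLIENT_ERROR = re.compile(
    r"^\d\d:\d\d:\d\d:ERROR:WU\d+:FS\d+:(?P<msg>.*)$")


def classify(line):
    """Map one log line to a coarse category, or None if it is not an error."""
    match = CORE_ERROR.match(line)
    if match:
        msg = match.group("msg")
        if msg.startswith("Discrepancy: Forces are blowing up!"):
            return "core: forces blowing up"
        if msg.startswith("NaNs detected in forces"):
            return "core: NaNs in forces"
        if msg.startswith("Force RMSE error"):
            return "core: force RMSE over threshold"
        return "core: other"
    if CLIENT_ERROR.match(line):
        return "client: network or assignment"
    return None


def count_errors(lines):
    """Count error lines per category."""
    return Counter(c for c in map(classify, lines) if c is not None)


sample = """\
05:56:48:WU01:FS01:0x22:ERROR:NaNs detected in forces. 8 0
05:57:01:WU01:FS01:0x22:ERROR:Force RMSE error of 415.735 with threshold of 5
08:33:40:ERROR:WU02:FS00:Exception: Server did not assign work unit
08:51:06:WU00:FS01:0x22:ERROR:Discrepancy: Forces are blowing up! 7 0
15:05:35:WU01:FS01:0x22:ERROR:98: Attempting to restart from last good checkpoint by restarting core.
""".splitlines()

for category, n in sorted(count_errors(sample).items()):
    print(f"{n:3d}  {category}")
```

To run it on a full log, pass the lines of log.txt to `count_errors` instead of the five sample lines.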


Code:
05:56:48:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:3533 clone:15 gen:0 core:0x22 unit:0x0000000212bc7d9a5f20206f5b5d3eda
05:56:55:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:3553 clone:15 gen:0 core:0x22 unit:0x0000000212bc7d9a5f20206741780e8c
05:57:02:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:3558 clone:15 gen:0 core:0x22 unit:0x0000000212bc7d9a5f2020680636f542
05:57:05:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:3582 clone:15 gen:0 core:0x22 unit:0x0000000212bc7d9a5f202071a2a1087f
08:51:02:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13421 run:3589 clone:15 gen:0 core:0x22 unit:0x0000000212bc7d9a5f202071b9531fe5
08:51:06:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:6428 clone:17 gen:0 core:0x22 unit:0x0000000112bc7d9a5f224a0aad570b6b
08:51:12:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:6588 clone:17 gen:0 core:0x22 unit:0x0000000112bc7d9a5f224a1169e6179a
10:40:35:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13421 run:6600 clone:17 gen:0 core:0x22 unit:0x0000000112bc7d9a5f224a13be52964a
12:29:00:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13421 run:4178 clone:19 gen:0 core:0x22 unit:0x0000000212bc7d9a5f20708f0b1912b9
12:29:02:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:2996 clone:21 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4f716980b2e6
14:17:14:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13421 run:3154 clone:21 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1fc0d2cba70df7
14:17:19:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:1842 clone:23 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4d0c48905fca
14:17:25:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:2023 clone:23 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4d4188a0f371
14:17:31:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:2039 clone:23 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4d6f62d34cdc
14:17:35:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:2071 clone:23 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4d759e4e3649
14:17:42:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:2104 clone:23 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4d859c7c6854
14:17:48:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:2133 clone:23 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1f4d905c76a65e
04:12:19:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13421 run:3338 clone:29 gen:0 core:0x22 unit:0x0000000212bc7d9a5f1fc0fa9dba1177
13:51:26:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:5651 clone:22 gen:0 core:0x22 unit:0x0000000212bc7d9a5f2249eb66dbcde5
10:57:44:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13421 run:8173 clone:60 gen:0 core:0x22 unit:0x0000000012bc7d9a5f284d8ef58578e8
12:46:24:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13421 run:3471 clone:64 gen:0 core:0x22 unit:0x0000000012bc7d9a5f1fc14ca6785ba0
00:14:03:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13421 run:1485 clone:69 gen:0 core:0x22 unit:0x0000000012bc7d9a5f1e0dbf9e5b4300
00:14:07:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7435 clone:69 gen:0 core:0x22 unit:0x0000000012bc7d9a5f224a38f66b7bc1
00:14:13:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:7502 clone:69 gen:0 core:0x22 unit:0x0000000012bc7d9a5f224a4078515492
00:14:19:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7514 clone:69 gen:0 core:0x22 unit:0x0000000012bc7d9a5f224a40b385f85f
00:14:25:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:7521 clone:69 gen:0 core:0x22 unit:0x0000000012bc7d9a5f224a401cc8b4ba
11:42:23:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13421 run:2077 clone:73 gen:0 core:0x22 unit:0x0000000012bc7d9a5f1f4d7f134a2e4b
13:29:19:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:13421 run:129 clone:74 gen:0 core:0x22 unit:0x0000000012bc7d9a5f1d0fff3d8580ef
13:29:24:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:6591 clone:74 gen:0 core:0x22 unit:0x0000000012bc7d9a5f224a156d3abbbd
10:35:49:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:13421 run:3660 clone:81 gen:0 core:0x22 unit:0x0000000012bc7d9a5f20206c40b32827
01:14:21:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7215 clone:37 gen:1 core:0x22 unit:0x0000000212bc7d9a5f224a2e8b55170c
01:14:24:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13421 run:7208 clone:45 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a30e9fd7089
01:14:28:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7208 clone:49 gen:1 core:0x22 unit:0x0000000212bc7d9a5f224a31dbe82c13
01:14:32:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13421 run:7208 clone:53 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a2f1734bee9
01:14:36:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7208 clone:60 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a2f22013846
01:14:41:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13421 run:7208 clone:65 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a2f654039ef
01:14:44:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:7208 clone:70 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a2e79f7e1ba
13:55:52:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:13421 run:6763 clone:54 gen:1 core:0x22 unit:0x0000000212bc7d9a5f224a1cfcfc3add
15:44:47:WU01:FS01:Sending unit results: id:01 state:SEND error:NO_ERROR project:13421 run:6603 clone:66 gen:1 core:0x22 unit:0x0000000112bc7d9a5f224a10fc9e5c4c
01:23:40:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13421 run:6070 clone:50 gen:1 core:0x22 unit:0x0000000112bc7d9a5f2249ffb817cf0c
20:41:58:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:5228 clone:21 gen:1 core:0x22 unit:0x0000000112bc7d9a5f2249e2e4f3496d
20:42:04:WU03:FS01:Sending unit results: id:03 state:SEND error:FAULTY project:13421 run:5224 clone:19 gen:1 core:0x22 unit:0x0000000112bc7d9a5f2249db0e4136af
20:42:07:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:5224 clone:25 gen:1 core:0x22 unit:0x0000000112bc7d9a5f2249db0ca65d7e
06:21:46:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4845 clone:2 gen:1 core:0x22 unit:0x0000000212bc7d9a5f2249ccacd50bb7
09:26:18:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:13421 run:4844 clone:42 gen:1 core:0x22 unit:0x0000000112bc7d9a5f2249cc5eb8be00
09:26:20:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:4718 clone:56 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20bd4738e6cfd6
09:26:27:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4708 clone:1 gen:1 core:0x22 unit:0x0000000312bc7d9a5f20bd465c1ef795
09:26:31:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:4708 clone:13 gen:1 core:0x22 unit:0x0000000312bc7d9a5f20bd4947b4ce16
09:26:35:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4708 clone:15 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20bd472fb0d1de
09:26:39:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:4708 clone:19 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20bd473e69abc7
09:26:44:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4708 clone:25 gen:1 core:0x22 unit:0x0000000412bc7d9a5f20bd4418671006
09:26:49:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:4708 clone:26 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20bd4a67513fe6
09:26:53:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4708 clone:31 gen:1 core:0x22 unit:0x0000000312bc7d9a5f20bd4b97a650e4
09:26:58:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:4708 clone:34 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20bd4785cb695b
09:27:02:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4708 clone:40 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20bd4ad606d382
13:57:50:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4506 clone:41 gen:1 core:0x22 unit:0x0000000312bc7d9a5f20709b1066bca2
13:57:53:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4506 clone:44 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20709da5e6d0ec
13:57:57:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4506 clone:47 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20709bf85a3e3b
13:58:03:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4506 clone:56 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709b617a0b23
13:58:07:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4506 clone:62 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709b69bc982f
13:58:11:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4506 clone:67 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709b820e8c20
13:58:19:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4505 clone:2 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709bb806f7d1
13:58:22:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4505 clone:14 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709b3f0af7ee
14:07:01:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4499 clone:52 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20709bbce5d549
14:07:04:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4499 clone:54 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709b62a529cc
14:07:38:WU01:FS01:Sending unit results: id:01 state:SEND error:FAULTY project:13421 run:4498 clone:24 gen:1 core:0x22 unit:0x0000000112bc7d9a5f20709bba692b0c
14:07:41:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4498 clone:30 gen:1 core:0x22 unit:0x0000000312bc7d9a5f20709b863d1a62
00:29:32:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4060 clone:46 gen:1 core:0x22 unit:0x0000000112bc7d9a5f207094e404bfd2


Project 16600 does run fine on it though.
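
Grouping the "Sending unit results" lines by run makes a pattern easier to see: in the log above, several runs (for example 4708, 4506 and 7208) fail across many different clones. The sketch below shows one way to do that grouping. The regular expression is inferred from the lines in this post, and the function name `tally_by_run` is made up for the example.

```python
import re
from collections import Counter, defaultdict

RESULT = re.compile(
    r"Sending unit results: .*?error:(?P<error>\w+) "
    r"project:(?P<project>\d+) run:(?P<run>\d+) "
    r"clone:(?P<clone>\d+) gen:(?P<gen>\d+)")


def tally_by_run(lines):
    """Return {(project, run): Counter of result codes} for result lines."""
    stats = defaultdict(Counter)
    for line in lines:
        match = RESULT.search(line)
        if match:
            key = (int(match.group("project")), int(match.group("run")))
            stats[key][match.group("error")] += 1
    return stats


# Four lines copied from the log above.
sample = """\
09:26:18:WU02:FS01:Sending unit results: id:02 state:SEND error:NO_ERROR project:13421 run:4844 clone:42 gen:1 core:0x22 unit:0x0000000112bc7d9a5f2249cc5eb8be00
09:26:27:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4708 clone:1 gen:1 core:0x22 unit:0x0000000312bc7d9a5f20bd465c1ef795
09:26:31:WU00:FS01:Sending unit results: id:00 state:SEND error:FAULTY project:13421 run:4708 clone:13 gen:1 core:0x22 unit:0x0000000312bc7d9a5f20bd4947b4ce16
09:26:35:WU02:FS01:Sending unit results: id:02 state:SEND error:FAULTY project:13421 run:4708 clone:15 gen:1 core:0x22 unit:0x0000000212bc7d9a5f20bd472fb0d1de
""".splitlines()

for (project, run), codes in sorted(tally_by_run(sample).items()):
    print(f"project {project} run {run}: {dict(codes)}")
```

A run that fails on every clone it hands out looks more like a problem with that run's input than with one particular GPU, though the log alone cannot prove it.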

Re: Project 13421

Postby PantherX » Wed Aug 12, 2020 4:05 am

AFAIK, Project 13420 is restricted to high-end GPUs while Project 13421 is for low-end GPUs. One reason for the restriction is that Project 13421 doesn't scale well on high-end GPUs but works very well on low-end GPUs.

Here's a bit of comparison on Windows with GTX 1080 Ti:
Project 13420: Avg. Time / Frame : 00:02:55 - 812,856.48 PPD
Project 13421: Avg. Time / Frame : 00:01:14 - 155,868.67 PPD
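
The two lines above can be turned into time per work unit and approximate points per work unit. This is a rough calculation that assumes one frame is 1% of a work unit (100 frames per WU) and that PPD simply scales with the number of WUs finished per day; the variable and function names are my own.

```python
SECONDS_PER_DAY = 86400
FRAMES_PER_WU = 100  # assumption: one frame equals 1% of a work unit

# (average time per frame, PPD) as quoted above for a GTX 1080 Ti on Windows
quoted = {
    13420: ("00:02:55", 812856.48),
    13421: ("00:01:14", 155868.67),
}


def frame_seconds(hms):
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def estimate(tpf, ppd):
    """Return (hours per WU, approximate points per WU)."""
    wu_seconds = frame_seconds(tpf) * FRAMES_PER_WU
    wus_per_day = SECONDS_PER_DAY / wu_seconds
    return wu_seconds / 3600, ppd / wus_per_day


for project, (tpf, ppd) in quoted.items():
    hours, points = estimate(tpf, ppd)
    print(f"p{project}: {hours:.2f} h per WU, about {points:,.0f} points per WU")

ppd_ratio = quoted[13420][1] / quoted[13421][1]
print(f"PPD ratio p13420 / p13421: {ppd_ratio:.2f}")
```

On these figures the high-end card earns roughly five times the PPD on 13420 that it does on 13421, which is consistent with keeping 13421 away from such cards.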

Regarding the issue, have you installed the OpenCL package (sudo apt-get install ocl-icd-opencl-dev)? Also, I believe the current drivers for Nvidia GPUs are in the 4XX series, so maybe you can try updating that too?

Re: Project 13421

Postby Nuitari » Wed Aug 12, 2020 4:25 am

That's the package for compiling OpenCL code, but I will install it.
I will check the NVIDIA drivers; I'm not sure what the latest one is for that very old card model.
The card works well otherwise, except for that project.

Re: Project 13421

Postby PantherX » Wed Aug 12, 2020 9:45 am

Regarding the OpenCL package, installing it is what donors on the forum have reported to work. While I am not an expert on Linux, there are other members here who might be able to provide some guidance.

The 134XX projects use some of the latest features, which other projects haven't used yet. That could be one reason why this project fails while others work.

A quick search indicates that the latest Nvidia Driver on Linux for GTX 660 is 450.57: https://www.nvidia.com/Download/driverR ... 2107/en-us

Re: Project 13421

Postby Joe_H » Wed Aug 12, 2020 2:05 pm

There are some different theories on why the dev kit being installed gets some things related to OpenCL to work on Linux systems. The one that seems to fit best is that some driver installers will get the OpenCL runtime code in place but leave some unresolved links. The dev kit install patches those up apparently. It might be something else, and the behavior is not exactly the same from one driver version to another.
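
One way to test the "unresolved links" theory on a given machine is to check which OpenCL library names the dynamic loader can actually resolve, before and after installing the dev package. The sketch below is a diagnostic under assumptions: on Debian and Ubuntu the runtime library is normally named libOpenCL.so.1 and the unversioned libOpenCL.so symlink usually comes from a development package, and vendor ICD files normally live in /etc/OpenCL/vendors. Which name FahCore_22 itself tries to load is not something I know, and the function names are made up for this example.

```python
import ctypes
import ctypes.util
import glob


def can_load(name):
    """Return True if the dynamic loader can open the named library."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False


def opencl_link_report():
    """Summarize which OpenCL loader names and ICD files are present."""
    return {
        # Name the system resolves for "OpenCL", or None if nothing is found.
        "find_library": ctypes.util.find_library("OpenCL"),
        # Versioned runtime library name.
        "libOpenCL.so.1": can_load("libOpenCL.so.1"),
        # Unversioned name, typically a symlink from a development package.
        "libOpenCL.so": can_load("libOpenCL.so"),
        # Vendor driver registrations for the ICD loader.
        "icd_files": sorted(glob.glob("/etc/OpenCL/vendors/*.icd")),
    }


if __name__ == "__main__":
    for key, value in opencl_link_report().items():
        print(f"{key}: {value}")
```

If the unversioned name only starts loading after the dev kit is installed, that would support the missing-link explanation for that driver version.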
Joe_H
Site Admin
 
Posts: 6449
Joined: Tue Apr 21, 2009 5:41 pm
Location: W. MA

