Project 11430 taking 13 hours for 24k credit.

If you think it might be a driver problem, see viewforum.php?f=79

JulieBehr
Scientist
Posts: 8
Joined: Sat Apr 23, 2016 3:55 pm

Re: Project 11430 taking 13 hours for 24k credit.

Post by JulieBehr »

Hi all,

Thanks so much to Joe_H for forwarding this to me. You are absolutely correct that the gens should have had 5,000,000 steps, and that was erroneously changed to 10x that number. This has been corrected and the work server has been restarted; please let me know if problems persist. I am sincerely sorry you've had issues with this project and greatly appreciate the feedback.
JulieBehr
Scientist
Posts: 8
Joined: Sat Apr 23, 2016 3:55 pm

Re: Project 11430 taking 13 hours for 24k credit.

Post by JulieBehr »

I'm still seeing 50,000,000 steps, even though the config file has been corrected. Still looking into the cause...
PS3EdOlkkola
Posts: 184
Joined: Tue Aug 26, 2014 9:48 pm
Hardware configuration: 10 SMP folding slots on Intel Phi "Knights Landing" system, configured as 24 CPUs/slot
9 AMD GPU folding slots
31 Nvidia GPU folding slots
50 total folding slots
Average PPD/slot = 459,500
Location: Dallas, TX

Re: Project 11430 taking 13 hours for 24k credit.

Post by PS3EdOlkkola »

Please pause the project until you've figured it out. Today my systems started getting 11430 WUs with a time per frame (TPF) of 15 minutes or more on GTX 1080s, which works out to 25K PPD; these GPUs normally produce 30x more points.
Hardware config viewtopic.php?f=66&t=17997&p=277235#p277235
tedviens
Posts: 3
Joined: Thu Nov 22, 2012 4:57 am

Re: Project 11430 taking 13 hours for 24k credit.

Post by tedviens »

I now have three GTX 980 Ti cards running. After a few months of running them, I just received my first project 11430 assignment, assigned 2016-10-16T06:19:29. It is now running at a TPF of 15:14, with an estimated credit of 24174 and 50,000,000 steps.
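
For a rough sense of what those numbers mean, the sketch below converts a TPF and a credit estimate into total run time and points per day. It assumes the WU is split into 100 frames (one per 1% of progress) and that the credit estimate is final; `estimate_ppd` and `wu_hours` are illustrative names, not part of the FAH client.

```python
def wu_hours(tpf_seconds: float, frames: int = 100) -> float:
    """Total wall-clock hours for one WU at a constant time per frame.

    frames=100 is an assumption (one frame per 1% of progress).
    """
    return tpf_seconds * frames / 3600


def estimate_ppd(tpf_seconds: float, credit: float, frames: int = 100) -> float:
    """Points per day if the GPU ran WUs like this one back to back."""
    wu_seconds = tpf_seconds * frames
    return credit * 86400 / wu_seconds


# Values reported in the post above: TPF 15:14, estimated credit 24174.
tpf = 15 * 60 + 14
hours = wu_hours(tpf)
ppd = estimate_ppd(tpf, 24174)
print(f"{hours:.1f} h per WU, about {ppd:.0f} PPD")
```

With these inputs the WU takes a little over 25 hours and yields somewhat under 23K PPD, which is in line with the low figures reported elsewhere in this thread.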
Joe_H
Site Admin
Posts: 7856
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: Project 11430 taking 13 hours for 24k credit.

Post by Joe_H »

tedviens wrote: I now have three GTX 980 Ti cards running. After a few months of running them, I just received my first project 11430 assignment, assigned 2016-10-16T06:19:29. It is now running at a TPF of 15:14, with an estimated credit of 24174 and 50,000,000 steps.
Can you provide the Project, Run, Clone and Gen numbers for that WU?

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
tedviens
Posts: 3
Joined: Thu Nov 22, 2012 4:57 am

Re: Project 11430 taking 13 hours for 24k credit.

Post by tedviens »

This is the log of transition from 11430 to following project.

Code:

04:52:25:WU00:FS01:0x21:Completed 42500000 out of 50000000 steps (85%)
05:07:39:WU00:FS01:0x21:Completed 43000000 out of 50000000 steps (0%)
05:22:53:WU00:FS01:0x21:Completed 43500000 out of 50000000 steps (1%)
05:38:06:WU00:FS01:0x21:Completed 44000000 out of 50000000 steps (2%)
05:53:20:WU00:FS01:0x21:Completed 44500000 out of 50000000 steps (3%)
06:08:35:WU00:FS01:0x21:Completed 45000000 out of 50000000 steps (4%)
06:23:48:WU00:FS01:0x21:Completed 45500000 out of 50000000 steps (5%)
06:39:02:WU00:FS01:0x21:Completed 46000000 out of 50000000 steps (6%)
06:54:16:WU00:FS01:0x21:Completed 46500000 out of 50000000 steps (7%)
07:09:29:WU00:FS01:0x21:Completed 47000000 out of 50000000 steps (8%)
07:24:43:WU00:FS01:0x21:Completed 47500000 out of 50000000 steps (9%)
07:39:57:WU00:FS01:0x21:Completed 48000000 out of 50000000 steps (10%)
07:55:11:WU00:FS01:0x21:Completed 48500000 out of 50000000 steps (11%)
08:10:25:WU00:FS01:0x21:Completed 49000000 out of 50000000 steps (12%)
08:25:39:WU00:FS01:0x21:Completed 49500000 out of 50000000 steps (13%)
08:25:40:WU01:FS01:Connecting to 171.67.108.45:80
08:25:40:WU01:FS01:Assigned to work server 140.163.4.243
08:25:40:WU01:FS01:Requesting new work unit for slot 01: RUNNING gpu:0:GM200 [GeForce GTX 980 Ti] from 140.163.4.243
08:25:40:WU01:FS01:Connecting to 140.163.4.243:8080
08:25:41:WU01:FS01:Downloading 2.67MiB
08:25:41:WU01:FS01:Download complete
08:25:41:WU01:FS01:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:11707 run:68 clone:24 gen:11 core:0x21 unit:0x000000118ca304f357e9eb5a6fe1a67b
08:40:53:WU00:FS01:0x21:Completed 50000000 out of 50000000 steps (14%)
08:40:54:WU00:FS01:0x21:Saving result file logfile_01.txt
08:40:54:WU00:FS01:0x21:Saving result file checkpointState.xml
08:40:54:WU00:FS01:0x21:Saving result file checkpt.crc
08:40:54:WU00:FS01:0x21:Saving result file log.txt
08:40:55:WU00:FS01:0x21:Saving result file positions.xtc
08:40:59:WU00:FS01:0x21:Folding@home Core Shutdown: FINISHED_UNIT
08:41:00:WU00:FS01:FahCore returned: FINISHED_UNIT (100 = 0x64)
08:41:00:WU00:FS01:Sending unit results: id:00 state:SEND error:NO_ERROR project:11430 run:1 clone:21 gen:52 core:0x21 unit:0x000000488ca304f1574a00752af75ea7
08:41:01:WU00:FS01:Uploading 43.90MiB to 140.163.4.241
08:41:01:WU01:FS01:Starting
08:41:01:WU00:FS01:Connecting to 140.163.4.241:8080
08:41:01:WU01:FS01:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:\Users\tedviensmck\AppData\Roaming\FAHClient\cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 01 -suffix 01 -version 704 -lifeline 1636 -checkpoint 15 -opencl-platform 2 -gpu-vendor nvidia -gpu 0
08:41:01:WU01:FS01:Started FahCore on PID 8956
08:41:01:WU01:FS01:Core PID:8852
08:41:01:WU01:FS01:FahCore 0x21 started
08:41:01:WU01:FS01:0x21:*********************** Log Started 2016-10-17T08:41:01Z ***********************
08:41:01:WU01:FS01:0x21:Project: 11707 (Run 68, Clone 24, Gen 11)
08:41:01:WU01:FS01:0x21:Unit: 0x000000118ca304f357e9eb5a6fe1a67b
08:41:01:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
08:41:01:WU01:FS01:0x21:Machine: 1
08:41:01:WU01:FS01:0x21:Reading tar file core.xml
08:41:01:WU01:FS01:0x21:Reading tar file system.xml
08:41:01:WU01:FS01:0x21:Reading tar file integrator.xml
08:41:01:WU01:FS01:0x21:Reading tar file state.xml
08:41:01:WU01:FS01:0x21:Digital signatures verified
08:41:01:WU01:FS01:0x21:Folding@home GPU Core21 Folding@home Core
08:41:01:WU01:FS01:0x21:Version 0.0.17
08:41:05:WU01:FS01:0x21:Completed 0 out of 7500000 steps (0%)
08:41:05:WU01:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
08:41:13:WU00:FS01:Upload complete
08:41:13:WU00:FS01:Server responded WORK_ACK (400)
08:41:13:WU00:FS01:Final credit estimate, 24170.00 points
08:41:13:WU00:FS01:Cleaning up
08:43:38:WU01:FS01:0x21:Completed 75000 out of 7500000 steps (1%)
08:46:11:WU01:FS01:0x21:Completed 150000 out of 7500000 steps (2%)
08:48:44:WU01:FS01:0x21:Completed 225000 out of 7500000 steps (3%)
08:51:18:WU01:FS01:0x21:Completed 300000 out of 7500000 steps (4%)
08:53:51:WU01:FS01:0x21:Completed 375000 out of 7500000 steps (5%)
08:56:24:WU01:FS01:0x21:Completed 450000 out of 7500000 steps (6%)
08:58:57:WU01:FS01:0x21:Completed 525000 out of 7500000 steps (7%)
09:01:29:WU01:FS01:0x21:Completed 600000 out of 7500000 steps (8%)
09:04:02:WU01:FS01:0x21:Completed 675000 out of 7500000 steps (9%)
09:06:35:WU01:FS01:0x21:Completed 750000 out of 7500000 steps (10%)
09:09:08:WU01
Mod edit: added Code tags to log listing
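
The frame time can be read off a log like the one above by differencing the timestamps of consecutive `Completed ... steps` lines. Here is a minimal sketch, assuming the line layout shown in this log; the function names are hypothetical. It recomputes the percentage from the step counts, because the logged percentage in this excerpt does not match steps/total.

```python
import re
from datetime import datetime

# Matches e.g. "05:07:39:WU00:FS01:0x21:Completed 43000000 out of 50000000 steps (0%)"
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):0x[0-9a-fA-F]+:"
    r"Completed (\d+) out of (\d+) steps \((\d+)%\)"
)


def parse_progress(lines):
    """Extract progress records from client log lines; other lines are skipped."""
    rows = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        ts, wu, slot, done, total, pct = m.groups()
        rows.append({
            "time": datetime.strptime(ts, "%H:%M:%S"),
            "wu": wu,
            "slot": slot,
            "done": int(done),
            "total": int(total),
            "logged_pct": int(pct),
            "actual_pct": 100 * int(done) // int(total),
        })
    return rows


def frame_times(rows):
    """Seconds between consecutive progress lines of the same WU and slot."""
    last_seen = {}
    deltas = []
    for r in rows:
        key = (r["wu"], r["slot"])
        if key in last_seen:
            delta = (r["time"] - last_seen[key]).total_seconds()
            if delta < 0:  # the log has no dates, so handle a midnight rollover
                delta += 86400
            deltas.append(delta)
        last_seen[key] = r["time"]
    return deltas


sample = [
    "04:52:25:WU00:FS01:0x21:Completed 42500000 out of 50000000 steps (85%)",
    "05:07:39:WU00:FS01:0x21:Completed 43000000 out of 50000000 steps (0%)",
    "05:22:53:WU00:FS01:0x21:Completed 43500000 out of 50000000 steps (1%)",
]
rows = parse_progress(sample)
deltas = frame_times(rows)
```

On the three sample lines, both intervals come out to 914 seconds (15:14), matching the TPF reported earlier in the thread.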
Gary480six
Posts: 91
Joined: Mon Jan 21, 2008 6:42 pm

Re: Project 11430 taking 13 hours for 24k credit.

Post by Gary480six »

JulieBehr wrote: Hi all,

Thanks so much to Joe_H for forwarding this to me. You are absolutely correct that the gens should have had 5,000,000 steps, and that was erroneously changed to 10x that number. This has been corrected and the work server has been restarted; please let me know if problems persist. I am sincerely sorry you've had issues with this project and greatly appreciate the feedback.

It appears that the problem persists. On October 15, several days after the problem was corrected, these defective work units were still being assigned. That day I downloaded P11430 R3 C4 G51 to a GTX 750 Ti, and it is struggling at 46 minutes per segment. That is a pretty serious waste of my resources.
Can you please pull these from the assignment server until you are sure they are all repaired?

Gary
B.A.T
Posts: 17
Joined: Sat Oct 08, 2011 8:33 am
Hardware configuration: 1: AMD FX 8150, 8 GB 1333MHZ, 256 GB Samsung evo 850, Radeon 280X
2: Intel i3 4170, 8GB ram, 120GB SSDNow V300, 2X KFA2 gtx 1080
3: AMD Phenom II x6 1100T, 8GB 1600MHZ, Radeon 280X
4: AMD FX 4300, 8GB 1600MHz, Crucial C300 128GB, 2X Radeon 290X
Location: Norway

Re: Project 11430 taking 13 hours for 24k credit.

Post by B.A.T »

Same here. Last night I got one of these (11430 run:4 clone:25 gen:51), which finished at 26464.00 points.

Code:

20:59:35:WU02:FS00:0x21:Completed 49500000 out of 50000000 steps (13%)
20:59:36:WU01:FS00:Connecting to 171.67.108.45:80
20:59:37:WU01:FS00:Assigned to work server 140.163.4.245
20:59:37:WU01:FS00:Requesting new work unit for slot 00: RUNNING gpu:3:GP104 [GeForce GTX 1070] from 140.163.4.245
20:59:37:WU01:FS00:Connecting to 140.163.4.245:8080
20:59:38:WU01:FS00:Downloading 14.50MiB
20:59:44:WU01:FS00:Download 13.36%
20:59:50:WU01:FS00:Download 27.59%
20:59:56:WU01:FS00:Download 40.52%
21:00:02:WU01:FS00:Download 53.89%
21:00:08:WU01:FS00:Download 68.98%
21:00:14:WU01:FS00:Download 82.34%
21:00:20:WU01:FS00:Download 91.83%
21:00:23:WU01:FS00:Download complete
21:00:23:WU01:FS00:Received Unit: id:01 state:DOWNLOAD error:NO_ERROR project:10496 run:50 clone:12 gen:21 core:0x21 unit:0x0000001a8ca304f556bba9b13fbb21d5
21:05:19:WU00:FS01:0x21:Completed 1750000 out of 2500000 steps (70%)
21:11:09:WU00:FS01:0x21:Completed 1775000 out of 2500000 steps (71%)
21:12:41:WU02:FS00:0x21:Completed 50000000 out of 50000000 steps (14%)
21:12:43:WU02:FS00:0x21:Saving result file logfile_01.txt
21:12:43:WU02:FS00:0x21:Saving result file checkpointState.xml
21:12:44:WU02:FS00:0x21:Saving result file checkpt.crc
21:12:44:WU02:FS00:0x21:Saving result file log.txt
21:12:44:WU02:FS00:0x21:Saving result file positions.xtc
21:12:51:WU02:FS00:0x21:Folding@home Core Shutdown: FINISHED_UNIT
21:12:51:WU02:FS00:FahCore returned: FINISHED_UNIT (100 = 0x64)
21:12:52:WU02:FS00:Sending unit results: id:02 state:SEND error:NO_ERROR project:11430 run:4 clone:25 gen:51 core:0x21 unit:0x0000003c8ca304f1574a00893e1f41fc
21:12:52:WU02:FS00:Uploading 43.15MiB to 140.163.4.241
21:12:52:WU01:FS00:Starting
21:12:52:WU02:FS00:Connecting to 140.163.4.241:8080
21:12:52:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/Core_21.fah/FahCore_21.exe -dir 01 -suffix 01 -version 704 -lifeline 5848 -checkpoint 15 -gpu 1 -gpu-vendor nvidia
21:12:52:WU01:FS00:Started FahCore on PID 4088
21:12:52:WU01:FS00:Core PID:4496
21:12:52:WU01:FS00:FahCore 0x21 started
21:12:52:WU01:FS00:0x21:*********************** Log Started 2016-10-16T21:12:52Z ***********************
21:12:52:WU01:FS00:0x21:Project: 10496 (Run 50, Clone 12, Gen 21)
21:12:52:WU01:FS00:0x21:Unit: 0x0000001a8ca304f556bba9b13fbb21d5
21:12:52:WU01:FS00:0x21:CPU: 0x00000000000000000000000000000000
21:12:52:WU01:FS00:0x21:Machine: 0
21:12:52:WU01:FS00:0x21:Reading tar file core.xml
21:12:52:WU01:FS00:0x21:Reading tar file system.xml
21:12:54:WU01:FS00:0x21:Reading tar file integrator.xml
21:12:54:WU01:FS00:0x21:Reading tar file state.xml
21:12:56:WU01:FS00:0x21:Digital signatures verified
21:12:56:WU01:FS00:0x21:Folding@home GPU Core21 Folding@home Core
21:12:56:WU01:FS00:0x21:Version 0.0.17
21:12:58:WU02:FS00:Upload 6.66%
21:13:25:WU01:FS00:0x21:Completed 0 out of 2000000 steps (0%)
21:13:25:WU01:FS00:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
21:15:29:WU01:FS00:0x21:Completed 20000 out of 2000000 steps (1%)
21:16:59:WU00:FS01:0x21:Completed 1800000 out of 2500000 steps (72%)
21:17:32:WU01:FS00:0x21:Completed 40000 out of 2000000 steps (2%)
21:19:35:WU01:FS00:0x21:Completed 60000 out of 2000000 steps (3%)
21:19:51:WU02:FS00:Upload 7.24%
21:19:51:WARNING:WU02:FS00:Exception: Failed to send results to work server: Transfer failed
21:19:51:WU02:FS00:Trying to send results to collection server
21:19:51:WU02:FS00:Uploading 43.15MiB to 128.252.203.2
21:19:51:WU02:FS00:Connecting to 128.252.203.2:8080
21:19:57:WU02:FS00:Upload 5.79%
21:20:52:WU02:FS00:Upload 6.81%
21:20:58:WU02:FS00:Upload 8.40%
21:21:04:WU02:FS00:Upload 8.98%
21:21:10:WU02:FS00:Upload 11.44%
21:21:16:WU02:FS00:Upload 16.22%
21:21:22:WU02:FS00:Upload 21.15%
21:21:28:WU02:FS00:Upload 25.78%
21:21:34:WU02:FS00:Upload 30.42%
21:21:38:WU01:FS00:0x21:Completed 80000 out of 2000000 steps (4%)
21:21:40:WU02:FS00:Upload 34.62%
21:21:46:WU02:FS00:Upload 38.82%
21:21:52:WU02:FS00:Upload 43.17%
21:21:58:WU02:FS00:Upload 48.24%
21:22:04:WU02:FS00:Upload 53.16%
21:22:10:WU02:FS00:Upload 57.94%
21:22:16:WU02:FS00:Upload 62.72%
21:22:22:WU02:FS00:Upload 67.65%
21:22:28:WU02:FS00:Upload 72.57%
21:22:35:WU02:FS00:Upload 76.63%
21:22:41:WU02:FS00:Upload 77.35%
21:22:47:WU02:FS00:Upload 80.25%
21:22:53:WU02:FS00:Upload 85.18%
21:22:55:WU00:FS01:0x21:Completed 1825000 out of 2500000 steps (73%)
21:22:59:WU02:FS00:Upload 90.10%
21:23:05:WU02:FS00:Upload 95.17%
21:23:21:WU02:FS00:Upload complete
21:23:21:WU02:FS00:Server responded WORK_ACK (400)
21:23:21:WU02:FS00:Final credit estimate, 26464.00 points
21:23:21:WU02:FS00:Cleaning up
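
The Project, Run, Clone and Gen numbers requested earlier in the thread appear in the `Received Unit` and `Sending unit results` lines of the client log. Here is a small sketch for pulling them out, assuming the `project:N run:N clone:N gen:N` layout seen in this log; `extract_prcg` is a hypothetical helper, not a client feature.

```python
import re

# Layout taken from the "Sending unit results" line in the log above.
PRCG_RE = re.compile(r"project:(\d+) run:(\d+) clone:(\d+) gen:(\d+)")


def extract_prcg(lines):
    """Return (project, run, clone, gen) tuples found in client log lines."""
    found = []
    for line in lines:
        m = PRCG_RE.search(line)
        if m:
            found.append(tuple(int(g) for g in m.groups()))
    return found


log = [
    "21:12:52:WU02:FS00:Sending unit results: id:02 state:SEND error:NO_ERROR "
    "project:11430 run:4 clone:25 gen:51 core:0x21 "
    "unit:0x0000003c8ca304f1574a00893e1f41fc",
    "21:12:52:WU02:FS00:Uploading 43.15MiB to 140.163.4.241",
]
prcg = extract_prcg(log)
print(prcg)
```

Filtering the result for project 11430 makes it easy to report only the affected WUs.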
JulieBehr
Scientist
Posts: 8
Joined: Sat Apr 23, 2016 3:55 pm

Re: Project 11430 taking 13 hours for 24k credit.

Post by JulieBehr »

I had updated the base credit as a temporary patch to give appropriate credit for the bigger WUs, but this change was reverted. The project is now paused.