Announcing project 5903 (Core 14)

Postby ihaque » Fri Mar 13, 2009 11:59 pm

Project 5903 just went up on advmethods. This is an extension of project 5902 to correct for my own mistake in not making enough of those WUs :oops:. Since 5902's already been through the whole testing cycle and this project is an extension of that, we'll be moving through the usual testing regime more quickly than normal.

Please note that the behavior of this core is different from what you may expect. The following behaviors are NORMAL, and the points credit takes them into account:

* Varying time per frame
* Low GPU temperatures
* Fluctuating CPU usage (somewhat higher than GPU core 11).


WUs are worth 1680 points, with a preferred deadline of 3 days and a final deadline of 5 days. Thanks for folding!
ihaque
Pande Group Member
 
Posts: 239
Joined: Mon Dec 03, 2007 4:20 am
Location: Stanford

Re: Announcing project 5903 (Core 14)

Postby stevehat1 » Sat Mar 14, 2009 3:16 am

Got 9 of them running ATM, farthest along is 38% and no EUE's.

2x 9800 GX2s @700/1836/1050, 2x 9800 GTX @810/2052/1100 and 3x 8800GTs that are slightly OC'ed. Temps and CPU usage seem to be in line with previous 59xx WUs. Looks good so far!! :D
stevehat1
 
Posts: 81
Joined: Fri Jun 06, 2008 12:33 pm

Re: Announcing project 5903 (Core 14)

Postby AZBrandon » Sat Mar 14, 2009 5:27 am

Just got my first one.

9800GTX+ 738/1890/2000

At 35% completion it appears to be averaging 5150 ppd for me. A nice 25% or so more than I get from the 511 point WU's:
Code:
Project : 5903
 Core    : Unknown
 Frames  : 100
 Credit  : 1680


 -- FAH-GPU --

 Min. Time / Frame : 4mn 27s  - 5436.40 ppd
 Avg. Time / Frame : 4mn 33s  - 5316.92 ppd
 Cur. Time / Frame : 5mn 09s  - 4697.48 ppd
 R3F. Time / Frame : 5mn 03s  - 4790.50 ppd
 Eff. Time / Frame : 4mn 42s  - 5147.23 ppd
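
The PPD figures in a readout like this follow directly from the time per frame. A minimal sketch of the arithmetic, assuming the monitoring tool simply scales the WU credit by how many WUs would fit into a day at the given frame time (the function name is made up for illustration):

```python
def ppd(credit, frames, seconds_per_frame):
    """Points per day if every frame took `seconds_per_frame` seconds."""
    seconds_per_wu = frames * seconds_per_frame
    wus_per_day = 86400 / seconds_per_wu   # 86400 seconds in a day
    return credit * wus_per_day

# 1680-point, 100-frame WU at 4mn 27s per frame
print("%.2f" % ppd(1680, 100, 4 * 60 + 27))
```

For the 4mn 27s minimum frame time this gives 5436.40, which matches the readout; the same formula reproduces the other lines (for example 4mn 33s gives 5316.92).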
AZBrandon
 
Posts: 225
Joined: Sat Jan 17, 2009 1:43 am

Re: Announcing project 5903 (Core 14)

Postby powerarmour » Sat Mar 14, 2009 2:11 pm

No problems with this one on a 9800GTX+ (740/1836/1100) so far, currently averaging 5129 PPD.
powerarmour
 
Posts: 176
Joined: Wed Oct 29, 2008 1:00 am
Location: Surrey, UK

Re: Announcing project 5903 (Core 14)

Postby ihaque » Sat Mar 14, 2009 6:59 pm

This one's going out to the world. Thanks for the advance testing.
ihaque
Pande Group Member
 
Posts: 239
Joined: Mon Dec 03, 2007 4:20 am
Location: Stanford

Re: Announcing project 5903 (Core 14)

Postby neo23 » Sun Mar 15, 2009 9:05 am

Just got my first one on an 8800 GT 256 MB (core 700, shader 1783, RAM 854).
According to FahMon, PPD is 3829 at 27% completed. Running fine so far.
neo23
 
Posts: 10
Joined: Sun Aug 17, 2008 6:53 pm

Re: Announcing project 5903 (Core 14)

Postby mikeb12 » Sun Mar 15, 2009 11:34 am

Vista64 - 178.24
Project : 5903
Credit : 1680

GTX260(192)-1512 shader --
Min. Time / Frame : 3mn 32s - 6846.79 ppd
Avg. Time / Frame : 3mn 45s - 6451.20 ppd

9800GT-1782 shader --
Min. Time / Frame : 5mn 16s - 4593.42 ppd
Avg. Time / Frame : 5mn 19s - 4550.22 ppd
mikeb12
 
Posts: 185
Joined: Tue Feb 12, 2008 11:51 am
Location: South Carolina USA

Re: Announcing project 5903 (Core 14)

Postby EdmundBlackadder » Mon Mar 16, 2009 12:35 am

Project : 5903
Core : Unknown
Frames : 100
Credit : 1680
-- GPU-1 --
Min. Time / Frame : 4mn 15s - 5692.24 ppd
Avg. Time / Frame : 4mn 17s - 5647.94 ppd
eff. Time / Frame : 4mn 24s - 5498.18 ppd
Sorry, more errors on my end (running unattended); otherwise stable (5903 done: 4; 5903 failed: 1; 25%). Non-59xx WUs are not failing any more.
Code:
[23:16:14] Folding@Home GPU Core - Beta
[23:16:14] Version 1.24 (Mon Mar 2 19:49:32 PST 2009)
[23:16:14]
[23:16:14] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[23:16:14] Build host: vspm46
[23:16:14] Board Type: Nvidia
[23:16:14] Core      :
[23:16:14] Preparing to commence simulation
[23:16:14] - Looking at optimizations...
[23:16:14] - Created dyn
[23:16:14] - Files status OK
[23:16:14] - Expanded 64402 -> 357580 (decompressed 555.2 percent)
[23:16:14] Called DecompressByteArray: compressed_data_size=64402 data_size=357580, decompressed_data_size=357580 diff=0
[23:16:14] - Digital signature verified
[23:16:14]
[23:16:14] Project: 5903 (Run 3, Clone 622, Gen 0)
[23:16:14]
[23:16:14] Assembly optimizations on if available.
[23:16:14] Entering M.D.
[23:16:20] Tpr hash work/wudata_07.tpr:  4185968345 573952239 3417775324 1452465871 3104706293
[23:16:20] Working on Protein
[23:16:21] Client config found, loading data.
[23:16:21] Starting GUI Server
[23:20:24] Completed 1%
[04:17:26] Completed 68%
[04:19:00] SEH code: 3221225477
[04:19:00] Run: exception thrown during GuardedRun
[04:19:00] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[04:19:00] Going to send back what have done -- stepsTotalG=8000000
[04:19:00] Work fraction=0.6828 steps=8000000.
[04:19:04] logfile size=234499 infoLength=234499 edr=0 trr=23
[04:19:04] - Writing 235035 bytes of core data to disk...
[04:19:04] Done: 234523 -> 7409 (compressed to 3.1 percent)
[04:19:04]   ... Done.
[04:19:04]
[04:19:04] Folding@home Core Shutdown: UNSTABLE_MACHINE
[04:19:07] CoreStatus = 7A (122)
[04:19:07] Sending work to server
[04:19:07] Project: 5903 (Run 3, Clone 622, Gen 0)
[04:19:07] - Read packet limit of 540015616... Set to 524286976.
[04:19:07] + Attempting to send results [March 16 04:19:07 UTC]
[04:19:09] + Results successfully sent
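
The numeric codes in a log like this are easier to read in hex. A small sketch that converts them: the only outside fact used is that 0xC0000005 is the Windows access-violation status (STATUS_ACCESS_VIOLATION). The CoreStatus names are just copied from the "Core Shutdown" lines posted in this thread, and the dictionary and function names are hypothetical.

```python
# Windows NTSTATUS value: 0xC0000005 is STATUS_ACCESS_VIOLATION.
KNOWN_SEH = {0xC0000005: "STATUS_ACCESS_VIOLATION"}

# Pairs seen together in the logs in this thread (not an official table).
CORE_STATUS = {0x7A: "UNSTABLE_MACHINE", 0x72: "EARLY_UNIT_END"}

def describe_seh(code):
    """Format a decimal SEH code as hex plus a name, if known."""
    return "0x%08X %s" % (code, KNOWN_SEH.get(code, "(unknown)"))

def describe_core_status(code):
    """Format a CoreStatus value the way the client prints it, plus a name."""
    return "%02X (%d) %s" % (code, code, CORE_STATUS.get(code, "(unknown)"))

print(describe_seh(3221225477))
print(describe_core_status(122))
```

So "SEH code: 3221225477" is 0xC0000005, an access violation inside the core, and "7A (122)" is the same number written in hex and decimal.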
Last edited by EdmundBlackadder on Mon Mar 16, 2009 9:31 am, edited 2 times in total.
HW: ASUS P5B-Deluxe P45 BIOS v1306,E8600@4GHZ,8GB OCZ @1066MHZ,XFX 9800GTX+ S@1944MHZ;
SW: XP 64bit,182.06,client 6.23 -BIGWU -advmethods,PhysX on,BOINC 6.4.5;24/7
EdmundBlackadder
 
Posts: 29
Joined: Tue Feb 10, 2009 3:01 pm
Location: Germany

Re: Announcing project 5903 (Core 14)

Postby P5-133XL » Mon Mar 16, 2009 3:30 am

Project: 5903 (Run 5, Clone 605, Gen 2) appears to be rejected because of a checkpoint that shouldn't exist. It then starts over and seems to be able to run. I'm showing the previous WUs as well, to show that there shouldn't be a checkpoint.

Code:
[01:17:02] Project: 5903 (Run 8, Clone 283, Gen 2)
[01:17:02] - Read packet limit of 540015616... Set to 524286976.


[01:17:02] + Attempting to send results [March 16 01:17:02 UTC]
[01:17:04] + Results successfully sent
[01:17:04] Thank you for your contribution to Folding@Home.
[01:17:04] + Number of Units Completed: 21

[01:17:08] - Preparing to get new work unit...
[01:17:08] + Attempting to get work packet
[01:17:08] - Connecting to assignment server
[01:17:08] - Successful: assigned to (171.64.122.70).
[01:17:08] + News From Folding@Home: GPU folding beta
[01:17:09] Loaded queue successfully.
[01:17:10] + Closed connections
[01:17:10]
[01:17:10] + Processing work unit
[01:17:10] Core required: FahCore_14.exe
[01:17:10] Core found.
[01:17:10] Working on queue slot 00 [March 16 01:17:10 UTC]
[01:17:10] + Working ...
[01:17:10]
[01:17:10] *------------------------------*
[01:17:10] Folding@Home GPU Core - Beta
[01:17:10] Version 1.24 (Mon Mar 2 19:49:32 PST 2009)
[01:17:10]
[01:17:10] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[01:17:10] Build host: vspm46
[01:17:10] Board Type: Nvidia
[01:17:10] Core      :
[01:17:10] Preparing to commence simulation
[01:17:10] - Looking at optimizations...
[01:17:10] - Created dyn
[01:17:10] - Files status OK
[01:17:10] - Expanded 68392 -> 357580 (decompressed 522.8 percent)
[01:17:10] Called DecompressByteArray: compressed_data_size=68392 data_size=357580, decompressed_data_size=357580 diff=0
[01:17:10] - Digital signature verified
[01:17:10]
[01:17:10] Project: 5903 (Run 0, Clone 631, Gen 1)
[01:17:10]
[01:17:10] Assembly optimizations on if available.
[01:17:10] Entering M.D.
[01:17:16] Tpr hash work/wudata_00.tpr:  583365291 872409027 3555431135 2449003968 221186236
[01:17:20] Working on Protein
[01:17:21] Client config found, loading data.
[01:17:21] Starting GUI Server
[01:19:21] SEH code: 3221225477
[01:19:21] Run: exception thrown during GuardedRun
[01:19:21] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[01:19:21] Going to send back what have done -- stepsTotalG=8000000
[01:19:21] Work fraction=0.0025 steps=8000000.
[01:19:25] logfile size=0 infoLength=0 edr=0 trr=23
[01:19:25] - Writing 641 bytes of core data to disk...
[01:19:25] Done: 129 -> 126 (compressed to 97.6 percent)
[01:19:25]   ... Done.
[01:19:26]
[01:19:26] Folding@home Core Shutdown: EARLY_UNIT_END
[01:19:29] CoreStatus = 72 (114)
[01:19:29] Sending work to server
[01:19:29] Project: 5903 (Run 0, Clone 631, Gen 1)
[01:19:29] - Read packet limit of 540015616... Set to 524286976.


[01:19:29] + Attempting to send results [March 16 01:19:29 UTC]
[01:19:29] + Results successfully sent
[01:19:29] Thank you for your contribution to Folding@Home.
[01:19:33] - Preparing to get new work unit...
[01:19:33] + Attempting to get work packet
[01:19:33] - Connecting to assignment server
[01:19:34] - Successful: assigned to (171.64.122.70).
[01:19:34] + News From Folding@Home: GPU folding beta
[01:19:34] Loaded queue successfully.
[01:19:34] + Closed connections
[01:19:39]
[01:19:39] + Processing work unit
[01:19:39] Core required: FahCore_14.exe
[01:19:39] Core found.
[01:19:39] Working on queue slot 01 [March 16 01:19:39 UTC]
[01:19:39] + Working ...
[01:19:40]
[01:19:40] *------------------------------*
[01:19:40] Folding@Home GPU Core - Beta
[01:19:40] Version 1.24 (Mon Mar 2 19:49:32 PST 2009)
[01:19:40]
[01:19:40] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[01:19:40] Build host: vspm46
[01:19:40] Board Type: Nvidia
[01:19:40] Core      :
[01:19:40] Preparing to commence simulation
[01:19:40] - Looking at optimizations...
[01:19:40] - Files status OK
[01:19:40] - Expanded 68572 -> 357580 (decompressed 521.4 percent)
[01:19:40] Called DecompressByteArray: compressed_data_size=68572 data_size=357580, decompressed_data_size=357580 diff=0
[01:19:40] - Digital signature verified
[01:19:40]
[01:19:40] Project: 5903 (Run 5, Clone 605, Gen 2)
[01:19:40]
[01:19:40] Assembly optimizations on if available.
[01:19:40] Entering M.D.
[01:19:46] Will resume from checkpoint file
[01:19:46] Tpr hash work/wudata_01.tpr:  799377372 3197405648 2810141372 3888859366 698090282
[01:19:51] Working on Protein
[01:19:52] Client config found, loading data.
[01:19:52] Resuming from checkpoint
[01:19:52] fcCheckPointResume: retrieved and current tpr file hash:
[01:19:52]    0      2700001    799377372
[01:19:52]    1   1048811363   3197405648
[01:19:52]    2   3188350757   2810141372
[01:19:52]    3   1068767803   3888859366
[01:19:52]    4   1048929131    698090282
[01:19:52] fcCheckPointResume: file hashes different -- aborting.
[01:19:52] mdrun_gpu returned
[01:19:52] Checkpoint failure
[01:19:52]
[01:19:52] Folding@home Core Shutdown: UNSTABLE_MACHINE
[01:19:56] CoreStatus = 7A (122)
[01:19:56] Sending work to server
[01:19:56] Project: 5903 (Run 5, Clone 605, Gen 2)
[01:19:56] - Read packet limit of 540015616... Set to 524286976.
[01:19:56] - Error: Could not get length of results file work/wuresults_01.dat
[01:19:56] - Error: Could not read unit 01 file. Removing from queue.
[01:19:56] - Preparing to get new work unit...
[01:19:56] + Attempting to get work packet
[01:19:56] - Connecting to assignment server
[01:19:56] - Successful: assigned to (171.64.122.70).
[01:19:56] + News From Folding@Home: GPU folding beta
[01:19:56] Loaded queue successfully.
[01:19:57] + Closed connections
[01:20:02]
[01:20:02] + Processing work unit
[01:20:02] Core required: FahCore_14.exe
[01:20:02] Core found.
[01:20:02] Working on queue slot 02 [March 16 01:20:02 UTC]
[01:20:02] + Working ...
[01:20:02]
[01:20:02] *------------------------------*
[01:20:02] Folding@Home GPU Core - Beta
[01:20:02] Version 1.24 (Mon Mar 2 19:49:32 PST 2009)
[01:20:02]
[01:20:02] Compiler  : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[01:20:02] Build host: vspm46
[01:20:02] Board Type: Nvidia
[01:20:02] Core      :
[01:20:02] Preparing to commence simulation
[01:20:02] - Looking at optimizations...
[01:20:02] - Created dyn
[01:20:02] - Files status OK
[01:20:02] - Expanded 68572 -> 357580 (decompressed 521.4 percent)
[01:20:02] Called DecompressByteArray: compressed_data_size=68572 data_size=357580, decompressed_data_size=357580 diff=0
[01:20:02] - Digital signature verified
[01:20:02]
[01:20:02] Project: 5903 (Run 5, Clone 605, Gen 2)
[01:20:02]
[01:20:03] Assembly optimizations on if available.
[01:20:03] Entering M.D.
[01:20:08] Tpr hash work/wudata_02.tpr:  799377372 3197405648 2810141372 3888859366 698090282
[01:20:14] Working on Protein
[01:20:15] Client config found, loading data.
[01:20:15] Starting GUI Server
[01:29:41] Completed 1%
[01:39:52] Completed 2%
[01:49:59] Completed 3%
[01:59:48] Completed 4%
[02:09:59] Completed 5%
[02:20:06] Completed 6%
[02:30:02] Completed 7%
[02:40:13] Completed 8%
[02:50:14] Completed 9%
[03:00:37] Completed 10%
[03:10:53] Completed 11%
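
The checkpoint rejection in the log above comes down to the five-word tpr hash stored with the checkpoint not matching the hash of the current tpr file. A rough sketch of that comparison, parsing the `fcCheckPointResume` table straight from the log text; this only illustrates the check as it shows up in the log output, not the core's actual code, and the function name is invented:

```python
import re

LOG = """\
[01:19:52]    0      2700001    799377372
[01:19:52]    1   1048811363   3197405648
[01:19:52]    2   3188350757   2810141372
[01:19:52]    3   1068767803   3888859366
[01:19:52]    4   1048929131    698090282
"""

# timestamp, word index, retrieved (checkpoint) hash word, current tpr hash word
ROW = re.compile(r"^\[\d\d:\d\d:\d\d\]\s+(\d+)\s+(\d+)\s+(\d+)\s*$")

def mismatched_words(log_text):
    """Return the indices of hash words where retrieved != current."""
    bad = []
    for line in log_text.splitlines():
        m = ROW.match(line)
        if m:
            index, retrieved, current = map(int, m.groups())
            if retrieved != current:
                bad.append(index)
    return bad

print(mismatched_words(LOG))
```

Here all five words differ, so the stored checkpoint does not look like a slightly stale copy of this WU's state; one possibility is that it was left behind by a different work unit, but the log alone does not confirm that.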
P5-133XL
Site Moderator
 
Posts: 4001
Joined: Sun Dec 02, 2007 4:36 am
Location: Salem. OR USA

Re: Announcing project 5903 (Core 14)

Postby Smirnoff » Mon Mar 16, 2009 12:39 pm

These things are really time consuming, but at least my GPUs aren't getting toasted, like with the 511 pointers. :)
Smirnoff
 
Posts: 23
Joined: Wed Oct 22, 2008 12:15 pm

Re: Announcing project 5903 (Core 14)

Postby db597 » Mon Mar 16, 2009 2:37 pm

Smirnoff wrote: but at least my GPUs aren't getting toasted, like with the 511 pointers. :)


Makes you wonder though... in terms of GPU durability, which is worse - high constant temps or fluctuating temps/loads?
MSI GTX460 336SP 1GB | Intel Q6600 | 2GB DDR2
Palit GTX260 216SP 55nm | Intel E5200 | 2GB DDR2
db597
 
Posts: 124
Joined: Mon Dec 24, 2007 12:26 am

Re: Announcing project 5903 (Core 14)

Postby AZBrandon » Mon Mar 16, 2009 3:21 pm

Well, after a month of zero errors at 1890 MHz shaders, I've gotten two failures already with the 5903. I returned my GTX+ to stock shaders (it was already stock for core and memory) and I guess we'll see how it does. It seems these new units are either more sensitive to error or the on/off/on/off pulsing is actually causing more instability. You know how metals fatigue when they are repeatedly heated and cooled? Perhaps that's what the new WUs are doing: causing fatigue by repeatedly heating and cooling the GPU instead of heating it to a constant temperature and leaving it there. Like db597 said above, I'm starting to wonder whether millions of heat cycles aren't worse than a nice steady temperature that doesn't cycle up and down every 2 seconds.

Code:
[19:22:54] Completed 73%
[19:27:43] Completed 74%
[19:32:17] Completed 75%
[19:35:38] SEH code: 3221225477
[19:35:38] Run: exception thrown during GuardedRun
[19:35:38] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[19:35:38] Going to send back what have done -- stepsTotalG=8000000
[19:35:38] Work fraction=0.7577 steps=8000000.
[19:35:42] logfile size=108167 infoLength=108167 edr=0 trr=23
[19:35:42] - Writing 108703 bytes of core data to disk...
[19:35:42] Done: 108191 -> 6783 (compressed to 6.2 percent)
[19:35:42]   ... Done.
[19:35:42]
[19:35:42] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:35:45] CoreStatus = 7A (122)
[19:35:45] Sending work to server
[19:35:45] Project: 5903 (Run 11, Clone 79, Gen 1)


Code:
[08:12:14] Completed 89%
[08:16:47] Completed 90%
[08:21:27] Completed 91%
[08:24:56] SEH code: 3221225477
[08:24:56] Run: exception thrown during GuardedRun
[08:24:56] Run: exception thrown in GuardedRun -- Gromacs cannot continue further.
[08:24:56] Going to send back what have done -- stepsTotalG=8000000
[08:24:56] Work fraction=0.9169 steps=8000000.
[08:25:00] logfile size=137438 infoLength=137438 edr=0 trr=23
[08:25:00] - Writing 137974 bytes of core data to disk...
[08:25:00] Done: 137462 -> 7506 (compressed to 5.4 percent)
[08:25:00]   ... Done.
[08:25:00]
[08:25:00] Folding@home Core Shutdown: EARLY_UNIT_END
[08:25:02] CoreStatus = 72 (114)
[08:25:02] Sending work to server
[08:25:02] Project: 5903 (Run 8, Clone 579, Gen 1)
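
For anyone who wants to see how much the frame time actually moves around, it can be read off the "Completed N%" lines. A small sketch, assuming consecutive percentages are present in the log and that the timestamps carry no date (so a backwards jump is treated as crossing midnight); the function name is made up:

```python
import re
from datetime import datetime, timedelta

LOG = """\
[19:22:54] Completed 73%
[19:27:43] Completed 74%
[19:32:17] Completed 75%
"""

STEP = re.compile(r"^\[(\d\d:\d\d:\d\d)\] Completed (\d+)%")

def frame_times(log_text):
    """Seconds between consecutive 'Completed N%' lines."""
    stamps = []
    for line in log_text.splitlines():
        m = STEP.match(line)
        if m:
            stamps.append(datetime.strptime(m.group(1), "%H:%M:%S"))
    deltas = []
    for earlier, later in zip(stamps, stamps[1:]):
        if later < earlier:
            # No date in the log, so assume the clock rolled past midnight.
            later += timedelta(days=1)
        deltas.append(int((later - earlier).total_seconds()))
    return deltas

print(frame_times(LOG))
```

On the first log above this gives 289 and 274 seconds for the last two completed frames, i.e. roughly 4mn 49s and 4mn 34s.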
AZBrandon
 
Posts: 225
Joined: Sat Jan 17, 2009 1:43 am

Re: Announcing project 5903 (Core 14)

Postby ihaque » Mon Mar 16, 2009 6:10 pm

AZBrandon wrote: It seems these new units are either more sensitive to error


That's quite possible, especially for the 5902/5903 series. Remember that these WUs take longer to execute (hence the larger # of points). If overclocked shaders only occasionally make mistakes, and it takes a few to crash out a WU, it's more likely that a WU will pick up enough errors to crash over a longer execution.

It's also possible that the core itself has some crashiness, but many of the problems I've seen so far can be attributed to hardware.
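
To put rough numbers on the longer-WU argument, here is a toy model. It assumes hardware errors arrive independently at a constant rate (a Poisson process) and that a WU crashes once it has accumulated some threshold number of them. The rate, threshold, and run times are made-up illustration values, not measurements from these projects.

```python
import math

def crash_probability(errors_per_hour, hours, errors_to_crash):
    """P(at least `errors_to_crash` errors occur) for a Poisson process."""
    mean = errors_per_hour * hours
    # P(N < k) = sum over n < k of e^-mean * mean^n / n!
    p_fewer = sum(math.exp(-mean) * mean ** n / math.factorial(n)
                  for n in range(errors_to_crash))
    return 1.0 - p_fewer

short_wu = crash_probability(0.1, 2.0, 3)   # hypothetical 2-hour WU
long_wu = crash_probability(0.1, 8.0, 3)    # hypothetical 8-hour WU
print(short_wu < long_wu)
```

With the same error rate, the longer run comes out far more likely to reach the crash threshold, which is the effect described above. The model says nothing about whether the real errors come from overclocked shaders or from the core itself.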
ihaque
Pande Group Member
 
Posts: 239
Joined: Mon Dec 03, 2007 4:20 am
Location: Stanford

Re: Announcing project 5903 (Core 14)

Postby 50KALKILLER » Sat Mar 21, 2009 12:08 am

From what I have now seen 3 times, p5903 WUs on core 14 hate being paused and then resumed at any later time. I've now had to restart the GPU client 3 times because, after pausing the folding and resuming it later, it just hangs there while still writing the core checkpoints. Any word on why this might be happening? Is there any chance that the core hates being paused while it is doing work before resting again, or vice versa?

9800GTX+ @ stock(756/1836/1123)
182.08 forceware
Vista home prem x86_64
50KALKILLER
 
Posts: 21
Joined: Fri Mar 13, 2009 12:05 am

Re: Announcing project 5903 (Core 14)

Postby ihaque » Sat Mar 21, 2009 12:11 am

50KALKILLER wrote: From what I have now seen 3 times, p5903 WUs on core 14 hate being paused and then resumed at any later time. I've now had to restart the GPU client 3 times because, after pausing the folding and resuming it later, it just hangs there while still writing the core checkpoints. Any word on why this might be happening? Is there any chance that the core hates being paused while it is doing work before resting again, or vice versa?


It's hard for me to say, because I can't consistently reproduce the issue on my machine here - checkpointing seems to be working fine for me. However, I'm running on XP32, so I wonder if there's some interaction on Vista or 64-bit that's causing problems.
ihaque
Pande Group Member
 
Posts: 239
Joined: Mon Dec 03, 2007 4:20 am
Location: Stanford
