R9-290X not recognizing _x16 as task

It seems that a lot of GPU problems revolve around specific versions of drivers. Though AMD has their own support structure, you can often learn from information reported by others who fold.

prjindigo
Posts: 31
Joined: Wed Mar 30, 2011 7:49 am

R9-290X not recognizing _x16 as task

Post by prjindigo »

Catalyst 13.12, Sapphire R9 290X, "client-type beta" declared. WU 11293 keeps reloading on slot reset; PPD reads about 2k, with a frame time of about 29 minutes. Every 22 seconds the load spikes from zero to about 80%, but the GPU clock never leaves 300 MHz.

Same configuration was hammering out WU 8900 _x17 last night at 2:54 per frame. This is the secondary card in the system beside a Titan; no driver conflicts encountered in three months at redline production. Both are in PCIe 3.0 x16 slots (Sabertooth X79).
(systems monitored using EVGA Precision 4.2.1; R9 idling at 41°C, getting jacksquat done)

Is there some way to force the _x17 module to load, to check whether it is just an _x16 incompatibility? If not, can you block _x16 modules from being sent to me? _x16 isn't even keeping the card in its safe operating temperature range.
(Titan is pounding out WU 7660 _x15 at 4:20 per frame just fine)
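
To put the frame times above in perspective, here is a rough sketch of the arithmetic. It assumes each work unit is reported as 100 frames (1% of progress per frame) and that the 29-minute figure is a time per frame; the function names are made up for illustration.

```python
def parse_frame_time(text):
    """Convert a frame time such as '2:54' (M:SS) or '1:02:30' (H:MM:SS) to seconds."""
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def hours_per_wu(frame_time, frames=100):
    """Estimated hours to finish one WU, assuming `frames` frames per WU."""
    return parse_frame_time(frame_time) * frames / 3600


# Frame times quoted in the post; 100 frames per WU is an assumption.
reported = [
    ("WU 8900 _x17 on R9 290X", "2:54"),
    ("WU 7660 _x15 on Titan", "4:20"),
    ("WU 11293 _x16 on R9 290X", "29:00"),
]

for label, frame_time in reported:
    print(f"{label}: about {hours_per_wu(frame_time):.1f} h per WU")
```

Under those assumptions the _x17 unit takes a little under five hours, while the _x16 unit would need roughly two days on the same card.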
bollix47
Posts: 2941
Joined: Sun Dec 02, 2007 5:04 am
Location: Canada

Re: R9-290X not recognizing _x16 as task

Post by bollix47 »

Please see the answer here.
prjindigo
Posts: 31
Joined: Wed Mar 30, 2011 7:49 am

Re: R9-290X not recognizing _x16 as task

Post by prjindigo »


Re: Project 11923 (31,462,49)

Post by Joe_H » Sat Jan 04, 2014 10:07 am
There may not be much you can do about the 11293 WU failing. Core_16 works better with older Catalyst drivers; version 12.8 is the newest one known to be fairly stable. Some driver versions since then have worked, but inefficiently, and others have not worked at all. Core_17, however, works better with newer drivers.

At the moment the WS for Project 8900 is listed as down. That might be related to the blog post saying that they are in the process of generating more Core_17 WUs, to be ready possibly later today. In the meantime, the two Core_16 projects were announced to be close to completion a few months ago, and their WS is set to a low assignment priority so they only get assigned when Core_17 work is unavailable for AMD cards.
So if that work unit fails to operate correctly on AMD drivers above 12.8, why is it still being assigned to cards that will not run on drivers below 13.0a, wasting fifteen to twenty hours in which those cards could finish off three _x17 modules?
bruce
Posts: 20910
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: R9-290X not recognizing _x16 as task

Post by bruce »

The FAH servers do not know which drivers you're running, so that information cannot be used in an assignment decision.

Server 171.67.108.44, which has the projects for FahCore_16, is set to an extremely low priority, so those WUs will only be assigned when no WUs for FahCore_17 are available. If you check the blog and other recent posts on the forum, you'll see that there has been a period when Core_17 assignments were unavailable. I think that policy makes sense, since most GPUs with most drivers will run Core_16 assignments, and at least for those folks, running them is probably preferable to having the server tell everyone that no assignments are available.
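
As an illustration only, the fallback policy described above can be sketched like this. It is not the actual assignment server logic; the function name, the priority list and the dictionary layout are assumptions made for the example. Note that driver version is not an input, which matches the point that the servers do not know it.

```python
# Hypothetical priority order: lower-priority cores are used only as a fallback.
CORE_PRIORITY = ["FahCore_17", "FahCore_16"]


def pick_core(available):
    """Return the highest-priority core that has work, or None if nothing is available.

    `available` maps a core name to the number of WUs currently ready for it.
    """
    for core in CORE_PRIORITY:
        if available.get(core, 0) > 0:
            return core
    return None


# Core_17 work exists, so Core_16 is never handed out.
print(pick_core({"FahCore_17": 12, "FahCore_16": 300}))
# Core_17 work has run dry, so the low-priority Core_16 projects are assigned.
print(pick_core({"FahCore_17": 0, "FahCore_16": 300}))
```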
prjindigo
Posts: 31
Joined: Wed Mar 30, 2011 7:49 am

Re: R9-290X not recognizing _x16 as task

Post by prjindigo »

Project 8900 (47,0,138) started exhibiting the same behavior at around 94% completion: clock pulsing around 300 MHz, no heat in the sink.
On restart, 8900 (47,0,138) reverted to 58.06%. Client had to be forced offline.

Some sort of memory hole?
Manual restarts of client every 25% should solve the issue for _x17.

Can confirm, restart every hour. 8900(167,7,31) crashed at 54%.
PantherX
Site Moderator
Posts: 7020
Joined: Wed Dec 23, 2009 9:33 am
Hardware configuration: V7.6.21 -> Multi-purpose 24/7
Windows 10 64-bit
CPU:2/3/4/6 -> Intel i7-6700K
GPU:1 -> Nvidia GTX 1080 Ti
Retired:
2x Nvidia GTX 1070
Nvidia GTX 675M
Nvidia GTX 660 Ti
Nvidia GTX 650 SC
Nvidia GTX 260 896 MB SOC
Nvidia 9600GT 1 GB OC
Nvidia 9500M GS
Nvidia 8800GTS 320 MB

Intel Core i7-860
Intel Core i7-3840QM
Intel i3-3240
Intel Core 2 Duo E8200
Intel Core 2 Duo E6550
Intel Core 2 Duo T8300
Intel Pentium E5500
Intel Pentium E5400
Location: Land Of The Long White Cloud
Contact:

Re: R9-290X not recognizing _x16 as task

Post by PantherX »

If it is a memory leak, it must be a very unusual one, since you are the first to report this kind of behavior.

For further troubleshooting, please post the log file which will contain your system configuration and F@H settings.
Joe_H
Site Admin
Posts: 7856
Joined: Tue Apr 21, 2009 4:41 pm
Hardware configuration: Mac Pro 2.8 quad 12 GB smp4
MacBook Pro 2.9 i7 8 GB smp2
Location: W. MA

Re: R9-290X not recognizing _x16 as task

Post by Joe_H »

Is your GPU overclocked, factory or otherwise? What you have described sounds more like a driver crash and reset. The same symptoms of the client reporting progress and then reverting to a previous checkpoint have been mentioned by a number of other folders. You can check the Windows logs for driver resets that correspond to the GPU stopping processing and going to the low clock speed.
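
One possible way to do that check from a script is sketched below. It shells out to the Windows `wevtutil` tool and asks the System log for Event ID 4101, which is the ID Windows normally uses for "display driver stopped responding and has successfully recovered". Whether a given driver crash is logged under that ID on your system is an assumption, and the function names are made up for the example.

```python
import subprocess
import sys


def build_query(event_id=4101, count=20):
    """Build a wevtutil command listing the newest matching System log events."""
    return [
        "wevtutil", "qe", "System",
        f"/q:*[System[(EventID={event_id})]]",  # XPath filter on the event ID
        f"/c:{count}",                          # at most `count` events
        "/rd:true",                             # newest first
        "/f:text",                              # human-readable output
    ]


def recent_driver_resets(count=20):
    """Run the query and return wevtutil's text output (Windows only)."""
    result = subprocess.run(
        build_query(count=count), capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    if sys.platform == "win32":
        print(recent_driver_resets())
    else:
        print("Windows only; the command would be:", " ".join(build_query()))
```

Comparing the timestamps in that output with the times in the FAH log where progress drops back to an earlier checkpoint should show whether the two line up.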

iMac 2.8 i7 12 GB smp8, Mac Pro 2.8 quad 12 GB smp6
MacBook Pro 2.9 i7 8 GB smp3
prjindigo
Posts: 31
Joined: Wed Mar 30, 2011 7:49 am

Re: R9-290X not recognizing _x16 as task

Post by prjindigo »

Probable resolution: force the PCIe x16 slot into 3.0 mode and lock it there.

For some reason the Sapphire 290X was bumping the slot in and out of 3.0; whenever it did that, it would mistime and pop the API in the nose. Roughly the same as the problem we had last year with YouTube videos and AMD cards. I've had it running flat out at 1100 MHz for more than 24 hours and not a single quirk has occurred, on the same driver set I started out using.

The setting to lock the PCIe generation is in the CMOS (BIOS) setup.