FAHClient V7.1.48 released (6th Open-Beta)

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby davidcoton » Tue Feb 21, 2012 5:30 pm

MtM wrote:Imho you're overreacting...

I hope not, and you may be right about the effort required to fix. Actually I'm not arguing the issues either way. I'm asking for enough information about the intention of the public release to answer JC's question about priority issues. Also I'm trying to establish logical links between issues for scoping releases (NOT saying that the fixes are related, but asking whether it is logical to fix one and leave another in the same area).

All the best with your investigations! Sounds like a big mess of different enumerations -- I hope you and Bruce can sort out which should be used where, and how to determine them reliably.

EDIT: From what has been written here, the Linux install issues may require a new build environment, and then backwards support (older Linux) may not be trivial. The OSX issues are not just installers, but fairly basic gtk version issues.

David
davidcoton
 
Posts: 167
Joined: Wed Nov 05, 2008 3:19 pm
Location: Cambridge, UK

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby MtM » Tue Feb 21, 2012 6:35 pm

Again only talking about gpgpu enumeration: it's a mess because no gpgpu platform has a readable field containing the pciID of the detected device. If it did, one could correlate that with the nvapi displayhandle or Amd/Ati Display Library pciID, and never just assume the devices are given based on a pciID sort. But, even if you don't have that field, as long as this sort is consistent, it should be enough.

That's why I say that if the sort is not working in some cases, it's a problem: if it's not working, there is no other approach. ( This also makes the pciID-sorted order something of an 'industry standard', so I don't expect this sort to fail unless you're trying a driver which has not gone through WHQL checks and which might use another sort method. )

As I said, I haven't seen anyone show proof of this sort being wrong. I'll add, though, that not long ago I was wondering whether this sort is used at all times, for the same reason I expect others suspect this order is to blame for some perceived difficulty: other applications don't use the same sort method as the gpgpu platforms; they primarily use the order in which WDM lists the devices.

So, for those issues, there actually isn't an issue other than a wrongly made assumption as to how things 'should' work.
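
To make the idea of a pciID-based order concrete, here is a minimal sketch. It assumes each GPU can be described by a PCI bus/device/function triple; the `Gpu` type, the device names, and the `enumerate_by_pci` helper are hypothetical and are not taken from FAHClient or from any gpgpu platform API. The point is only that the resulting index depends on the PCI address and not on the order in which some other API happened to list the cards.

```python
from typing import NamedTuple


class Gpu(NamedTuple):
    """Hypothetical description of a detected GPU."""
    name: str
    bus: int
    device: int
    function: int


def enumerate_by_pci(gpus):
    """Return (index, gpu) pairs ordered by PCI bus, device, function."""
    ordered = sorted(gpus, key=lambda g: (g.bus, g.device, g.function))
    return list(enumerate(ordered))


# Made-up devices, in the order another API (for example a display list)
# might report them.
reported = [Gpu("card B", 3, 0, 0), Gpu("card A", 1, 0, 0)]

for index, gpu in enumerate_by_pci(reported):
    print(f"-gpu {index}: {gpu.name} at "
          f"{gpu.bus:02x}:{gpu.device:02x}.{gpu.function}")
```

As long as every component sorts on the same key, the index is stable regardless of the input order, which is the "consistent sort" condition described above.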
MtM
 
Posts: 3233
Joined: Fri Jun 27, 2008 2:20 pm
Location: The Netherlands

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby pgwalsh » Tue Feb 21, 2012 7:00 pm

I noticed simulation-info on os x v7.1.48 no longer provides run_time or simulation_time. I didn't see this in the release notes, was this on purpose?
pgwalsh
 
Posts: 67
Joined: Tue Nov 18, 2008 6:02 pm
Location: Colorado Springs, CO

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby smoking2000 » Tue Feb 21, 2012 8:09 pm

pgwalsh wrote:I noticed simulation-info on os x v7.1.48 no longer provides run_time or simulation_time. I didn't see this in the release notes, was this on purpose?

I also ran into that with my Perl interface, I think the change in question is:
jcoffland wrote:FAHViewer:
v7.1.45:
  • Use more fine grained WU progress estimated by the client. #808

AFAIK the simulation-info command was added to the FAHClient remote interface specifically for FAHViewer, and I assume that Joe doesn't count on others using it too; they are more likely to use queue-info instead.
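
For third-party tools that read simulation-info, one way to survive fields coming and going between betas is to treat every field as optional. The sketch below is in Python, not Perl, and rests on assumptions: that a reply arrives as a "PyON" header line, a Python-literal body, and a closing `---` line, and that the sample field names look like this. None of that is confirmed in this thread, and `parse_pyon` is a hypothetical helper, not part of FAHClient.

```python
import ast


def parse_pyon(message):
    """Evaluate the body of one PyON-style message (assumed framing)."""
    lines = message.strip().splitlines()
    if not lines or not lines[0].startswith("PyON "):
        raise ValueError("not a PyON message")
    body = []
    for line in lines[1:]:
        if line.strip() == "---":  # assumed end-of-message marker
            break
        body.append(line)
    # literal_eval only accepts literals, so nothing in the reply is executed.
    return ast.literal_eval("\n".join(body))


# Made-up reply in which run_time and simulation_time are absent.
sample = 'PyON 1 simulation-info\n{"progress": 0.42, "eta": None}\n---\n'

info = parse_pyon(sample)
# Use .get() so a missing field yields a fallback and not a KeyError.
print(info.get("run_time", "not reported"))
```

With this approach a removed field degrades to "not reported" in the front end, so old code does not have to be restored each time the output changes.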
smoking2000
 
Posts: 570
Joined: Mon Dec 03, 2007 6:20 am
Location: Amsterdam

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby bruce » Tue Feb 21, 2012 8:25 pm

davidcoton wrote:All the best with your investigations! Sounds like a big mess of different enumerations -- I hope you and Bruce can sort out which should be used where, and how to determine them reliably.


Let's put it this way. Windows enumerates devices in a certain way, but that's not a universal answer since it clearly won't work for Linux/OSX. Therefore anybody who wants V7 to conform to Windows conventions is going to have to get used to the idea that V7 uses what I'll call (for the purpose of this discussion) a universal method that will work in both Linux AND Windows but which may be called "wrong" by many of you. In fact, it's only wrong if it's not consistent with itself.

I have no documented cases where V7.1.48 is not self-consistent but I'll keep looking. I will not accept a problem report where V7 doesn't match what's reported by some other methodology or where someone has tried to force V7 to conform to their idea of what's "right".
bruce
Site Admin
 
Posts: 14971
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby MtM » Tue Feb 21, 2012 8:35 pm

bruce wrote:
davidcoton wrote:All the best with your investigations! Sounds like a big mess of different enumerations -- I hope you and Bruce can sort out which should be used where, and how to determine them reliably.


Let's put it this way. Windows enumerates devices in a certain way, but that's not a universal answer since it clearly won't work for Linux/OSX. Therefore anybody who wants V7 to conform to Windows conventions is going to have to get used to the idea that V7 uses what I'll call (for the purpose of this discussion) a universal method that will work in both Linux AND Windows but which may be called "wrong" by many of you. In fact, it's only wrong if it's not consistent with itself.

I have no documented cases where V7.1.48 is not self-consistent but I'll keep looking. I will not accept a problem report where V7 doesn't match what's reported by some other methodology or where someone has tried to force V7 to conform to their idea of what's "right".


Actually this is a bad statement because it doesn't take into account the order in which to increase the -gpu x value, which should match the gpgpu enumeration. So if FAHClient were found to be inconsistent with the actual gpgpu enumeration, but not with itself, it would be a bad thing. But as I said, I don't think that will happen, as the pciID sort seems to be an industry standard ( though it's not documented anywhere to make it official so to say :( ).

The fact that FAHClient is consistent with this standard doesn't grant it the right to proclaim it is the standard, even if f@h has been at the forefront of gpu assisted computing for a long time :)
MtM
 
Posts: 3233
Joined: Fri Jun 27, 2008 2:20 pm
Location: The Netherlands

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby pgwalsh » Tue Feb 21, 2012 8:42 pm

smoking2000 wrote:
pgwalsh wrote:I noticed simulation-info on os x v7.1.48 no longer provides run_time or simulation_time. I didn't see this in the release notes, was this on purpose?

I also ran into that with my Perl interface, I think the change in question is:
jcoffland wrote:FAHViewer:
v7.1.45:
  • Use more fine grained WU progress estimated by the client. #808

AFAIK the simulation-info command was added to the FAHClient remote interface specifically for FAHViewer, and I assume that Joe doesn't count on others using it too; they are more likely to use queue-info instead.

Thanks for pointing to that bullet point, I overlooked it. It's obviously not an issue, just have to put old code back in, but I don't think it was necessary to remove it either.

I haven't encountered the following issue on this latest client, but it's only been running for a couple of hours.
On 7.1.44 I would periodically get:
Connected to localhost.
Escape character is '^]'.

After receiving this, I could no longer telnet to the client.

If the client was started from a reboot, which launched as a background service, there was no way to stop it unless I killed the processes. I also found that if I needed to restart my machine, FAHControl would block it on first attempt until the client quit, which would require another reboot from the menu. This does happen with reboot from the command line, but that requires superuser anyway. That's not desirable behavior in my opinion.
pgwalsh
 
Posts: 67
Joined: Tue Nov 18, 2008 6:02 pm
Location: Colorado Springs, CO

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby Jesse_V » Tue Feb 21, 2012 8:47 pm

Boy I really hope they get that GPU order problem figured out. Good luck to you guys.

I was thinking, should v7 mention the possibility of GPU lag in its install screen? I may have mentioned this before, but based on people's posts in this forum, it seems that GPUs have a much higher probability of causing lag than SMP, and I get the impression that lower-powered GPUs and certain drivers experience the issue more often. I would guess that the likelihood of this problem will decrease over time as drivers, cores, OS scheduling, and various other things improve. I mean, I'm completely for GPU folding; I think it's fantastic that GPUs can be utilized and produce the science that they do. The one problem I see is the possibility of lag and other errors, which would be a real problem for anyone who didn't know how to handle it, and I can see it driving them away from F@h. Based on what's currently on the installation WinGuide page, SMP and GPU folding are presented on pretty much equal ground, and I didn't see any mention of possible GPU issues such as lag. We certainly want to minimize things like this viewtopic.php?f=59&t=19880 occurring, but with v7 installing a GPU slot if it can, that might be a more frequent occurrence. So I think we should at least mention it somewhere. Maybe something like this:
Uniprocessor - Uses one CPU
SMP - Uses two or more CPUs to complete the overall simulations faster
GPU - Uses the Graphics Processing Unit that powers 3D applications such as games and can be extremely fast and productive, but there is a small possibility of system lag

Something like that. Maybe a wordsmith can come along and say it better, but what do you guys think about this?
Jesse_V
 
Posts: 2169
Joined: Mon Jul 18, 2011 4:44 am
Location: Big Lake, Alaska

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby MtM » Tue Feb 21, 2012 8:50 pm

FAHClient stopped reboots here as well ( Win7 X64 Ultimate ).

I have not changed any Windows settings to lower the amount of time allowed for processes to finish during a shutdown. I would agree that FAHClient should be able to exit within the normal allowed timeout period ( this is 15 seconds iirc?? ). When I select pause, it normally goes on pause ( which means all FAHCores have been closed ) with a fair margin within these 15 seconds, but at other times it does not, probably because the core was already busy writing a checkpoint. Maybe the call from FAHClient/FAHCoreWrapper to exit and write a checkpoint if supported is blocking/suspending the function which is writing the checkpoint?

Edit:

@ Jesse

I was actually expecting that V7 would become the recommended download only after the gpu cores have been updated to support a 'limit usage to' option. Even if it does not, the other gpu clients do not carry such a warning either, but then again they are listed in the 'advanced high performance' section of the downloads, so they already carry that disclaimer, which I would say includes the lag one can experience. I do think that a warning is in order in V7 if it goes live before the FAHCores support a limit. I think the warning should be shown when adding a gpu slot, or when selecting gpu from the install menu. The choice of words should be informative without being alarming.

Come to think of it, a general warning about high utilization and related temperatures might be in order as well, considering the number of people who might install it without having made that connection beforehand? I'm not sure if these kinds of warnings should be given during install or on the download page though.
MtM
 
Posts: 3233
Joined: Fri Jun 27, 2008 2:20 pm
Location: The Netherlands

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby Jesse_V » Tue Feb 21, 2012 11:15 pm

MtM wrote:I was actually expecting that V7 would become the recommended download only after the gpu cores have been updated to support a 'limit usage to' option. Even if it does not, the other gpu clients do not carry such a warning either, but then again they are listed in the 'advanced high performance' section of the downloads, so they already carry that disclaimer, which I would say includes the lag one can experience. I do think that a warning is in order in V7 if it goes live before the FAHCores support a limit. I think the warning should be shown when adding a gpu slot, or when selecting gpu from the install menu. The choice of words should be informative without being alarming.

Come to think of it, a general warning about high utilization and related temperatures might be in order as well, considering the number of people who might install it without having made that connection beforehand? I'm not sure if these kinds of warnings should be given during install or on the download page though.

That's a good point. The SMP/GPU clients are a bit hidden, and the bolded text about beta software is pretty obvious. I guess the clients were stable enough to be transferred to v7 as slots, but I agree that the bolded beta text did cover any and all issues including lag. Apparently the GPU usage limit slider requires cores to be updated, and who knows how long that will take. I'd guess at this point that v7 would go to folding.stanford.edu before that happened. And yes, any warning about lag and any other issues should be very carefully worded. Are high temperatures really a serious issue on desktops? Laptop users are probably much more aware of that kind of thing (my left hand is currently being heated by my CPU/GPU). Seems to me that high utilization and higher temps are kind of to be expected when participating in distributed computing, but lag is a different story. It's not only annoying and disruptive, but I don't think it's really expected. It's been my experience that SMP does a pretty good job of backing down for other applications, but I can't say the same for GPUs. My vote is for a concise message displayed in v7 itself; that way both those who read the instructions and those who don't will see it.
Jesse_V
 
Posts: 2169
Joined: Mon Jul 18, 2011 4:44 am
Location: Big Lake, Alaska

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby 7im » Tue Feb 21, 2012 11:31 pm

MtM wrote:...

Come to think of it, a general warning about high utilization and related temperatures might be in order as well, considering the number of people who might install it without having made that connection beforehand? I'm not sure if these kinds of warnings should be given during install or on the download page though.


I agree, not that they would work any better than previous warnings or instructions. The question about why the fah client is only using one of many cpu cores is still ever present. :?

At least V7 will change the question. ;)
Please do not mistake my brevity as dispassion or condescension. I recognize the time you spend reading the forum is time you could use elsewhere, so my short responses save you time. Please do not hesitate to ask for clarification if I was too terse.
7im
 
Posts: 11289
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby MtM » Tue Feb 21, 2012 11:45 pm

@Jesse

Well, let's say that with my pets and hair/dust, I need to clean my fans every couple of weeks. If I do not, and also do not fold, there is no problem. If I forget and keep folding, my GPUs end up going into throttle-down mode because they reach over 105C ( and yes this has happened, ashamed to admit it but it did ). So I would say it is a concern, because there will be a group of people ( like other people I fix problems for, who all seem to think cleaning heatsinks is not something which falls under maintenance ) who might get concerned, and in some extreme cases with good reason. A disclaimer is not meant to cover all possible cases; it's mostly meant to cover corner cases like this. It should not be worded around these corner cases, though, as that is too alarming and will scare people without good reason; it should simply note that folding causes higher temperatures. How exactly to word it, I'm not sure.

@7im, I know... But the notices should still be there. The number of people asking about things already mentioned may be impressive, but I still believe most people do read them. If you take the number of active CPUs into account, the number of questions suddenly doesn't seem that high.

Also, it's in another category. SMP not using all cores if you don't supply the flag is not the same as experiencing a negative side effect and being able to say 'I was not warned in any way this could happen'. That's a big difference.

Changing questions indicate progress, and progress is good :)
MtM
 
Posts: 3233
Joined: Fri Jun 27, 2008 2:20 pm
Location: The Netherlands

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby Jesse_V » Wed Feb 22, 2012 12:04 am

Here's a possible bug: when I pause all slots all the progress bars go to 0%. Shouldn't they remember their value? I don't recall this happening in previous versions. I'd speculate that it's because the progress bar is now tied to the core more than it was before, so when the core shuts down the values go to zero, but that's a guess.
Jesse_V
 
Posts: 2169
Joined: Mon Jul 18, 2011 4:44 am
Location: Big Lake, Alaska

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby 7im » Wed Feb 22, 2012 12:18 am

Jesse_V wrote:Here's a possible bug: when I pause all slots all the progress bars go to 0%. Shouldn't they remember their value? I don't recall this happening in previous versions. I'd speculate that it's because the progress bar is now tied to the core more than it was before, so when the core shuts down the values go to zero, but that's a guess.


Bug or behavior change? And you didn't mention whether the bar resumes when you resume folding? Did it? Was it the right value?

They would need all of that info for a bug report, if it's a bug.
7im
 
Posts: 11289
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: FAHClient V7.1.48 released (6th Open-Beta)

Postby MtM » Wed Feb 22, 2012 12:34 am

pause -> 0 percent -> start -> 99.9 percent ( 2 seconds ) -> last progress / actual progress

I don't like it, progress bars should retain value when paused.
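
Just to illustrate what "retain value when paused" could mean, here is a minimal sketch. It is not how FAHControl is implemented; the `SlotProgress` class is hypothetical, and it assumes the front end receives `None` for progress while the core is not running.

```python
class SlotProgress:
    """Hypothetical display model that remembers the last known progress."""

    def __init__(self):
        self.last_known = 0.0

    def update(self, core_progress):
        """Return the value to display.

        core_progress is a fraction between 0 and 1, or None while the
        core is not running (assumed to be the case when paused).
        """
        if core_progress is not None:
            self.last_known = core_progress
        return self.last_known


bar = SlotProgress()
bar.update(0.37)          # core reports 37% while folding
print(bar.update(None))   # paused: still shows the last known value
```

The display only changes when the core reports a real value, so a pause no longer resets the bar.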
MtM
 
Posts: 3233
Joined: Fri Jun 27, 2008 2:20 pm
Location: The Netherlands
