Flush/delete a specific work queue?

Postby BuddhaChu » Thu Apr 07, 2011 4:09 pm

I'm having an issue with the client on my Mac Mini. I have two work queues listed, and one is folding away, working fine. The other one keeps trying to send its results, flooding the log screen with all its info and making the log for the "good" WU hard to read.

I know there was a way to flush a work queue from the command line with the v6 and prior clients. From the logs, it looks like the WU has issues: the work server reports "Failed to read stream" and the collection server refuses the connection, so the WU is stuck at the "send results" stage. Is there a way I can get rid of the work queue that's giving me grief and get it out of the client and out of my life?

I'm looking to get rid of work queue 01.

Log mess:

Code:
15:00:22:Unit 00:
15:00:22:Unit 00:*------------------------------*
15:00:22:Unit 00:Folding@Home Gromacs SMP Core
15:00:22:Unit 00:Version 2.22 (May 7 2010)
15:00:22:Unit 00:
15:00:22:Unit 00:Preparing to commence simulation
15:00:22:Unit 00:- Ensuring status. Please wait.
15:00:22:Sending unit results: id:01 state:SEND project:6058 run:0 clone:187 gen:245 core:0xa3 unit:0x7fffca114d96de8800f500bb000017aa
15:00:22:Unit 01: Uploading 28.44KiB
15:00:22:Connecting to 171.64.65.54:8080
15:00:22:WARNING: Exception: Failed to send results to work server: Failed to read stream
15:00:22:Trying to send results to collection server
15:00:22:Unit 01: Uploading 28.44KiB
15:00:22:Connecting to 171.67.108.25:8080
15:00:22:WARNING: WorkServer connection failed on port 8080 trying 80
15:00:22:Connecting to 171.67.108.25:80
15:00:22:ERROR: Exception: Failed to connect to 171.67.108.25:80: Connection refused
15:00:27:Server connection id=1 on 0.0.0.0:36330 from 127.0.0.1:57275
15:00:31:Unit 00:- Looking at optimizations...
15:00:31:Unit 00:- Working with standard loops on this execution.
15:00:31:Unit 00:Examination of work files indicates 8 consecutive improper terminations of core.
15:00:32:Unit 00:- Expanded 1762587 -> 2249733 (decompressed 127.6 percent)
15:00:32:Unit 00:Called DecompressByteArray: compressed_data_size=1762587 data_size=2249733, decompressed_data_size=2249733 diff=0
15:00:32:Unit 00:- Digital signature verified
15:00:32:Unit 00:
15:00:32:Unit 00:Project: 6063 (Run 0, Clone 114, Gen 295)
15:00:32:Unit 00:
15:00:32:Unit 00:Entering M.D.
15:00:38:Unit 00:Using Gromacs checkpoints
15:00:39:Unit 00:Resuming from checkpoint
15:00:39:Unit 00:Verified 00/wudata_01.log
15:00:39:Unit 00:Verified 00/wudata_01.trr
15:00:39:Unit 00:Verified 00/wudata_01.edr
15:00:39:Unit 00:Completed 236084 out of 500000 steps  (47%)
15:01:22:Sending unit results: id:01 state:SEND project:6058 run:0 clone:187 gen:245 core:0xa3 unit:0x7fffca114d96de8800f500bb000017aa
15:01:22:Unit 01: Uploading 28.44KiB
15:01:22:Connecting to 171.64.65.54:8080
15:01:22:WARNING: Exception: Failed to send results to work server: Failed to read stream
15:01:22:Trying to send results to collection server
15:01:22:Unit 01: Uploading 28.44KiB
15:01:22:Connecting to 171.67.108.25:8080
15:01:22:WARNING: WorkServer connection failed on port 8080 trying 80
15:01:22:Connecting to 171.67.108.25:80
15:01:22:ERROR: Exception: Failed to connect to 171.67.108.25:80: Connection refused
15:02:59:Sending unit results: id:01 state:SEND project:6058 run:0 clone:187 gen:245 core:0xa3 unit:0x7fffca114d96de8800f500bb000017aa
15:02:59:Unit 01: Uploading 28.44KiB
15:02:59:Connecting to 171.64.65.54:8080
15:03:00:WARNING: Exception: Failed to send results to work server: Failed to read stream
15:03:00:Trying to send results to collection server
15:03:00:Unit 01: Uploading 28.44KiB
15:03:00:Connecting to 171.67.108.25:8080
15:03:00:WARNING: WorkServer connection failed on port 8080 trying 80
15:03:00:Connecting to 171.67.108.25:80
15:03:00:ERROR: Exception: Failed to connect to 171.67.108.25:80: Connection refused
15:05:37:Sending unit results: id:01 state:SEND project:6058 run:0 clone:187 gen:245 core:0xa3 unit:0x7fffca114d96de8800f500bb000017aa
15:05:37:Unit 01: Uploading 28.44KiB
15:05:37:Connecting to 171.64.65.54:8080
15:05:37:WARNING: Exception: Failed to send results to work server: Failed to read stream
15:05:37:Trying to send results to collection server
15:05:37:Unit 01: Uploading 28.44KiB
15:05:37:Connecting to 171.67.108.25:8080
15:05:37:WARNING: WorkServer connection failed on port 8080 trying 80
15:05:37:Connecting to 171.67.108.25:80
15:05:37:ERROR: Exception: Failed to connect to 171.67.108.25:80: Connection refused
15:09:51:Sending unit results: id:01 state:SEND project:6058 run:0 clone:187 gen:245 core:0xa3 unit:0x7fffca114d96de8800f500bb000017aa
15:09:51:Unit 01: Uploading 28.44KiB
15:09:51:Connecting to 171.64.65.54:8080
15:09:51:WARNING: Exception: Failed to send results to work server: Failed to read stream
15:09:51:Trying to send results to collection server
15:09:51:Unit 01: Uploading 28.44KiB
15:09:51:Connecting to 171.67.108.25:8080
15:09:51:WARNING: WorkServer connection failed on port 8080 trying 80
15:09:51:Connecting to 171.67.108.25:80
15:09:51:ERROR: Exception: Failed to connect to 171.67.108.25:80: Connection refused
15:15:12:Unit 00:Completed 240000 out of 500000 steps  (48%)
15:16:42:Sending unit results: id:01 state:SEND project:6058 run:0 clone:187 gen:245 core:0xa3 unit:0x7fffca114d96de8800f500bb000017aa
15:16:42:Unit 01: Uploading 28.44KiB
15:16:42:Connecting to 171.64.65.54:8080
15:16:42:WARNING: Exception: Failed to send results to work server: Failed to read stream
15:16:42:Trying to send results to collection server
15:16:42:Unit 01: Uploading 28.44KiB
15:16:42:Connecting to 171.67.108.25:8080
15:16:42:WARNING: WorkServer connection failed on port 8080 trying 80
15:16:42:Connecting to 171.67.108.25:80
15:16:43:ERROR: Exception: Failed to connect to 171.67.108.25:80: Connection refused
15:27:48:Sending unit results: id:01 state:SEND project:6058 run:0 clone:187 gen:245 core:0xa3 unit:0x7fffca114d96de8800f500bb000017aa
15:27:48:Unit 01: Uploading 28.44KiB
15:27:48:Connecting to 171.64.65.54:8080
15:27:48:WARNING: Exception: Failed to send results to work server: Failed to read stream
15:27:48:Trying to send results to collection server
15:27:48:Unit 01: Uploading 28.44KiB
15:27:48:Connecting to 171.67.108.25:8080
15:27:48:WARNING: WorkServer connection failed on port 8080 trying 80
15:27:48:Connecting to 171.67.108.25:80
15:27:48:ERROR: Exception: Failed to connect to 171.67.108.25:80: Connection refused
15:33:53:Unit 00:Completed 245000 out of 500000 steps  (49%)
15:45:45:Sending unit results: id:01 state:SEND project:6058 run:0 clone:187 gen:245 core:0xa3 unit:0x7fffca114d96de8800f500bb000017aa
15:45:45:Unit 01: Uploading 28.44KiB
15:45:45:Connecting to 171.64.65.54:8080
15:45:45:WARNING: Exception: Failed to send results to work server: Failed to read stream
15:45:45:Trying to send results to collection server
15:45:45:Unit 01: Uploading 28.44KiB
15:45:45:Connecting to 171.67.108.25:8080
15:45:45:WARNING: WorkServer connection failed on port 8080 trying 80
15:45:45:Connecting to 171.67.108.25:80
15:45:45:ERROR: Exception: Failed to connect to 171.67.108.25:80: Connection refused
15:52:43:Unit 00:Completed 250000 out of 500000 steps  (50%)
Last edited by BuddhaChu on Thu Apr 07, 2011 5:12 pm, edited 1 time in total.
BuddhaChu
 
Posts: 152
Joined: Wed Apr 16, 2008 2:38 am

Re: Flush/delete a specific work queue?

Postby 7im » Thu Apr 07, 2011 4:17 pm

Please do not mistake my brevity as dispassion or condescension. I recognize the time you spend reading the forum is time you could use elsewhere, so my short responses save you time. Please do not hesitate to ask for clarification if I was too terse.
7im
 
Posts: 13326
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: Flush/delete a specific work queue?

Postby BuddhaChu » Thu Apr 07, 2011 4:36 pm

Thank you sir! I see the entry on the main FAHClient wiki page now for the flag/config differences.

EDIT: Command didn't work. Tried the following two commands and neither got rid of the WU.

Code:
./FAHClient --dump 1
./FAHClient --dump 01


Something did happen: the output reports that "slot 00" is ready. But I ran the command against work queue 01, and that queue still exists according to the GUI.
Code:
buddhas-mac-mini:MacOS buddha$ ./FAHClient --dump 1
16:39:30:************************* Folding@home Client *************************
16:39:30:    Website: http://folding.stanford.edu/
16:39:30:  Copyright: (c) 2009,2010 Stanford University
16:39:30:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:39:30:       Args: --dump 1
16:39:30:     Config: <none>
16:39:30:******************************** Build ********************************
16:39:30:    Version: 7.1.21
16:39:30:       Date: Mar 23 2011
16:39:30:       Time: 16:13:32
16:39:30:    SVN Rev: 2883
16:39:30:     Branch: fah/trunk/client
16:39:30:   Compiler: GNU 4.2.1 (Apple Inc. build 5664)
16:39:30:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
16:39:30:             -fno-unsafe-math-optimizations -msse3
16:39:30:   Platform: darwin 10.5.0
16:39:30:       Bits: 64
16:39:30:       Mode: Release
16:39:30:******************************* System ********************************
16:39:30:         OS: Mac OS X 10.6.7 (Build 10J869)
16:39:30:        CPU: Intel(R) Core(TM)2 Duo CPU P7350 @ 2.00GHz
16:39:30:     CPU ID: GenuineIntel Family 6 Model 23 Stepping 10
16:39:30:       CPUs: 2
16:39:30:     Memory: 4.00GiB
16:39:30:Free Memory: 2.70GiB
16:39:30:    Threads: POSIX_THREADS
16:39:30:       GPUs: 0
16:39:30:       CUDA: Not detected
16:39:30: On Battery: false
16:39:30: UTC offset: -5
16:39:30:        PID: 10906
16:39:30:        CWD: /Applications/Folding@home Client.app/Contents/MacOS
16:39:30:***********************************************************************
16:39:30:Enabled folding slot 00: READY smp:2
16:39:30:Unit processing completed
16:39:31:Clean exit



buddhas-mac-mini:MacOS buddha$ ./FAHClient --dump 01
16:42:37:************************* Folding@home Client *************************
16:42:37:    Website: http://folding.stanford.edu/
16:42:37:  Copyright: (c) 2009,2010 Stanford University
16:42:37:     Author: Joseph Coffland <joseph@cauldrondevelopment.com>
16:42:37:       Args: --dump 01
16:42:37:     Config: <none>
16:42:37:******************************** Build ********************************
16:42:37:    Version: 7.1.21
16:42:37:       Date: Mar 23 2011
16:42:37:       Time: 16:13:32
16:42:37:    SVN Rev: 2883
16:42:37:     Branch: fah/trunk/client
16:42:37:   Compiler: GNU 4.2.1 (Apple Inc. build 5664)
16:42:37:    Options: -std=gnu++98 -O3 -funroll-loops -mfpmath=sse -ffast-math
16:42:37:             -fno-unsafe-math-optimizations -msse3
16:42:37:   Platform: darwin 10.5.0
16:42:37:       Bits: 64
16:42:37:       Mode: Release
16:42:37:******************************* System ********************************
16:42:37:         OS: Mac OS X 10.6.7 (Build 10J869)
16:42:37:        CPU: Intel(R) Core(TM)2 Duo CPU P7350 @ 2.00GHz
16:42:37:     CPU ID: GenuineIntel Family 6 Model 23 Stepping 10
16:42:37:       CPUs: 2
16:42:37:     Memory: 4.00GiB
16:42:37:Free Memory: 2.69GiB
16:42:37:    Threads: POSIX_THREADS
16:42:37:       GPUs: 0
16:42:37:       CUDA: Not detected
16:42:37: On Battery: false
16:42:37: UTC offset: -5
16:42:37:        PID: 10923
16:42:37:        CWD: /Applications/Folding@home Client.app/Contents/MacOS
16:42:37:***********************************************************************
16:42:37:Enabled folding slot 00: READY smp:2
16:42:37:Unit processing completed
16:42:37:Clean exit



As far as I can tell, I'm running the command with the correct syntax. This is the relevant output from 'FAHClient --help':

Code:
  --dump                  <string>                Dump either 'all' or a specific work unit,
                                                  identified by its queue ID, then exit.
BuddhaChu
 
Posts: 152
Joined: Wed Apr 16, 2008 2:38 am

Re: Flush/delete a specific work queue?

Postby BuddhaChu » Thu Apr 07, 2011 5:19 pm

What does this cryptic note in the wiki for the --dump switch mean?

"the unit will not actually be removed until the WS can be notified."

According to my logs, I can't talk to the work/collection server, so is this work queue "stuck" indefinitely?
BuddhaChu
 
Posts: 152
Joined: Wed Apr 16, 2008 2:38 am

Re: Flush/delete a specific work queue?

Postby 7im » Thu Apr 07, 2011 5:24 pm

No, it should go away when the WU expires, or talks to the Work Server.

Please post the section of the log showing the communication problem with the work server.
7im
 
Posts: 13326
Joined: Thu Nov 29, 2007 4:30 pm
Location: Arizona

Re: Flush/delete a specific work queue?

Postby GreyWhiskers » Thu Apr 07, 2011 5:31 pm

I had a similar issue on my Windows XP installation. I'm not sure how these topics will carry over to your Mac Mini, though.

Bruce gave me the advice that allowed me to successfully dump the two "dead" queue items and proceed. (you might want to look at the thread http://foldingforum.org/viewtopic.php?f=19&t=18220&p=182633#p182633)

Re: Project: 6511 (Run 0, Clone 312, Gen 20) EUE at 2%.
by bruce » Wed Apr 06, 2011 12:51 pm

GreyWhiskers wrote:
how do I get rid of the two WUs that the client is repetitively trying to upload?



You have two WUs you need to get rid of: WorkQueue 01 and 00.

Assuming Windows, go to the Start+All_Programs and find the shortcut for FAHClient. Copy it to the desktop. Edit the Target by adding a parameter string to the extreme right end of whatever is already there. ...FAHClient.exe" --dump 1. Stop the production copy of FAHClient and run that shortcut. Replace 1 with 0 and run it again.



Re: Project: 6511 (Run 0, Clone 312, Gen 20) EUE at 2%.
by GreyWhiskers » Wed Apr 06, 2011 3:41 pm

@Bruce. Perfect. The modified shortcut cleaned up the offending Work Queue items, and when I restarted FAHControl, it went right back to processing the WU it was working on before. Of course, I had to re-add the connections to the other computers I have running V7, but it's all OK now.


Edited opening sentence to add ack of Mac vs Windows.
GreyWhiskers
 
Posts: 780
Joined: Mon Oct 25, 2010 5:57 am
Location: Saratoga, California USA

Re: Flush/delete a specific work queue?

Postby codysluder » Thu Apr 07, 2011 6:14 pm

7im wrote:No, it should go away when the WU expires, or talks to the Work Server.

Please post the section of the log showing the communication problem with the work server.

I see three issues here.

The OP is asking how to read the log when it's obliterated by all the error messages. That's probably worthy of a ticket, if there isn't already one open.

Second, V7 needs better communication skills. Its error messages seem more understandable, but there are currently a lot more topics from people having trouble uploading V7 results than from people having trouble uploading V6 results, so V7 must be pickier. I do believe that there are still a lot more people uploading V6 results than V7 results, so on a per-WU basis, V7 is failing more.

Third, we'll have to get a Mac expert to translate bruce's comment from MSWindows to MacOS (and I only speak "Windows").
codysluder
 
Posts: 2222
Joined: Sun Dec 02, 2007 12:43 pm

Re: Flush/delete a specific work queue?

Postby codysluder » Thu Apr 07, 2011 6:25 pm

A couple of things you might try:

Select the WU that's in SEND and, in the context menu (right-click on Windows; no idea how to do it in OSX), select Pause. This may not work, though, because I think you can only select slots, not WUs. Maybe the big Pause button at the top will suspend both, and then you can select just the slot itself and pick Fold in the context menu.

Another thing to try that might get rid of the WU: Select the Running WU and select Finish. Once it's done and it (hopefully) uploads, go to Configure Slots and delete the current slot and then add a new one exactly like it. That might discard WUs that are trying to upload.

The right way, though, is to figure out how to use --dump N
codysluder
 
Posts: 2222
Joined: Sun Dec 02, 2007 12:43 pm

Re: Flush/delete a specific work queue?

Postby BuddhaChu » Thu Apr 07, 2011 6:33 pm

7im: comm logs are in the first post (I had a screen cap there at first, then swapped it out for the actual text)

Grey & Cody: I ran those commands already. Output is in the third post in this thread. I stopped all F@H processes, then ran the command with the dump switch in the terminal on the command line.

Cody: I agree about the logs getting mixed together in the window. I have some client feedback I wrote down after installing the client last week, but haven't posted it yet. My feedback was going to be about two slots (ex: a GPU slot and a CPU slot) using the same window and things getting hard to follow/troubleshoot. Looks like two work queue logs sharing the same log window are hard to follow as well.
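
Until the log window can separate work queues, one possible workaround is to filter a saved copy of the log by its "Unit NN:" prefix. This is only a sketch: filter_unit is a made-up helper name, and the log file path is an assumption that depends on where your client writes its log. It relies only on the line format visible in the log above (HH:MM:SS:Unit NN:...).

```shell
#!/bin/sh
# Sketch only: filter_unit is a hypothetical helper, not part of FAHClient.
# Prints the log lines that belong to a single work unit, based on the
# "HH:MM:SS:Unit NN:" prefix seen in the log excerpt in this thread.
filter_unit() {
    unit="$1"      # two-digit queue ID as it appears in the log, e.g. 00
    logfile="$2"   # path to a saved copy of the client log (assumed location)
    grep -E "^[0-9]{2}:[0-9]{2}:[0-9]{2}:Unit ${unit}:" "$logfile"
}
```

For example, `filter_unit 00 log.txt` would show only the lines from the unit that is folding. Lines without a "Unit NN:" prefix, such as the "Connecting to ..." and "ERROR: Exception ..." lines from the failed uploads, are dropped by this filter, which is the point when you only want to follow the "good" WU.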
BuddhaChu
 
Posts: 152
Joined: Wed Apr 16, 2008 2:38 am

Re: Flush/delete a specific work queue?

Postby bruce » Thu Apr 07, 2011 6:55 pm

I think maybe you guys are talking about Ticket #157
bruce
Site Admin
 
Posts: 16869
Joined: Thu Nov 29, 2007 10:13 pm
Location: So. Cal.

Re: Flush/delete a specific work queue?

Postby BuddhaChu » Thu Apr 07, 2011 7:28 pm

With regard to the muddy logs by slot, yes. I also made the point that with multiple work queues, one "chatty" queue makes it hard to use the log output for troubleshooting. That's not in bug #157. Maybe someone should add that as a comment?

https://fah-web.stanford.edu/projects/F ... ticket/157
BuddhaChu
 
Posts: 152
Joined: Wed Apr 16, 2008 2:38 am

Re: Flush/delete a specific work queue?

Postby BuddhaChu » Fri Apr 08, 2011 7:00 am

I sorted out my original problem the old-fashioned way. With the F@H client shut down, I dragged the /Users/buddha/Library/Application Support/FAHClient/work/01 directory to the trash.

WU gone from client & problem solved.
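
For anyone who prefers the terminal to dragging folders, the same manual step could be sketched roughly as below. This is not an official procedure: set_aside_queue is a made-up helper name, the holding directory is my own addition so the folder is moved rather than deleted, and the script assumes the client is already shut down (it does not check). Keep in mind this throws away the finished results for that WU.

```shell
#!/bin/sh
# Sketch only: set_aside_queue is a hypothetical helper, not part of FAHClient.
# Moves <data_dir>/work/<queue_id> into a holding directory instead of
# deleting it. Assumes the F@H client has already been stopped.
set_aside_queue() {
    data_dir="$1"   # e.g. "$HOME/Library/Application Support/FAHClient"
    queue_id="$2"   # work queue directory name, e.g. 01
    holding="$3"    # existing directory to move the queue into (my assumption)
    src="$data_dir/work/$queue_id"
    dest="$holding/work-$queue_id"

    if [ ! -d "$src" ]; then
        echo "no such work directory: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "refusing to overwrite: $dest" >&2
        return 1
    fi
    # Quoting matters here: the data directory path contains a space.
    mv "$src" "$dest"
}
```

With the client stopped, a call such as `set_aside_queue "$HOME/Library/Application Support/FAHClient" 01 "$HOME/Desktop"` would move only the 01 directory and leave the other work queues alone.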
BuddhaChu
 
Posts: 152
Joined: Wed Apr 16, 2008 2:38 am

