Tcl Source Code

Ticket UUID: 947693
Title: Closing a non-blocking pipe can block on Win
Type: Bug
Version: obsolete: 8.4.6
Submitter: stevebold
Created on: 2004-05-04 13:37:46
Subsystem: 25. Channel System
Assigned To: davygrvy
Priority: 5 Medium
Severity:
Status: Closed
Last Modified: 2004-11-24 04:26:12
Resolution: Fixed
Closed By: davygrvy
Closed on: 2004-11-23 21:26:12
Description:
If I create a command channel and configure it as non-blocking,
what should happen if I close the channel while the child process
continues to run? It seems reasonable that a non-blocking channel
should not block on close, and the manual states that flushing for
a non-blocking channel occurs in the background. I admit it does
not explicitly state that close cannot block in such a case.

The effect I am seeing is that on Solaris 2.8 close does return
immediately, but on Windows XP it blocks.

The traceback at which it blocks is the following:

   WaitForSingleObject
   Tcl_WaitPid(Tcl_Pid_ * 0x00000790, int * 0x0026f1c4, int 0) line 2537
   TclCleanupChildren(Tcl_Interp * 0x00979788, int 1, Tcl_Pid_ * * 0x00996240, Tcl_Channel_ * 0x00995770) line 294 + 21 bytes
   PipeClose2Proc(void * 0x00996338, Tcl_Interp * 0x00979788, int 0) line 2065 + 27 bytes
   CloseChannel(Tcl_Interp * 0x00979788, Channel * 0x00994c28, int 0) line 2277 + 22 bytes
   FlushChannel(Tcl_Interp * 0x00979788, Channel * 0x00994c28, int 0) line 2182 + 17 bytes
   Tcl_Close(Tcl_Interp * 0x00979788, Tcl_Channel_ * 0x00994c28) line 2582 + 15 bytes
   Tcl_UnregisterChannel(Tcl_Interp * 0x00979788, Tcl_Channel_ * 0x00994c28) line 847 + 13 bytes
   Tcl_CloseObjCmd(void * 0x00000000, Tcl_Interp * 0x00979788, int 2, Tcl_Obj * const * 0x0097a964) line 544 + 13 bytes
   TclEvalObjvInternal(Tcl_Interp * 0x00979788, int 2, Tcl_Obj * const * 0x0097a964, const char * 0x00000000, int 0, int 0) line 3084 + 25 bytes
   TclExecuteByteCode(Tcl_Interp * 0x00979788, ByteCode * 0x00995900) line 1401 + 33 bytes
   TclCompEvalObj(Tcl_Interp * 0x00979788, Tcl_Obj * 0x00993570) line 980 + 13 bytes
   Tcl_EvalObjEx(Tcl_Interp * 0x00979788, Tcl_Obj * 0x00993570, int 0) line 4004 + 13 bytes
   Tcl_CatchObjCmd(void * 0x00000000, Tcl_Interp * 0x00979788, int 2, Tcl_Obj * const * 0x0026fb58) line 254 + 18 bytes
   TclEvalObjvInternal(Tcl_Interp * 0x00979788, int 2, Tcl_Obj * const * 0x0026fb58, const char * 0x00992471, int 17, int 0) line 3084 + 25 bytes
   Tcl_EvalEx(Tcl_Interp * 0x00979788, const char * 0x00992278, int 522, int 0) line 3674 + 36 bytes
   Tcl_FSEvalFile(Tcl_Interp * 0x00979788, Tcl_Obj * 0x00993ac8) line 1603 + 19 bytes
   Tcl_Main(int 1, char * * 0x00976cac, int (Tcl_Interp *)* 0x00401005 _Tcl_AppInit) line 292 + 18 bytes
   main() line 117 + 19 bytes
   TCLSH84! mainCRTStartup + 227 bytes
   KERNEL32! 77e7eb69()

Note that, even though the channel is non-blocking,
TclCleanupChildren() calls Tcl_WaitPid() with the options argument
set to 0, i.e. the WNOHANG bit is clear, so Tcl_WaitPid() does block.
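
For readers unfamiliar with the flag, the difference can be shown
with a minimal POSIX sketch in C. This is not Tcl's code:
spawn_sleeper(), poll_child() and demo_nonblocking_wait() are
hypothetical names invented for illustration. With WNOHANG set,
waitpid() returns 0 at once while the child is still running; with
options set to 0 it blocks until the child exits. In the Win32 API
the closest analogue is calling WaitForSingleObject() with a zero
timeout instead of INFINITE.

```c
/* Illustrative POSIX sketch only; not taken from the Tcl sources. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start a child that idles until it is killed. */
static pid_t spawn_sleeper(void)
{
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        for (;;) {
            pause();            /* child: sleep until a signal arrives */
        }
    }
    return pid;
}

/*
 * Hypothetical helper: poll a child without blocking.
 * Returns 0 if it is still running, its pid if it has exited,
 * or -1 on error.
 */
static pid_t poll_child(pid_t pid, int *status)
{
    return waitpid(pid, status, WNOHANG);
}

/* Returns 1 if both calls behaved as described, 0 otherwise. */
static int demo_nonblocking_wait(void)
{
    int status = 0;
    pid_t pid = spawn_sleeper();

    /* WNOHANG set: returns immediately because the child is alive. */
    pid_t first = poll_child(pid, &status);

    /* Terminate the child so that the blocking wait below can finish. */
    kill(pid, SIGKILL);

    /* options == 0: blocks until the child has actually exited. */
    pid_t second = waitpid(pid, &status, 0);

    return first == 0 && second == pid && WIFSIGNALED(status);
}
```

Without the kill() call the second waitpid() would never return,
which is the same situation as the hang in the traceback above.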

I originally encountered this problem with an rsh command that
blocks. However, I can get the same effect just by starting a
tclsh. My test script is as follows:

# Start test script

proc getData {f} {
   if {[eof $f]} {
      set ::runFinished 1
      after cancel $::runTimeoutId
      return
   }

   puts "getData [gets $f]"
}


proc runFailed {args} {
   set ::runFinished 1
   puts "runFailed $args"
}

set timeout 5
set ::runData ""

# Originally doing this with an 'rsh' command that failed to terminate.
# set cmd "rsh dv45ixl ls"

# For simplicity, can just start an interactive tclsh; this waits
# for a command on stdin so it also hangs.
set cmd "tclsh84"

puts "Starting cmd $cmd"
set f [open "|$cmd"]
fconfigure $f -blocking 0
fileevent $f readable "getData $f"
set ::runTimeoutId [after [expr {$timeout * 1000}] runFailed $cmd]
vwait runFinished
unset ::runFinished

puts "closing command channel"
catch {close $f}
puts "Command channel closed"

# End test script

I originally encountered this using my own build of Tcl 
8.4.3 and the traceback relates to this version.
I have also reproduced the problem using ActiveTcl 
8.4.6.
User Comments: davygrvy added on 2004-11-24 04:26:12:
Logged In: YES 
user_id=7549

ok, no reports of problems.. closing. :)

davygrvy added on 2004-11-09 11:12:39:
Logged In: YES 
user_id=7549

patch committed to HEAD (r1.51 of tclWinPipe.c)

Let's leave this open for a bit to see if any new issues
arise because of this change.

davygrvy added on 2004-10-07 05:17:04:
Logged In: YES 
user_id=7549

The child process should see EOF on its stdin, which signals to it
that it should exit.  netstat.exe is one of many command-line
Windows apps that don't follow this "normal" behavior.

As adding [kill] sounds like a painful exercise in
bureaucracy, I'll just drop the exit code for non-blocking
pipes to match the UNIX behavior.

I expect that many existing scripts will have a problem with
this, unfortunately.
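
The "normal" behavior mentioned in this comment can be sketched in
C for POSIX. The sketch is not Tcl's code and close_pipe_and_wait()
is a hypothetical name. It assumes a well-behaved child that reads
its input until EOF and then exits; the exit code 7 is arbitrary.
For such a child, closing the last write end of the pipe is enough
to make a blocking wait return promptly. A child that ignores its
stdin, as netstat is described as doing, would leave the same wait
hanging.

```c
/* Illustrative POSIX sketch only; not taken from the Tcl sources. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: start a child that reads from a pipe until
 * EOF, close the pipe in the parent, then wait for the child.
 * Returns the child's exit code, or -1 on failure.
 */
static int close_pipe_and_wait(void)
{
    int fds[2];
    int status = 0;
    pid_t pid;

    if (pipe(fds) != 0) {
        return -1;
    }

    pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        char buf[64];

        close(fds[1]);          /* child keeps only the read end */
        while (read(fds[0], buf, sizeof buf) > 0) {
            /* discard input until read() reports EOF or an error */
        }
        _exit(7);               /* arbitrary exit code for the demo */
    }

    close(fds[0]);
    close(fds[1]);              /* last write end closed: child sees EOF */

    /* Blocking wait; it returns because the child exits on EOF. */
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that the exit code is only available because the parent
waits. Skipping the wait on a non-blocking close means the exit
code is dropped, which is the trade-off described in this comment.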

dkf added on 2004-06-15 16:15:54:
Logged In: YES 
user_id=79902

I'd suggest that [::tcl::WinKill] or something like that
would not cause too much trouble (private name in Tcl's
private area); it's a global [kill] command that causes much
more trouble.

stevebold added on 2004-06-11 23:13:35:
Logged In: YES 
user_id=810219

The discussion of TIP 88 here:

http://aspn.activestate.com/ASPN/Mail/Browse/Threaded/tcl-core/1252059

suggests that 'kill' is not likely to be accepted as the name
for this and that settling on a name may be difficult. Would
it be easier to separate a fix for the Windows behaviour of
non-blocking close from the interface change to support
process signalling/termination on Windows and UNIX?


Also I'll record two relevant docs that I referenced on
comp.lang.tcl:

http://support.microsoft.com/default.aspx?scid=kb;en-us;178893

claims to offer a clean way to shut down processes. However,
it doesn't seem to work for processes launched from Tcl,
where DETACHED_PROCESS is specified as a flag.

http://www.microsoft.com/msj/0698/win320698.aspx

Explains the helper process needed to make
GenerateConsoleCtrlEvent() work and discusses a much nicer
solution that is only available with Win 2000 and its
successors.

dkf added on 2004-06-08 02:50:50:
Logged In: YES 
user_id=79902

Reminder: a new core command with a public name requires a TIP.

Interim Workaround: call it ::tcl::WinKill or something like
that. :)

davygrvy added on 2004-06-06 09:06:38:
Logged In: YES 
user_id=7549

new test case:

set p [open "|netstat 1" r]
fconfigure $p -blocking 0
read $p
kill $p            ;# new to the core!
close $p

davygrvy added on 2004-06-06 09:04:22:

File Added - 89645: patch.txt

Logged In: YES 
user_id=7549

New patch ready.  It includes a new [kill] command.  Neither docs
nor tests have been added yet.  They'll come next.

davygrvy added on 2004-06-03 05:00:01:
Logged In: YES 
user_id=7549

Here's a good test case:

set p [open "|netstat 1" r]
fconfigure $p -blocking 0
read $p
close $p

davygrvy added on 2004-06-03 02:49:10:

File Added - 89327: patch.txt

davygrvy added on 2004-06-03 02:49:09:
Logged In: YES 
user_id=7549

Attached is a patch that allows non-blocking close behavior, 
but as you'll see the child process is not closed or even 
zombied.

davygrvy added on 2004-06-03 02:12:57:
Logged In: YES 
user_id=7549

Could be, yes.  If we do, we won't get an exitcode.  It's 
arguable that the windows pipe driver isn't doing this 
correctly.  It's been brought up before on c.l.t.

I'll look into it.

stevebold added on 2004-06-02 23:23:54:
Logged In: YES 
user_id=810219

For comparison, using 8.4.3 on Solaris 2.8, waitpid() is called
at the following stack trace:

=>[1] waitpid(0x38b5, 0xffbee17c, 0x40, 0x0, 0x0, 0x0), at 0xff21a418
  [2] Tcl_WaitPid(0x38b5, 0xffbee17c, 0x40, 0x38b5, 0x0, 0x0), at 0x97b58
  [3] Tcl_ReapDetachedProcs(0xf80d0, 0xc0564, 0xc0564, 0x0, 0x21d9c, 0x9782c), at 0x898a8
  [4] PipeCloseProc(0xf3088, 0xd4520, 0x0, 0x0, 0xde080, 0xdd7d8), at 0x978f8
  [5] CloseChannel(0xd23f8, 0xe03b0, 0x0, 0xc0564, 0xe40f8, 0xd4520), at 0x7202c
  [6] FlushChannel(0xd4520, 0x0, 0x0, 0xe40f8, 0x0, 0xe03b0), at 0x71f0c
  [7] Tcl_Close(0x0, 0xe03b0, 0xc0564, 0xe03b0, 0xe40f8, 0xd4520), at 0x72430
  [8] Tcl_UnregisterChannel(0xd4520, 0xe03b0, 0xe40f8, 0xc0564, 0xf80f0, 0xc0564), at 0x710e4
  [9] Tcl_CloseObjCmd(0x0, 0xd4520, 0x2, 0xd7704, 0x791fc, 0xe4f08), at 0x79268
  [10] TclEvalObjvInternal(0xd4520, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x329b0
  [11] TclExecuteByteCode(0xd4520, 0xc355c, 0xc3564, 0xde0ed, 0x0, 0xd7704), at 0x5e83c
  [12] TclCompEvalObj(0x0, 0x16c, 0xd4520, 0xc0564, 0xde080, 0xdd7d8), at 0x5de88
  [13] Tcl_EvalObjEx(0x0, 0xdd7d8, 0x20000, 0xc0564, 0x0, 0xd4520), at 0x33934
  [14] Tcl_RecordAndEvalObj(0x0, 0xdd7d8, 0x20000, 0x20000, 0xd4520, 0x9), at 0x6d894
  [15] Tcl_Main(0xc3554, 0xdcc80, 0xe7320, 0xc0564, 0x0, 0xd4520), at 0x1c110
  [16] main(0x1, 0xffbeeb54, 0xffbeeb5c, 0xc3400, 0x0, 0x0), at 0x1bac0

The UNIX specific function PipeCloseProc() contains this:


    if (pipePtr->isNonBlocking || TclInExit()) {

        /*
         * If the channel is non-blocking or Tcl is being cleaned up, just
         * detach the children PIDs, reap them (important if we are in a
         * dynamic load module), and discard the errorFile.
         */

        Tcl_DetachPids(pipePtr->numPids, pipePtr->pidPtr);
        Tcl_ReapDetachedProcs();


There is no corresponding logic in the Windows-specific equivalent
function PipeClose2Proc(). Should there be?
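
The detach-then-reap pattern that the UNIX excerpt relies on can be
sketched roughly as follows, in C for POSIX. This is a simplified
illustration and not the Tcl implementation: detach_pid(),
reap_detached(), demo_detach_and_reap() and the fixed-size table
are hypothetical. The idea is that close only records the child
PIDs, and a later pass polls each one with WNOHANG, so nothing ever
blocks on a child that is still running.

```c
/* Illustrative POSIX sketch only; not taken from the Tcl sources. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define MAX_DETACHED 16         /* arbitrary table size for the sketch */

static pid_t detached[MAX_DETACHED];
static int num_detached = 0;

/* Remember a child for later reaping; -1 if the table is full. */
static int detach_pid(pid_t pid)
{
    if (num_detached >= MAX_DETACHED) {
        return -1;
    }
    detached[num_detached++] = pid;
    return 0;
}

/*
 * Reap every detached child that has already exited, without
 * blocking. Exit statuses are discarded. Returns the number of
 * children that are still running.
 */
static int reap_detached(void)
{
    int i = 0;

    while (i < num_detached) {
        int status;
        pid_t r = waitpid(detached[i], &status, WNOHANG);

        if (r == 0) {
            i++;                /* still running: keep it for later */
        } else {
            /* exited or waitpid failed: drop the entry */
            detached[i] = detached[--num_detached];
        }
    }
    return num_detached;
}

/* Returns 1 if the running child was kept and later reaped. */
static int demo_detach_and_reap(void)
{
    struct timespec pause_1ms = {0, 1000000};
    int before, after;
    pid_t pid = fork();

    assert(pid >= 0);
    if (pid == 0) {
        for (;;) {
            pause();            /* child: idle until killed */
        }
    }

    detach_pid(pid);
    before = reap_detached();   /* child alive: returns at once with 1 */

    kill(pid, SIGKILL);
    while ((after = reap_detached()) > 0) {
        nanosleep(&pause_1ms, NULL);    /* poll until the child is gone */
    }
    return before == 1 && after == 0;
}
```

The cost of this design is visible in reap_detached(): the exit
status is thrown away, so a caller that closes a non-blocking pipe
this way cannot learn how the child terminated.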

davygrvy added on 2004-06-01 15:10:53:
Logged In: YES 
user_id=7549

Should Tcl_WaitPid be called with WNOHANG when the 
channel is non-blocking?
