Ticket UUID: | 947693 | |||
Title: | Closing a non-blocking pipe can block on Win | |||
Type: | Bug | Version: | obsolete: 8.4.6 | |
Submitter: | stevebold | Created on: | 2004-05-04 13:37:46 | |
Subsystem: | 25. Channel System | Assigned To: | davygrvy | |
Priority: | 5 Medium | Severity: | ||
Status: | Closed | Last Modified: | 2004-11-24 04:26:12 | |
Resolution: | Fixed | Closed By: | davygrvy | |
Closed on: | 2004-11-23 21:26:12 | |||
Description: |
If I create a command channel and configure it as non-blocking, what should happen if I close the channel while the child process continues to run? It seems reasonable that a non-blocking channel should not block on close, and the manual states that flushing for a non-blocking channel occurs in the background. I admit it does not explicitly state that close cannot block in such a case.

The effect I am seeing is that on Solaris 2.8 close does return immediately, but on Windows XP it blocks. The traceback at which it blocks is the following:

    WaitForSingleObject
    Tcl_WaitPid(Tcl_Pid_ * 0x00000790, int * 0x0026f1c4, int 0) line 2537
    TclCleanupChildren(Tcl_Interp * 0x00979788, int 1, Tcl_Pid_ * * 0x00996240, Tcl_Channel_ * 0x00995770) line 294 + 21 bytes
    PipeClose2Proc(void * 0x00996338, Tcl_Interp * 0x00979788, int 0) line 2065 + 27 bytes
    CloseChannel(Tcl_Interp * 0x00979788, Channel * 0x00994c28, int 0) line 2277 + 22 bytes
    FlushChannel(Tcl_Interp * 0x00979788, Channel * 0x00994c28, int 0) line 2182 + 17 bytes
    Tcl_Close(Tcl_Interp * 0x00979788, Tcl_Channel_ * 0x00994c28) line 2582 + 15 bytes
    Tcl_UnregisterChannel(Tcl_Interp * 0x00979788, Tcl_Channel_ * 0x00994c28) line 847 + 13 bytes
    Tcl_CloseObjCmd(void * 0x00000000, Tcl_Interp * 0x00979788, int 2, Tcl_Obj * const * 0x0097a964) line 544 + 13 bytes
    TclEvalObjvInternal(Tcl_Interp * 0x00979788, int 2, Tcl_Obj * const * 0x0097a964, const char * 0x00000000, int 0, int 0) line 3084 + 25 bytes
    TclExecuteByteCode(Tcl_Interp * 0x00979788, ByteCode * 0x00995900) line 1401 + 33 bytes
    TclCompEvalObj(Tcl_Interp * 0x00979788, Tcl_Obj * 0x00993570) line 980 + 13 bytes
    Tcl_EvalObjEx(Tcl_Interp * 0x00979788, Tcl_Obj * 0x00993570, int 0) line 4004 + 13 bytes
    Tcl_CatchObjCmd(void * 0x00000000, Tcl_Interp * 0x00979788, int 2, Tcl_Obj * const * 0x0026fb58) line 254 + 18 bytes
    TclEvalObjvInternal(Tcl_Interp * 0x00979788, int 2, Tcl_Obj * const * 0x0026fb58, const char * 0x00992471, int 17, int 0) line 3084 + 25 bytes
    Tcl_EvalEx(Tcl_Interp * 0x00979788, const char * 0x00992278, int 522, int 0) line 3674 + 36 bytes
    Tcl_FSEvalFile(Tcl_Interp * 0x00979788, Tcl_Obj * 0x00993ac8) line 1603 + 19 bytes
    Tcl_Main(int 1, char * * 0x00976cac, int (Tcl_Interp *)* 0x00401005 _Tcl_AppInit) line 292 + 18 bytes
    main() line 117 + 19 bytes
    TCLSH84! mainCRTStartup + 227 bytes
    KERNEL32! 77e7eb69()

Note that, even though the channel is non-blocking, TclCleanupChildren() calls Tcl_WaitPid() with the options argument set to 0, i.e. the WNOHANG bit is clear, so Tcl_WaitPid() does block.

I originally encountered this problem with an rsh command that blocks. However, I can get the same effect just by starting a tclsh. My test script is as follows:

    # Start test script
    proc getData {f} {
        if {[eof $f]} {
            set ::runFinished 1
            after cancel $::runTimeoutId
            return
        }
        puts "getData [gets $f]"
    }

    proc runFailed {args} {
        set ::runFinished 1
        puts "runFailed $args"
    }

    set timeout 5
    set ::runData ""

    # Originally doing this with an 'rsh' command that failed to terminate.
    # set cmd "rsh dv45ixl ls"
    # For simplicity, can just start an interactive tclsh, this waits
    # for a command on stdin so it also hangs.
    set cmd "tclsh84"
    puts "Starting cmd $cmd"
    set f [open "|$cmd"]
    fconfigure $f -blocking 0
    fileevent $f readable "getData $f"
    set ::runTimeoutId [after [expr $timeout * 1000] runFailed $cmd]
    vwait runFinished
    unset ::runFinished
    puts "closing command channel"
    catch {close $f}
    puts "Command channel closed"
    # End test script

I originally encountered this using my own build of Tcl 8.4.3, and the traceback relates to this version. I have also reproduced the problem using ActiveTcl 8.4.6. | |||
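The difference between a wait with options 0 and a wait with WNOHANG, which the description turns on, can be illustrated with a small sketch. It uses the POSIX waitpid() call directly rather than Tcl_WaitPid(), so it only mirrors the UNIX side of the report, not the Windows WaitForSingleObject path. The helper names spawn_sleeper, poll_child and kill_and_reap are hypothetical and do not exist in Tcl.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: start a child that sleeps for `seconds`, standing in
 * for a command (such as the interactive tclsh in the test script) that does
 * not exit promptly. Returns the child's pid, or -1 if fork() failed.
 */
static pid_t spawn_sleeper(unsigned seconds)
{
    pid_t pid = fork();
    if (pid == 0) {
        sleep(seconds);
        _exit(0);
    }
    return pid;
}

/*
 * Poll the child once without blocking. With WNOHANG set, waitpid() returns
 * 0 straight away while the child is still running; with options == 0 the
 * same call would not return until the child exits.
 */
static pid_t poll_child(pid_t pid, int *status)
{
    assert(pid > 0);
    return waitpid(pid, status, WNOHANG);
}

/*
 * Terminate the child and collect its status with a blocking wait, so that
 * no zombie process is left behind.
 */
static pid_t kill_and_reap(pid_t pid, int *status)
{
    assert(pid > 0);
    kill(pid, SIGKILL);
    return waitpid(pid, status, 0);
}
```

Calling poll_child() on a freshly spawned sleeper returns 0 at once, which is the behaviour a non-blocking close would need; the cost is that no exit status is available at that point.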
User Comments: |
davygrvy added on 2004-11-24 04:26:12:
Logged In: YES user_id=7549
ok, no reports of problems.. closing. :)

davygrvy added on 2004-11-09 11:12:39:
Logged In: YES user_id=7549
patch committed to HEAD (r1.51 of tclWinPipe.c). Let's leave this open for a bit to see if any new issues arise because of this change.

davygrvy added on 2004-10-07 05:17:04:
Logged In: YES user_id=7549
The child process should see EOF on its stdin, signalling to it that it should close. netstat.exe is one of many command-line Windows apps that don't follow this "normal" behavior. As adding [kill] sounds like a painful exercise in bureaucracy, I'll just add the dropping of an exitcode for non-blocking pipes to match the UNIX behavior. I expect that many existing scripts will have a problem with this, unfortunately.

dkf added on 2004-06-15 16:15:54:
Logged In: YES user_id=79902
I'd suggest that [::tcl::WinKill] or something like that would not cause too much trouble (private name in Tcl's private area); it's a global [kill] command that causes much more trouble.

stevebold added on 2004-06-11 23:13:35:
Logged In: YES user_id=810219
The discussion of TIP 88 here:
http://aspn.activestate.com/ASPN/Mail/Browse/Threaded/tcl-core/1252059
suggests that 'kill' is not likely to be accepted as the name for this and that settling on a name may be difficult. Would it be easier to separate a fix for the Windows behaviour of non-blocking close from the interface change to support process signalling/termination on Windows and UNIX?

Also I'll record two relevant docs that I referenced on comp.lang.tcl:

http://support.microsoft.com/default.aspx?scid=kb;en-us;178893
claims to offer a clean way to shut down processes. However, it doesn't seem to work for processes launched from Tcl, where DETACHED_PROCESS is specified as a flag.

http://www.microsoft.com/msj/0698/win320698.aspx
explains the helper process needed to make GenerateConsoleCtrlEvent() work and discusses a much nicer solution that is only available with Win 2000 and its successors.

dkf added on 2004-06-08 02:50:50:
Logged In: YES user_id=79902
Reminder: a new core command with a public name requires a TIP. Interim workaround: call it ::tcl::WinKill or something like that. :)

davygrvy added on 2004-06-06 09:06:38:
Logged In: YES user_id=7549
new test case:

    set p [open "|netstat 1" r]
    fconfigure $p -blocking 0
    read $p
    kill $p    <- new to the core!
    close $p

davygrvy added on 2004-06-06 09:04:22:
File Added - 89645: patch.txt
Logged In: YES user_id=7549
New patch ready. It includes a new [kill] command. Neither docs nor tests have been added yet. They'll come next.

davygrvy added on 2004-06-03 05:00:01:
Logged In: YES user_id=7549
Here's a good test case:

    set p [open "|netstat 1" r]
    fconfigure $p -blocking 0
    read $p
    close $p

davygrvy added on 2004-06-03 02:49:10:
File Added - 89327: patch.txt

davygrvy added on 2004-06-03 02:49:09:
Logged In: YES user_id=7549
Attached is a patch that allows non-blocking close behavior, but as you'll see the child process is not closed or even zombied.

davygrvy added on 2004-06-03 02:12:57:
Logged In: YES user_id=7549
Could be, yes. If we do, we won't get an exitcode. It's arguable that the Windows pipe driver isn't doing this correctly. It's been brought up before on c.l.t. I'll look into it.

stevebold added on 2004-06-02 23:23:54:
Logged In: YES user_id=810219
For comparison, using 8.4.3 on Solaris 2.8, waitpid() is called at the following stack trace:

    =>[1] waitpid(0x38b5, 0xffbee17c, 0x40, 0x0, 0x0, 0x0), at 0xff21a418
    [2] Tcl_WaitPid(0x38b5, 0xffbee17c, 0x40, 0x38b5, 0x0, 0x0), at 0x97b58
    [3] Tcl_ReapDetachedProcs(0xf80d0, 0xc0564, 0xc0564, 0x0, 0x21d9c, 0x9782c), at 0x898a8
    [4] PipeCloseProc(0xf3088, 0xd4520, 0x0, 0x0, 0xde080, 0xdd7d8), at 0x978f8
    [5] CloseChannel(0xd23f8, 0xe03b0, 0x0, 0xc0564, 0xe40f8, 0xd4520), at 0x7202c
    [6] FlushChannel(0xd4520, 0x0, 0x0, 0xe40f8, 0x0, 0xe03b0), at 0x71f0c
    [7] Tcl_Close(0x0, 0xe03b0, 0xc0564, 0xe03b0, 0xe40f8, 0xd4520), at 0x72430
    [8] Tcl_UnregisterChannel(0xd4520, 0xe03b0, 0xe40f8, 0xc0564, 0xf80f0, 0xc0564), at 0x710e4
    [9] Tcl_CloseObjCmd(0x0, 0xd4520, 0x2, 0xd7704, 0x791fc, 0xe4f08), at 0x79268
    [10] TclEvalObjvInternal(0xd4520, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x329b0
    [11] TclExecuteByteCode(0xd4520, 0xc355c, 0xc3564, 0xde0ed, 0x0, 0xd7704), at 0x5e83c
    [12] TclCompEvalObj(0x0, 0x16c, 0xd4520, 0xc0564, 0xde080, 0xdd7d8), at 0x5de88
    [13] Tcl_EvalObjEx(0x0, 0xdd7d8, 0x20000, 0xc0564, 0x0, 0xd4520), at 0x33934
    [14] Tcl_RecordAndEvalObj(0x0, 0xdd7d8, 0x20000, 0x20000, 0xd4520, 0x9), at 0x6d894
    [15] Tcl_Main(0xc3554, 0xdcc80, 0xe7320, 0xc0564, 0x0, 0xd4520), at 0x1c110
    [16] main(0x1, 0xffbeeb54, 0xffbeeb5c, 0xc3400, 0x0, 0x0), at 0x1bac0

The UNIX specific function PipeCloseProc() contains this:

    if (pipePtr->isNonBlocking || TclInExit()) {
        /*
         * If the channel is non-blocking or Tcl is being cleaned up, just
         * detach the children PIDs, reap them (important if we are in a
         * dynamic load module), and discard the errorFile.
         */

        Tcl_DetachPids(pipePtr->numPids, pipePtr->pidPtr);
        Tcl_ReapDetachedProcs();

There is no corresponding logic in the Windows specific equivalent function PipeClose2Proc(). Should there be?

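The detach-then-reap idea in the PipeCloseProc() excerpt can be sketched in isolation as follows. This is not Tcl's implementation of Tcl_DetachPids() or Tcl_ReapDetachedProcs(): the fixed-size table and the names detach_pid, reap_detached and spawn_sleeper are assumptions made for illustration. It only shows how a close path can hand its children to a list that is later swept with non-blocking waits.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_DETACHED 16

/* Hypothetical table of children that nobody is waiting on any more. */
static pid_t detached[MAX_DETACHED];
static size_t num_detached = 0;

/* Hypothetical helper: start a child that exits after `seconds`. */
static pid_t spawn_sleeper(unsigned seconds)
{
    pid_t pid = fork();
    if (pid == 0) {
        sleep(seconds);
        _exit(0);
    }
    return pid;
}

/* Record a child for later reaping. Returns 0, or -1 if the table is full. */
static int detach_pid(pid_t pid)
{
    assert(pid > 0);
    if (num_detached == MAX_DETACHED) {
        return -1;
    }
    detached[num_detached++] = pid;
    return 0;
}

/*
 * Sweep the table once without blocking. Children that have exited are
 * reaped and dropped; children still running stay in the table. Returns the
 * number of entries that remain.
 */
static size_t reap_detached(void)
{
    size_t kept = 0;
    for (size_t i = 0; i < num_detached; i++) {
        int status;
        pid_t r = waitpid(detached[i], &status, WNOHANG);
        if (r == 0) {
            detached[kept++] = detached[i];  /* still running, keep it */
        }
        /* r == pid: reaped. r == -1: not our child. Both are dropped. */
    }
    num_detached = kept;
    return kept;
}
```

A close path built this way returns immediately and discards the exit status, which is the trade-off discussed in this thread; repeated calls to reap_detached() eventually clear the table once the children exit.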
davygrvy added on 2004-06-01 15:10:53:
Logged In: YES user_id=7549
Should Tcl_WaitPid be called with WNOHANG when the channel is non-blocking? |
Attachments:
- patch.txt added by davygrvy on 2004-06-06 09:04:22.