Tcl Source Code

Ticket UUID: 717848
Title: Notifier lock in sub-process on windows
Type: Bug
Version: obsolete: 8.4.2
Submitter: andreas_kupries
Created on: 2003-04-08 22:20:28
Subsystem: 01. Notifier
Assigned To: patthoyts
Priority: 4
Severity:
Status: Open
Last Modified: 2005-11-04 04:56:03
Resolution: Works For Me
Closed By:
Closed on:
Description:
When running the attached "main.tcl" script the sub 
process started by it ("sub.tcl") will cease processing 
any type of event [x].

Deactivation of the "fileevent readable" command 
in "sub.tcl" causes the problem to disappear.

[x] The problem was initially caught in the testsuite for 
the pop3 module/package in tcllib. The command [info 
hostname], called in the 'accept' command of a 
listening socket, locked up and did not return. Trying to 
defer the offending code via [after] failed. The [after] command 
was executed, but its script never triggered (despite the fact 
that the application was in an event loop). Executing [info 
hostname] before the 'fileevent readable' causes the 
lockup to disappear too.

Theories:
* [info hostname] is ok before [fileevent], caches the 
result, and is therefore fine afterward.
* [fileevent readable] messes up the notifier so badly that 
not even "after" events are processed anymore.
User Comments:

davygrvy added on 2004-05-06 04:17:13:

File Added - 86203: sub.tcl

davygrvy added on 2004-05-06 04:17:12:

If I change sub.tcl according to the attachment, it doesn't 
lock up with 8.4.6.  Theory: the script sets up [fileevent stdin 
readable done], but proc done does not close the channel itself, 
so Tcl keeps calling [done] forever, trying to close stdin and 
eating all the CPU.

I'll leave this bug open, as I don't know what to do next 
about this.
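
A minimal sketch of the behaviour behind this theory, not the attached sub.tcl: a readable handler keeps firing once the channel reaches end-of-file, unless the handler unregisters itself or closes the channel. The proc name on_readable, the ::received list, and the done-variable argument are hypothetical names chosen for illustration.

```tcl
# Hypothetical handler; names are illustrative and not taken from sub.tcl.
# Collects complete lines and stops watching the channel at end-of-file.
proc on_readable {chan donevar} {
    if {[gets $chan line] >= 0} {
        # A complete line was read; remember it and wait for the next event.
        lappend ::received $line
        return
    }
    if {[eof $chan]} {
        # Without these two steps the handler would be called again and
        # again, because a channel at EOF always counts as readable.
        fileevent $chan readable {}
        close $chan
        # Signal the waiting [vwait] through a global variable.
        uplevel #0 [list set $donevar 1]
    }
}

# Possible usage in a sub-process (commented out so the block only defines
# the handler):
#   fileevent stdin readable [list on_readable stdin ::finished]
#   vwait ::finished
```

Whether this is what actually happens in the reported lockup is exactly the open question in this ticket; the sketch only shows the usual way to keep a readable handler from spinning at EOF.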

bsdfan3 added on 2003-08-12 22:15:52:

How about sending a SIGCHLD (er, the Windows equivalent) to 
the parent process when the child is ready?

andreas_kupries added on 2003-04-09 23:38:58:

Good, second person to replicate this lockup, so definitely a 
reproducible bug (First was Jeff, after I was able to create this 
condensed version of the problem).


Right, immediately after [open] returns, the server doesn't have 
to be up. We could insert a delay, say one second, in main.tcl to 
make this race unlikely (or loop the socket command until 
there is no connect error). ... In the original code the port of 
the listening socket was written back over the pipe, telling the 
main process when it was up. 
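
The "loop the socket command" idea could look roughly like the following. This is a sketch under assumptions, not code from main.tcl: the proc name connect_with_retry and its attempts and delay_ms parameters are hypothetical.

```tcl
# Hypothetical helper; not part of the attached main.tcl.
# Tries to connect until the server accepts or the attempts run out.
proc connect_with_retry {host port {attempts 10} {delay_ms 200}} {
    set result "no attempts made"
    for {set i 0} {$i < $attempts} {incr i} {
        # [socket] raises an error while nothing is listening yet.
        if {![catch {socket $host $port} result]} {
            return $result
        }
        # Plain blocking sleep before the next attempt.
        after $delay_ms
    }
    error "could not connect to $host:$port after $attempts attempts: $result"
}
```

A retry loop only narrows the race window. Having the sub-process report its port back, as the original code did, tells the main process definitively when the server is listening.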

My rewrite of the testsuite now makes use of a special 
microserver which uses a second socket for the control 
connection, and the sub process connects back to the 
spawning process. This works around the lock. It also means 
that I know exactly when the process is up (= connect back 
has been done), and when it has started to listen (= port 
number came back over the control connection).

davygrvy added on 2003-04-09 15:05:56:

Yup, happens here too.  Needs investigating.  It locks up 
rather stiffly.

One comment on the example, though.  Just because [open] 
came back doesn't mean the server in the sub-process is up 
yet.  I don't think there is a guarantee of that, even though it 
looks to be happening.

andreas_kupries added on 2003-04-09 05:21:22:

File Added - 47169: sub.tcl

andreas_kupries added on 2003-04-09 05:20:28:

File Added - 47167: main.tcl

Attachments: