Tcl Source Code

View Ticket
Login
Ticket UUID: 2270477
Title: TclFinalizeIOSubsystem can loop endlessly
Type: Bug Version: obsolete: 8.5.5
Submitter: nobody Created on: 2008-11-12 16:34:46
Subsystem: 25. Channel System Assigned To: andreas_kupries
Priority: 9 Immediate Severity:
Status: Closed Last Modified: 2008-12-03 01:25:18
Resolution: Fixed Closed By: andreas_kupries
    Closed on: 2008-12-02 18:25:18
Description:
TclFinalizeIOSubsystem contains a loop which runs until all channels in the linked list have the CHANNEL_DEAD flag set.

If any channels are not referenced by interpreters, the loop calls Tcl_Close() to close the channel and, presumably, remove it from the linked list.  However, under some circumstances, Tcl_Close does not remove the channel from the list, and instead schedules it for removal at a later time.  If this happens, TclFinalizeIOSubsystem will spin forever.

I have seen this happen when expect exits while running a test suite under dejagnu.  Unfortunately, I do not have a simple reproduction case.  But perhaps the loop should guard against this case by checking for CHANNEL_CLOSED as well as CHANNEL_DEAD?  I really don't understand the design well enough to be sure.
User Comments: andreas_kupries added on 2008-12-03 01:25:15:
Done now.

andreas_kupries added on 2008-12-02 06:36:40:
Ok, I will, albeit tomorrow as well.

ferrieux added on 2008-12-02 06:29:44:
You inferred correctly. In hindsight, that first attempt of mine was terrible. The second attempt is after spending some time reverse engineering that stuff, so I _think_ it is much more gentle. To be sure that it fixes the original issue is another story though (suggestions welcome to repro).

However I'd prefer that you do the commit, because I have only a 8.6 sandbox ready, and it is cluttered with other patches (notably the half close)... and it's being late :-)

andreas_kupries added on 2008-12-02 05:32:27:
Alexandre's, from your comments I infer that we should apply iofinloop2.patch to the 8.5 and 8.6 branches as well, despite them not crashing, at least not right now. Right ? If yes, can you do that ?

ferrieux added on 2008-12-02 05:04:23:
Yes :)
Notice that the second hunk of Andreas' modified patch only applies to the latest 8.4.x (where he had applied my unfortunate first trial). If you want to apply to 8.4.19, remove that second hunk. The only thing left is:

--- tcl84.orig/generic/tclIO.c2008-12-01 13:36:03.000000000 -0800
+++ tcl84/generic/tclIO.c2008-12-01 13:32:36.000000000 -0800
@@ -235,7 +235,7 @@
      statePtr != NULL;
      statePtr = statePtr->nextCSPtr) {
     chanPtr = statePtr->topChanPtr;
-    if (!(statePtr->flags & CHANNEL_DEAD)) {
+    if (!(statePtr->flags & (CHANNEL_INCLOSE|CHANNEL_CLOSED|CHANNEL_DEAD))) {
 active = 1;
 break;
     }

andreas_kupries added on 2008-12-02 04:40:27:

File Added - 303654: iofinloop84.patch

andreas_kupries added on 2008-12-02 04:26:22:
AFAIK the original patch was for 8.6, and applied to 8.5 as well.
Both branches use macros to handle flag modifications.
In 8.4 the flag modifications are written out.
Manual application is needed :(

dgp added on 2008-12-02 04:21:50:
Something's wrong, or I'm missing something.

These patches don't appear to apply to Tcl 8.4.19+,
which is where I have the crashes.

andreas_kupries added on 2008-12-02 00:15:36:
I can certainly revert ... Ah, I see a new patch was done. 

The interesting thing is that I ran the test suite here (linux) after applying the patch and had no errors at all. For all branches. I am guessing that you (Don) are either on a different platform, or copiled the core with different settings than I did.

Which also means that right I am not a good place to check the new patch. I will not know if my error-less state is because the new patch fixed anything or because whatever else.

Don, can you please try Alexandre's new patch and report if that fixes the issue for you ?

ferrieux added on 2008-12-02 00:08:25:
The attached patch, now done after a more careful review of the code ;-), should do the job.
Indeed the previous one was dangerous, in that it could truly demote channels from already-closed to pending-close...

This one, as suggested by the OP in fact, skips any of: CHANNEL_CLOSED,CHANNEL_INCLOSE,CHANNEL_DEAD.
It does pass the unix test suite !

File Added: iofinloop2.patch

ferrieux added on 2008-12-02 00:08:24:

File Added - 303625: iofinloop2.patch

ferrieux added on 2008-12-01 23:19:37:
Reopening because the patch breaks some tests on unix, see [Bug 2371600].
Andreas, can you revert on all branches ?

andreas_kupries added on 2008-11-26 05:18:49:
Accepted, and committed to 8.4, 8.5, and head branches.

ferrieux added on 2008-11-17 07:48:19:

File Added - 301722: iofinloop.patch

Indeed I'm wondering how the loop is supposed to break out when that branch is activated.
It would seem that the attached patch, setting the CHANNEL_DEAD flag also in this case, would fix the bug. It does pass the test suite, though I don't know whether finalization is really covered in today's tests.

Andreas, please review.




File Added: iofinloop.patch

Attachments: