Tcl Source Code

View Ticket
Login
Ticket UUID: 3574493
Title: TclpFinalizeSockets hangs on windows
Type: Bug Version: obsolete: 8.5.12
Submitter: ralfixx Created on: 2012-10-04 15:06:15
Subsystem: 27. Channel Types Assigned To: dgp
Priority: 9 Immediate Severity:
Status: Closed Last Modified: 2012-11-08 00:30:53
Resolution: Fixed Closed By: dgp
    Closed on: 2012-11-07 17:30:53
Description:
This seems related to the change introduced by  ID: 1437595.

When running the attached code on Linux, the program opens the server
socket, waits one second and then exits as expected.

When running the same code on Windows-7 against tcl 8.3, the program
behaves as above on Linux.

When running the same code on Windows-7 against tcl 8.5 (Activestate, 
or non-threaded makes no difference) the program blocks
in the _exit() call.  

The reason is to be seen in TclFinalize() where on
Windows it waits for the socket thread to exit when unloading the TCL
DLL, but the corresponding thread either does no longer exist, or does
not get the Message, or does not respond to it:

win/tclWin32Dll.c:
  DLLMain()
        case DLL_PROCESS_DETACH:
        ...
    Tcl_Finalize();
=>
generic/TclEvent.c
 Tcl_Finalize()
   => Tcl_FinalizeThread()
      => TclFinalizeIOSubsystem();
         => TclpFinalizeSockets();

=>
win/tclWinSock.c
 TclpFinalizeSockets()
 => PostMessage(tsdPtr->hwnd, SOCKET_TERMINATE, 0, 0);
   // this blocks:
    WaitForSingleObject(tsdPtr->readyEvent, INFINITE);

The PostMessage() return code is 0 at that point, indicating an error, 
the corresponding error is 1400/invalid window handle.  
If I change the code to

   if (res) WaitForSingleObject(tsdPtr->readyEvent, INFINITE);

the program terminates as expected.  It seems an error to call
WaitForSingleObject() if the PostMessage() call did not succeed, since
the Wait() relies on the other thread to signal the Wait().  
If the other thread does not get the call, it cannot signal...

Proposed patch against tcl8.5.12:

--- tcl8.5.12/win/tclWinSock.c2012/09/18 18:43:471.1
+++ tcl8.5.12/win/tclWinSock.c2012/10/04 15:04:50
@@ -461,14 +461,14 @@
     if (tsdPtr != NULL) {
 if (tsdPtr->socketThread != NULL) {
     if (tsdPtr->hwnd != NULL) {
-PostMessage(tsdPtr->hwnd, SOCKET_TERMINATE, 0, 0);
+int res = PostMessage(tsdPtr->hwnd, SOCKET_TERMINATE, 0, 0);
 
 /*
  * Wait for the thread to exit. This ensures that we are
  * completely cleaned up before we leave this function.
  */
 
-WaitForSingleObject(tsdPtr->readyEvent, INFINITE);
+if (res) WaitForSingleObject(tsdPtr->readyEvent, INFINITE);
 tsdPtr->hwnd = NULL;
     }
     CloseHandle(tsdPtr->socketThread);
User Comments: dgp added on 2012-11-08 00:30:53:

allow_comments - 1

patched for 8.4.20 and 8.5.13.

dgp added on 2012-11-07 23:46:38:
apn points out to me that removing the Tcl_Finalize()
call from DllMain() is the solution on the Tcl trunk (8.6).

With Tcl 8.6, we apparently more explicitly put the
burden of finalization on the embedding program,
instead of trying to make automagic happen as the
Tcl Dll gets unloaded.

I'll commit the patch on the core-8-5-branch as an
additional layer of workaround, but programs like
the demo here also ought to be revised to take care
of Tcl finalization.

dgp added on 2012-11-07 23:25:32:
Thanks to apn for help debugging.  apn points to
docs saying synchronization calls are not welcome
in the routines called by DllMain().

So either Tcl finalization ought not indulge in any
synchronization matters, or....

.... Tcl_Finalize() ought not be called from DllMain().

The submitted patch is fine, but doesn't really address
the fundamental matter in either of these ways.

dgp added on 2012-11-07 23:17:04:
In the demo program, replacing the call _exit(0)
with a call to Tcl_Exit(0) will fix things, yes?

dkf added on 2012-11-06 05:01:46:
(The style issue is not a blocker, BTW.)

dkf added on 2012-11-06 05:01:19:
You mistake me for someone who understands what's going on. I read the patch a month ago and noted that it was not "in style" especially; I don't understand the semantics (beyond what I can grep for) and don't really use Tcl on Windows very much at the moment anyway.

ralfixx added on 2012-11-06 01:40:46:

File Added - 456039: tclWinSock.c.diff

ralfixx added on 2012-11-06 01:40:09:
The patch should be applied.

Unfortunately I currently have no resources to investigate the "why" as proposed by Alexandre :-/

@David: the usage as in the example program is indeed 'embedded' with no active TCL interpreter, but the error also manifests when a hard exit (C++ exception or signal) tears down the TCL application without properly releasing the TCL resources first.

Since the logic of the patch seems clear it should do no harm in regular usage of the TCL library.

dgp added on 2012-11-06 01:19:02:
Please report whether action on this is needed
before an 8.5.13 release.

davygrvy added on 2012-10-06 05:05:28:
unloading the DLL, for which a proper Tcl_Finalize() wasn't done before hand, is my guess.  This usage of Tcl here is probably embedding and the app isn't being clean.  I've seen this problem before with other areas of Tcl when the finalize logic is manipulated subtly, but has major affect in the abortive teardown.

ferrieux added on 2012-10-06 04:50:06:
Re "it would be nice to know *why*": agreed.
Could you please set breakpoints/printfs at strategic places like SocketExitHandler , TclpFinalizeSocket, or after the "while" loop in SocketThread ? I suspect one of the first two is called twice.
Also, what about 8.6 ?

ralfixx added on 2012-10-05 15:14:09:
Of course the terse form Donal suggested is preferable.  I had introduced the intermediate var before looking at the exact signature of PostMessage().
And of course it would be nice to know *why* the handle is already invalid at that point, but...

dkf added on 2012-10-05 02:58:17:
I'd be tempted to write that as:

  if (PostMessage(...)) {
      /* the comment... */
      WaitForSingleObject(...);
  }
  tsdPtr->hwnd = NULL;

YMMV. At least the result of PostMessage() is defined to be boolean...

davygrvy added on 2012-10-05 00:37:08:
Looks great!

ferrieux added on 2012-10-04 22:36:16:
Sounds reasonable to me, but my knowledge of Windows (if any) is waning, so please Dave can you validate ?

ralfixx added on 2012-10-04 22:06:21:

File Added - 454320: t.c

Attachments: