Ticket UUID: | 593810 | |||
Title: | Channel Transfer crashes | |||
Type: | Bug | Version: | obsolete: 8.4b1 | |
Submitter: | nobody | Created on: | 2002-08-11 21:30:16 | |
Subsystem: | 49. Threading | Assigned To: | das | |
Priority: | 9 Immediate | Severity: | ||
Status: | Closed | Last Modified: | 2003-07-19 03:23:15 | |
Resolution: | Fixed | Closed By: | andreas_kupries | |
Closed on: | 2003-07-18 20:23:15 | |||
Description: |
Hello, I am writing a concurrent Tcl server using threads. I have a thread-enabled Tcl core and Thread extension 2.4. I am doing the following:
1. Open a socket and create a thread.
2. Transfer the socket to the created thread.
3. Send the script "puts sockid abc; flush sockid" to the created thread.
It crashes, and I do not know why. Could you please look into the problem and write back to me? Here is my code:
    % socket localhost 35000
    sock460
    % pwd
    % load thread.dll
    % ::thread::create
    1304
    % ::thread::transfer 1304 sock460
    % ::thread::send 1304 "puts sock460 line; flush sock460"
Thanks, yasar | |||
User Comments: |
andreas_kupries added on 2003-07-19 03:21:46:
Logged In: YES user_id=75003
Daniel, did you test the changes?

hobbs added on 2003-07-19 03:21:10:
Logged In: YES user_id=72656
Moved to pending since we haven't heard - assuming functional on Mac.

andreas_kupries added on 2003-04-23 06:24:32:
Logged In: YES user_id=75003
Reassigning to Daniel for test of Mac changes.

andreas_kupries added on 2003-04-23 02:48:43:
Logged In: YES user_id=75003
See also [ 718045 ] Closing transferred channel crashes app.

davygrvy added on 2002-11-11 03:42:21:
File Added - 35071: patch.txt
Logged In: YES user_id=7549
Andreas:
>The true fix however is to extend the channel driver with an
>init-function which can be used by channels during
>registration in an interp to ensure that their driver is initialized
>in the thread of said interp.
Yes, exactly. Does this issue exist in the other channel types, too? Should we generalize another entry in the Tcl_ChannelType struct just for this purpose? See uploaded patch file for my idea in code.

davygrvy added on 2002-11-11 01:30:32:
Logged In: YES user_id=7549
*** generic/tclIO.c 30 Jul 2002 18:36:25 -0000 1.57
--- generic/tclIO.c 10 Nov 2002 10:30:52 -0000
***************
*** 771,776 ****
--- 771,785 ----
          panic("Tcl_RegisterChannel: duplicate channel names");
      }
      Tcl_SetHashValue(hPtr, (ClientData) chanPtr);
+ #ifdef __WIN32__
+     if (! strcmp(chanPtr->typePtr->typeName, "tcp")) {
+         /*
+          * Just in case, force per-thread initialization to happen
+          * so the socket event handler thread gets created.
+          */
+         TclpHasSockets(NULL);
+     }
+ #endif
  }
  statePtr->refCount++;
  }
That seems to do it, but is rather "bad style".

zoro2 added on 2002-11-09 04:49:23:
Logged In: YES user_id=191529
The workaround for the problem Andreas describes is unfortunately incomplete. Below is a testing script that SHOULD work. It stops when trying to do a gets on the channel, whereas when I use my *sockPtr=0 hack, everything seems to work OK.
    package require Thread
    set id [thread::create]
    thread::send $id { close [socket -server puts 0] }

    proc d {sock args} {
        after idle [list d0 $sock]
    }

    proc d0 {sock} {
        global id
        thread::send $id [list set sock $sock]
        thread::send $id [list set tid [thread::id]]
        thread::transfer $id $sock
        thread::send -async $id {
            puts $sock "HI"
            flush $sock
            thread::send -async $tid [list puts SENTHI]
            puts $sock [gets $sock]
            thread::send -async $tid [list puts SENTLINE]
            flush $sock
            close $sock
            thread::send -async $tid [list puts DONE]
        }
    }

    socket -server d 12345

    set next [thread::create]
    thread::send $next [list set tid [thread::id]]
    thread::send -async $next {
        package require Thread
        if {[catch {
            after 2000
            set s [socket 127.0.0.1 12345]
            puts $s TEST; flush $s
        } err]} {
            thread::send -async $::tid [list puts "ERROR: $::errorInfo"]
        }
    }

andreas_kupries added on 2002-08-21 02:46:07:
Logged In: YES user_id=75003
Found the problem. When a socket is created in a thread, the socket driver is initialized for that thread, in particular its TSD slot. Call sequence:
    SocketObjCmd => TclpHasSockets => InitSocket
Now if a thread is created and no socket is created in it, nothing is initialized. The channel transfer then inserts a socket into the thread, but this does not run any code to completely initialize the driver. Hence the TSD slot is uninitialized, and thus the crash.
Because of the workaround described above (create and destroy a temp socket in the thread before transferring sockets) the priority will go down.
The true fix however is to extend the channel driver with an init-function which can be used by channels during registration in an interp to ensure that their driver is initialized in the thread of said interp.

andreas_kupries added on 2002-08-21 02:13:42:
Logged In: YES user_id=75003
New datapoint: Start thread-enabled tclsh (in MSVC++ debugger), set a breakpoint in file tclWinSock, line 1776. This is where the core retrieves the tsdPtr for the SendMessage stuff later.
Source non-crashing script, step into the tsd retrieval. The ultimate routine is TlsGetValue, presumably provided by Windows. I can't step into it. Here things are ok.
Now source the crashing script, do not change interpreters. Step into the retrieval again. All arguments etc. are the same as before, but now TlsGetValue returns NULL.
The reason is unknown. It is suspected that somewhere some memory went haywire. Couldn't prove this however. 'memory validate on' does not trigger anything before we hit the crash (Yes, I used tcl/threads compiled with TCL_MEM_DEBUG).
I declare this a Windows-specific bug for now, because of TlsGetValue, and the info by Don Porter that the attached crashing script is running ok on his Linux/Alpha.

andreas_kupries added on 2002-08-21 02:06:45:
File Added - 29445: ttx

andreas_kupries added on 2002-08-21 02:06:18:
File Added - 29444: ttt

andreas_kupries added on 2002-08-21 02:06:17:
Logged In: YES user_id=75003
Attaching the scripts I used for testing.

andreas_kupries added on 2002-08-21 00:42:22:
Logged In: YES user_id=75003
Confirmed for Win'2K.
Stack trace:
    TcpOutputProc(void * 0x007d5fc0, const char * 0x008db0d0, int 6, int * 0x0150f994) line 1803 + 14 bytes
    FlushChannel(Tcl_Interp * 0x00000000, Channel * 0x007d5f70, int 0) line 2066 + 38 bytes
    Tcl_Flush(Tcl_Channel_ * 0x007d5f70) line 5104 + 13 bytes
    Tcl_FlushObjCmd(void * 0x00000000, Tcl_Interp * 0x007d5490, int 2, Tcl_Obj * const * 0x0150fbec) line 194 + 9 bytes
    TclEvalObjvInternal(Tcl_Interp * 0x007d5490, int 2, Tcl_Obj * const * 0x0150fbec, const char * 0x007d87f3, int 14, int 0) line 3033 + 25 bytes
    Tcl_EvalEx(Tcl_Interp * 0x007d5490, const char * 0x007d87e0, int 33, int 131072) line 3632 + 42 bytes
    ThreadSendEval(Tcl_Interp * 0x007d5490, void * 0x007d9610) line 1250 + 27 bytes
    ThreadEventProc(Tcl_Event * 0x007d89a0, int -3) line 2386 + 13 bytes
    Tcl_ServiceEvent(int -3) line 618 + 11 bytes
    Tcl_DoOneEvent(int -3) line 921 + 9 bytes
    ThreadWait() line 2189 + 14 bytes
    ThreadWaitObjCmd(void * 0x00000000, Tcl_Interp * 0x007d5490, int 1, Tcl_Obj * const * 0x0150ff08) line 955
    TclEvalObjvInternal(Tcl_Interp * 0x007d5490, int 1, Tcl_Obj * const * 0x0150ff08, const char * 0x007d9b90, int 12, int 0) line 3033 + 25 bytes
    Tcl_EvalEx(Tcl_Interp * 0x007d5490, const char * 0x007d9b90, int 12, int 0) line 3632 + 42 bytes
    Tcl_Eval(Tcl_Interp * 0x007d5490, const char * 0x007d9b90) line 3796 + 17 bytes
    NewThread(void * 0x0012f640) line 1472 + 23 bytes
    KERNEL32! 77e8758a()
Dereferencing a NULL pointer in TcpOutputProc. tsdPtr is NULL. tsd = Thread-specific Data. That is something which should never be NULL.

andreas_kupries added on 2002-08-20 09:48:52:
Logged In: YES user_id=75003
The resource which is not found anymore is an interpreter. Possibly the interpreter performing the after script. ... Just checked, this happens without sockets and without transferring them. In other words, this is a different problem than the one shown in this report. Creating a new SF entry: #597575.
I will have to check this on a Windows platform.
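
The diagnosis in the comments above (a thread that never created a socket has no initialized thread-specific data for the socket driver, so a transferred channel finds a NULL tsdPtr) and the proposed fix (a per-thread init function run when a channel is registered) can be illustrated with a small C sketch. It is only a model under stated assumptions: it uses POSIX threads instead of the Windows TLS calls in tclWinSock, and every name in it (DriverTsd, ChannelType, threadInitProc, RegisterChannel, TsdPresentAfterTransfer) is hypothetical and not part of the Tcl API or of the patch attached to this ticket.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread driver state, standing in for the socket
 * driver's thread-specific data (TSD). */
typedef struct { int ready; } DriverTsd;

static pthread_key_t tsdKey;
static pthread_once_t keyOnce = PTHREAD_ONCE_INIT;

static void MakeKey(void) { pthread_key_create(&tsdKey, free); }

/* Returns the calling thread's driver state, or NULL if the driver
 * was never initialized in this thread. */
static DriverTsd *GetTsd(void) {
    pthread_once(&keyOnce, MakeKey);
    return pthread_getspecific(tsdKey);
}

/* Idempotent per-thread initialization (the role the ticket assigns
 * to the socket driver's init code). */
static void DriverThreadInit(void) {
    if (GetTsd() == NULL) {
        DriverTsd *tsd = calloc(1, sizeof *tsd);
        if (tsd == NULL) abort();
        tsd->ready = 1;
        pthread_setspecific(tsdKey, tsd);
    }
}

/* Hypothetical channel type with an optional per-thread init hook. */
typedef struct {
    const char *typeName;
    void (*threadInitProc)(void);   /* may be NULL */
} ChannelType;

/* Hypothetical registration step: runs the hook, if any, in the
 * thread that is receiving the channel. */
static void RegisterChannel(const ChannelType *type) {
    if (type->threadInitProc != NULL) type->threadInitProc();
}

typedef struct { const ChannelType *type; int tsdPresent; } Job;

static void *Worker(void *arg) {
    Job *job = arg;
    RegisterChannel(job->type);            /* channel arrives in new thread */
    job->tsdPresent = (GetTsd() != NULL);  /* what an output proc would see */
    return NULL;
}

/* Registers a channel of the given type in a fresh thread and reports
 * whether that thread has driver state afterwards (1) or not (0). */
int TsdPresentAfterTransfer(const ChannelType *type) {
    Job job = { type, 0 };
    pthread_t thread;
    if (pthread_create(&thread, NULL, Worker, &job) != 0) abort();
    pthread_join(thread, NULL);
    return job.tsdPresent;
}

/* Without a hook the receiving thread is left uninitialized. */
const ChannelType tcpNoHook = { "tcp", NULL };
/* With a hook the driver is initialized during registration. */
const ChannelType tcpWithHook = { "tcp", DriverThreadInit };
```

Calling TsdPresentAfterTransfer(&tcpNoHook) models the crashing case: the receiving thread has no driver state, which is where the real code dereferences NULL. Calling it with &tcpWithHook models the proposed init-function. Build with -pthread.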

andreas_kupries added on 2002-08-20 09:33:08:
Logged In: YES user_id=75003
Tried to replicate on Linux/x86. Used the smtp port instead of 35000 to ensure that the socket truly exists. My script is:

    set s [socket localhost smtp] ; # connect to smtp mail
    package require Thread

    puts [info loaded]
    puts [pwd]

    set t [::thread::create]

    ::thread::transfer $t $s
    ::thread::send $t "puts $s line; flush $s"

I.e. this is not interactive, but executed via
    tclsh ./testscript
No problem. Then I added the following lines to the script, at the end:

    after 5000 "::thread::send $t exit"
    vwait forever

Now every once in a while the script does abort. The error message is:
    Tcl_Release couldn't find reference for 0x80533f8
    Aborted (core dumped)
In other words, a panic somewhere. The stack trace below indicates the handling of the timer event:
    #0 0x4014b7b1 in kill () from /lib/libc.so.6
    #1 0x400f5e5e in pthread_kill () from /lib/libpthread.so.0
    #2 0x400f6339 in raise () from /lib/libpthread.so.0
    #3 0x4014cc11 in abort () from /lib/libc.so.6
    #4 0x4009f92e in Tcl_PanicVA (format=0x400d2d80 "Tcl_Release couldn't find reference for 0x%x", argList=0xbffff3b8) at ../../tcl/unix/../generic/tclPanic.c:106
    #5 0x4009f967 in Tcl_Panic (arg1=0x400d2d80 "Tcl_Release couldn't find reference for 0x%x") at ../../tcl/unix/../generic/tclPanic.c:134
    #6 0x400a897b in Tcl_Release (clientData=0x80533f8) at ../../tcl/unix/../generic/tclPreserve.c:255
    #7 0x400b2ff5 in AfterProc (clientData=0x80a95e0) at ../../tcl/unix/../generic/tclTimer.c:1054
    #8 0x400b2473 in TimerHandlerEventProc (evPtr=0x80a9660, flags=-3) at ../../tcl/unix/../generic/tclTimer.c:543
    #9 0x4009ce21 in Tcl_ServiceEvent (flags=-3) at ../../tcl/unix/../generic/tclNotify.c:618
    #10 0x4009d2a1 in Tcl_DoOneEvent (flags=-3) at ../../tcl/unix/../generic/tclNotify.c:921
    #11 0x4006a5ec in Tcl_VwaitObjCmd (clientData=0x0, interp=0x80533f8, objc=2, objv=0xbffff5b8) at ../../tcl/unix/../generic/tclEvent.c:990
    #12 0x4003b90c in TclEvalObjvInternal (interp=0x80533f8, objc=2, objv=0xbffff5b8, command=0x8052aea "\nvwait forever\n", length=15, flags=0) at ../../tcl/unix/../generic/tclBasic.c:3033
    #13 0x4003c5c0 in Tcl_EvalEx (interp=0x80533f8, script=0x80529f8 "\n\nset s [socket localhost smtp] ; # connect to smtp mail\npackage require Thread\n\nputs [info loaded]\nputs [pwd]\n\nset t [::thread::create]\n\n::thread::transfer $t $s\n::thread::send $t \"puts $s line; flus"..., numBytes=257, flags=0) at ../../tcl/unix/../generic/tclBasic.c:3631
    #14 0x40090044 in Tcl_FSEvalFile (interp=0x80533f8, pathPtr=0x80589e8) at ../../tcl/unix/../generic/tclIOUtil.c:1371
    #15 0x40097e83 in Tcl_Main (argc=1, argv=0xbffffaf8, appInitProc=0x80486d8 <Tcl_AppInit>) at ../../tcl/unix/../generic/tclMain.c:292
    #16 0x080486cc in main (argc=2, argv=0xbffffaf4) at ../../tcl/unix/../unix/tclAppInit.c:90
    #17 0x4013b17f in __libc_start_main () from /lib/libc.so.6

dgp added on 2002-08-20 06:11:51:
Logged In: YES user_id=80530
Just tried to reproduce this on Linux/Alpha using Tcl 8.4b2 and Thread 2.4. Right away I see there's a difficulty. What service do you have running on port 35000? |