Tk Source Code

View Ticket
Ticket UUID: f214b8ad5b9a92e9476734ade445abf8cdecea21
Title: shutdown woes
Type: Bug Version: 8.6.1
Submitter: dgp Created on: 2013-10-24 17:57:18
Subsystem: 44. Generic Fonts Assigned To: jan.nijtmans
Priority: 8 Severity: Important
Status: Closed Last Modified: 2013-11-11 08:42:55
Resolution: Fixed Closed By: jan.nijtmans
    Closed on: 2013-11-11 08:42:55
Description:
In a "big wish" program that embeds Tk, when built against
(thread-enabled) Tcl/Tk 8.6.1, I get a panic on exit.  No such
trouble with (thread-disabled) Tcl/Tk 8.5.15.  Here's the stack
trace:

alloc: invalid block: 0xdc1a10: 50 0

Program received signal SIGABRT, Aborted.
0x0000003dda430265 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x0000003dda430265 in raise () from /lib64/libc.so.6
#1  0x0000003dda431d10 in abort () from /lib64/libc.so.6
#2  0x000000000045eaf9 in CustomPanic(char*, ...) ()
#3  0x00002aaaaafc4ff7 in Tcl_PanicVA (
    format=0x2aaaab046a00 "alloc: invalid block: %p: %x %x",
    argList=0x7fffffffd070) at /home/dgp/fossil/tcl8.6.1/generic/tclPanic.c:99
#4  0x00002aaaaafc5167 in Tcl_Panic (
    format=0x2aaaab046a00 "alloc: invalid block: %p: %x %x")
    at /home/dgp/fossil/tcl8.6.1/generic/tclPanic.c:153
#5  0x00002aaaaafea3c9 in Ptr2Block (ptr=0xdc1a20 "")
    at /home/dgp/fossil/tcl8.6.1/generic/tclThreadAlloc.c:780
#6  0x00002aaaaafe99db in TclpFree (ptr=0xdc1a20 "")
    at /home/dgp/fossil/tcl8.6.1/generic/tclThreadAlloc.c:406
#7  0x00002aaaaaf88f59 in Tcl_DeleteHashEntry (entryPtr=0xdc1a20)
    at /home/dgp/fossil/tcl8.6.1/generic/tclHash.c:467
#8  0x00002aaaaab02711 in Tk_FreeFont (tkfont=0xe3af70)
    at /home/dgp/fossil/tk8.6.1/unix/../generic/tkFont.c:1439
#9  0x00002aaaaab027a2 in Tk_FreeFontFromObj (tkwin=0xffc4c0, objPtr=0x6fe840)
    at /home/dgp/fossil/tk8.6.1/unix/../generic/tkFont.c:1480
#10 0x00002aaaaaaf9ed8 in FreeResources (optionPtr=0x712190, objPtr=0x6fe840,
    internalPtr=0x0, tkwin=0xffc4c0)
    at /home/dgp/fossil/tk8.6.1/unix/../generic/tkConfig.c:1639
#11 0x00002aaaaaaf9d3b in Tk_FreeConfigOptions (
    recordPtr=0x1045690 "\300\304\377", optionTable=0x711f90, tkwin=0xffc4c0)
    at /home/dgp/fossil/tk8.6.1/unix/../generic/tkConfig.c:1568
#12 0x00002aaaaab43324 in DestroyMenuInstance (menuPtr=0x1045690)
    at /home/dgp/fossil/tk8.6.1/unix/../generic/tkMenu.c:1235
#13 0x00002aaaaab43464 in TkDestroyMenu (menuPtr=0x1045690)
    at /home/dgp/fossil/tk8.6.1/unix/../generic/tkMenu.c:1317
#14 0x00002aaaaab4a1fe in TkMenuEventProc (clientData=0x1045690,
    eventPtr=0x7fffffffd4a0)
    at /home/dgp/fossil/tk8.6.1/unix/../generic/tkMenuDraw.c:762
#15 0x00002aaaaaafe5a8 in Tk_HandleEvent (eventPtr=0x7fffffffd4a0)
    at /home/dgp/fossil/tk8.6.1/unix/../generic/tkEvent.c:1341
#16 0x00002aaaaab2a5a9 in Tk_DestroyWindow (tkwin=0xffc4c0)
    at /home/dgp/fossil/tk8.6.1/unix/../generic/tkWindow.c:1433
#17 0x00002aaaaab2a462 in Tk_DestroyWindow (tkwin=0x70c170)
    at /home/dgp/fossil/tk8.6.1/unix/../generic/tkWindow.c:1374
#18 0x00002aaaaab2c26d in DeleteWindowsExitProc (clientData=0x716a20)
    at /home/dgp/fossil/tk8.6.1/unix/../generic/tkWindow.c:2822
#19 0x00002aaaaaaff225 in TkFinalizeThread (clientData=0x0)
    at /home/dgp/fossil/tk8.6.1/unix/../generic/tkEvent.c:2104
#20 0x00002aaaaaf69d85 in Tcl_FinalizeThread ()
    at /home/dgp/fossil/tcl8.6.1/generic/tclEvent.c:1296
#21 0x00002aaaaaf69b77 in Tcl_Exit (status=0)
    at /home/dgp/fossil/tcl8.6.1/generic/tclEvent.c:986
#22 0x00002aaaaaebc049 in Tcl_ExitObjCmd (dummy=0x0, interp=0x6c0060, objc=2,
    objv=0xa113d0) at /home/dgp/fossil/tcl8.6.1/generic/tclCmdAH.c:833
User Comments: jan.nijtmans added on 2013-11-11 08:42:55:

Fixed in [9fc8df19b1]. Thinking more about it, backporting makes no sense: in Tk 8.5, option tables are not shared among interpreters, so the described problem doesn't occur at all. To demonstrate that, I added the test case (only) to Tk 8.5 too.


jan.nijtmans added on 2013-11-08 10:17:00:

I agree with this solution. My attempt at fixing this (see the [bug-f214b8ad5b] branch) only resulted in some clean-up and didn't really solve the problem; such a reform is probably too dangerous for a bug-fix release.

The only potential problem I see is possible shimmering when Tcl_Obj's are used in two interpreters. But since the function GetObjectForOption (in tkConfig.c) does a Tcl_NewStringObj(Tk_NameOfFont(tkfont), -1) for Fonts, I think it's OK.

So, I'm OK merging this to trunk, and also backporting it.

> Should a particular meaning of a font name have control
> over just one screen?  Or one interp?  Or one "application" ?

I'm not sure, only that "interp" is not the correct answer. My guess would be "application", as suggested by: [4a168c76f4?ln=18-20]


anonymous added on 2013-11-05 22:00:55:
Hi Donald,

your patch fixes my problem with tclkits on Darwin.

Good work.

Paul

dgp added on 2013-11-05 21:00:08:
Created branch bug-f214b8 with first draft patch
fixing the script-only demo.  Would appreciate more
eyes, testing and comments before merging to trunk
(and considering for backport).

dgp added on 2013-11-05 19:23:40:
So Tk_CreateOptionTable() actually only creates
one table per thread, so when multiple interps
in one thread use Tk, they share that table.
In particular they share the default values
of the options in a Tcl_IsShared Tcl_Obj sense.
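
The sharing described above can be sketched with a toy registry. This is a minimal model under stated assumptions, not Tk's implementation: the types OptionTable and ThreadData and the functions create_option_table() and tables_are_shared() are hypothetical names invented for illustration. The only behavior taken from the comment is that the table is looked up per thread, so the requesting interp plays no part in the lookup.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TABLES 8

/* Hypothetical stand-ins for illustration; not Tk's real types. */
typedef struct OptionTable {
    const void *templ;   /* key: address of the option template */
    int refCount;        /* number of callers sharing this table */
} OptionTable;

typedef struct ThreadData {
    OptionTable tables[MAX_TABLES];
    int used;
} ThreadData;

/*
 * Return the table for templ from the per-thread registry, creating it on
 * the first request. interpId is deliberately ignored: that is the point
 * of the model, since the lookup is per thread rather than per interp.
 */
OptionTable *create_option_table(ThreadData *tsd, int interpId,
                                 const void *templ) {
    (void)interpId;
    assert(tsd != NULL && templ != NULL);
    for (int i = 0; i < tsd->used; i++) {
        if (tsd->tables[i].templ == templ) {
            tsd->tables[i].refCount++;
            return &tsd->tables[i];
        }
    }
    assert(tsd->used < MAX_TABLES);
    OptionTable *table = &tsd->tables[tsd->used++];
    table->templ = templ;
    table->refCount = 1;
    return table;
}

/* Returns 1 if two "interps" in one thread receive the same table. */
int tables_are_shared(void) {
    static const int menuTemplate = 0;   /* stands in for a widget's specs */
    ThreadData tsd = {0};
    OptionTable *first = create_option_table(&tsd, 1, &menuTemplate);
    OptionTable *second = create_option_table(&tsd, 2, &menuTemplate);
    return first == second && first->refCount == 2;
}
```

In this model, any default value hanging off the table is necessarily seen by both interps, which is the situation the comment describes.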

dgp added on 2013-11-05 18:23:26:
At least a contributing factor to the problem
appears to be confusion over the proper scope
of font names as a Tcl_ObjType.  Should a particular
meaning of a font name have control over just one
screen?  Or one interp?  Or one "application" ?

It appears that an intrep of a "font" value is
getting used in one Tk "application" when it was
set in (and stored in the fontCache of) another
Tk "application".  Later on attempting to tear down
twice what was only constructed once causes the abort.

Some advice from the designer(s) of the font system
would be useful.

dgp added on 2013-11-05 14:18:16:
Here's a script-only demo of the problem:

$ cat demo.tcl
interp create slave
load {} Tk slave
slave eval menu .menubar
menu .menubar
exit

$ wish demo.tcl
alloc: invalid block: 0x198f8720: a0 19
Abort

anonymous added on 2013-11-04 21:28:10:
I noticed a similar problem, also dealing with freeing fonts on exit:

I compile tclkits (with kitgen) from Tcl/Tk 8.6.0 and generate starpacks for my poApps application (adding several Tcl-only packages, as well as TkImg and my own C-based image library). This works fine on Darwin, Windows and Linux.

Using Tcl/Tk version 8.6.1, everything works fine, except on Darwin. Using the starpack from the command line works, but if the starpack is inside a *.app folder, the application crashes on exit.

Crash dump attached.

Paul

dgp added on 2013-11-04 20:25:34:
Some more info.

The program having trouble has multiple interps
in one thread, each with Tk in it.

When [exit] is evaluated, we get to 
DeleteWindowsExitProc() and the loop

    while (tsdPtr->mainWindowList != NULL) {...}

gets two passes through it.

The first pass calls TkFontPkgFree() which tears down
the whole hash table.

The second pass makes the Tk_FreeConfigOptions() call
that tries to delete hash entries that have already been
deleted when the table was destroyed leading to the crash.

Still looking for the simple demo, or simple solution,
but that may be enough info to be helpful toward making
more progress.
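
The two passes can be modeled in a few lines of C. This is a sketch under stated assumptions rather than Tk's code: FontCache, App, font_release(), cache_destroy() and simulate_exit() are hypothetical names, and the model counts stale releases with a flag instead of actually touching freed memory. It assumes each pass first releases the window's font and then tears down that application's own cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not Tk's real structures. */
typedef struct FontCache {
    bool alive;    /* false once the whole table has been torn down */
    int entries;   /* number of live entries */
} FontCache;

typedef struct App {
    FontCache own;        /* this application's own font cache */
    FontCache *fontHome;  /* cache where the widget's font was registered */
    bool holdsFont;       /* widget still holds a font reference */
} App;

/* Stands in for Tk_FreeFont(): false means the table is already gone. */
bool font_release(App *app) {
    assert(app != NULL && app->fontHome != NULL);
    if (!app->holdsFont) {
        return true;
    }
    app->holdsFont = false;
    if (!app->fontHome->alive) {
        return false;   /* real code would delete an already-freed entry */
    }
    app->fontHome->entries--;
    return true;
}

/* Stands in for TkFontPkgFree(): tears down the whole table. */
void cache_destroy(FontCache *cache) {
    cache->alive = false;
    cache->entries = 0;
}

/*
 * Models the two passes of the exit loop. If strayFont is true, the second
 * application's widget holds a font registered in the first application's
 * cache. Returns the number of stale releases.
 */
int simulate_exit(bool strayFont) {
    App apps[2];
    for (int i = 0; i < 2; i++) {
        apps[i].own.alive = true;
        apps[i].own.entries = 0;
        apps[i].holdsFont = true;
    }
    apps[0].fontHome = &apps[0].own;
    apps[1].fontHome = strayFont ? &apps[0].own : &apps[1].own;
    apps[0].fontHome->entries++;
    apps[1].fontHome->entries++;

    int stale = 0;
    for (int i = 0; i < 2; i++) {
        if (!font_release(&apps[i])) {
            stale++;
        }
        cache_destroy(&apps[i].own);
    }
    return stale;
}
```

With strayFont set, the second pass releases a font whose table the first pass already destroyed, so simulate_exit(true) reports one stale release; with each font registered in its own application's cache, simulate_exit(false) reports none.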

jan.nijtmans added on 2013-10-31 10:04:36:

> Bisecting confirms that this trouble started with commit

Yes, that's exactly what I thought. I don't think that [aaf11bdce2] is the cause of the bug, it only exposes a bug that must have been there already.

What other extensions does this "big wish" have? Is there something in it that uses fonts? Because this crash happens when cleaning up fonts, that's where I'm searching. So far I have not been able to reproduce this with a statically built wish.


dgp added on 2013-10-30 19:58:51:
Bisecting confirms that this trouble started
with commit

http://core.tcl.tk/tk/info/aaf11bdce2

jan.nijtmans added on 2013-10-28 14:49:06:

Since there is a Tk_FreeConfigOptions in the stacktrace, I bet this is related to [069c9e43c4], so it is probably a refcount problem.

I'll have a look.


Attachments: