Tcl Source Code

View Ticket
Login
Ticket UUID: 3037525
Title: crash when freeing locals hashtable
Type: Bug Version: obsolete: 8.5.8
Submitter: msofer Created on: 2010-07-31 13:39:22
Subsystem: 07. Variables Assigned To: msofer
Priority: 9 Immediate Severity:
Status: Closed Last Modified: 2010-09-30 01:59:25
Resolution: Fixed Closed By: sf-robot
    Closed on: 2010-09-29 18:59:25
Description:
The one-liner
  proc foo {} { [catch {upvar 0 dummy \$index}] } ; foo
accesses freed memory and causes a crash (in non-threaded tcl, threaded seems to survive due to allocator differences).

Valgrind shows that a freed hash entry is being accessed, in order to free it again. The problem seems to depend on the order in which TclDeleteVars traverses the hashtable of locals.

Note that this likely impacts 8.6 too (unverified as of yet, time is so short and this netbook so slow).
User Comments: sf-robot added on 2010-09-30 01:59:25:

allow_comments - 1

This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 14 days (the time period specified by
the administrator of this Tracker).

hobbs added on 2010-08-04 02:14:23:
var.test suite updated in 8.5 and head.

hobbs added on 2010-08-03 00:03:48:
BTW, to make the test easier, you don't need the [] around catch - it will still crash (at least for me).  Don't we want that in the test suite?

msofer added on 2010-08-02 18:11:16:
Ouch, bad port of the patch - my bad, should get more sleep. Thx for spotting this! Should be fixed in HEAD now.

cyan added on 2010-08-02 17:58:46:
Testing the fix on HEAD of 8.6, I get an abort "maformed bucket chain in Tcl_DeleteHashEntry" at tclHash.c:434.  Strangely, the build from the 24th of June is fine (and valgrind doesn't report any problems).

Tested configuration: Tcl HEAD from 2010-07-31 22:28:02, Linux 64 bit, glibc 2.11, threaded build, enabled symbols.  Backtrace:

#0  0x00007ffff71d1a75 in *__GI_raise (sig=<value optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007ffff71d55c0 in *__GI_abort () at abort.c:92
#2  0x000000000043c74b in Tcl_PanicVA (format=0x578ca8 "malformed bucket chain in Tcl_DeleteHashEntry", argList=0x7fffffffd710)
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclPanic.c:93
#3  0x000000000043c810 in Tcl_Panic (format=0x578ca8 "malformed bucket chain in Tcl_DeleteHashEntry")
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclPanic.c:122
#4  0x0000000000546452 in Tcl_DeleteHashEntry (entryPtr=0x847bb8)
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclHash.c:434
#5  0x000000000046b218 in TclDeleteVars (iPtr=0x7bc550, tablePtr=0x847b20)
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclVar.c:5335
#6  0x0000000000431b6c in Tcl_PopCallFrame (interp=0x7bc550)
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclNamesp.c:369
#7  0x000000000044824f in InterpProcNR2 (data=0x7bc508, interp=0x7bc550, result=1) at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclProc.c:1911
#8  0x00000000004a52a2 in TclNRRunCallbacks (interp=0x7bc550, result=1,  rootPtr=0x82cdd0, tebcCall=1)
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclBasic.c:4324
#9  0x000000000053b8c7 in TclExecuteByteCode (interp=0x7bc550,     codePtr=0x814250)
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclExecute.c:6596
#10 0x00000000004a55ab in NRCallTEBC (data=0x7ba498, interp=0x7bc550, result=0)
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclBasic.c:4400
#11 0x00000000004a52a2 in TclNRRunCallbacks (interp=0x7bc550, result=0,  rootPtr=0x0, tebcCall=0)
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclBasic.c:4324
#12 0x00000000004a7da7 in TclEvalObjEx (interp=0x7bc550, objPtr=0x0, flags=131072, invoker=0x0, word=0)
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclBasic.c:5941
#13 0x00000000004a7d32 in Tcl_EvalObjEx (interp=0x7bc550, objPtr=0x0, flags=131072)
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclBasic.c:5922
#14 0x0000000000547047 in Tcl_RecordAndEvalObj (interp=0x7bc550, cmdPtr=0x7ba4f0, flags=131072)
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclHistory.c:192
#15 0x0000000000430e95 in Tcl_Main (argc=-1, argv=0x7fffffffe4e0,  appInitProc=0x411882 <Tcl_AppInit>)
    at /home/cyan/git/cfkit/pkgsrc/tcl/generic/tclMain.c:471
#16 0x000000000041187b in main (argc=1, argv=0x7fffffffe4d8)
    at /home/cyan/git/cfkit/pkgsrc/tcl/unix/tclAppInit.c:85

Tcl script:

proc foo {} {
    [catch {upvar 0 dummy \$index}]
}

foo


Valgrind doesn't complain about any errors

ferrieux added on 2010-08-02 16:36:23:

allow_comments - 0

msofer added on 2010-08-01 03:28:34:

allow_comments - 1

Patch committed to core-8-5-branch and (adapted) HEAD

msofer added on 2010-08-01 00:03:50:

File Added - 381851: var.patch

msofer added on 2010-08-01 00:03:19:
The attached patch implements (1) and solves the issue. Not committed pending full valgrind check.

msofer added on 2010-07-31 23:42:02:
Two possible solutions:

(1) lose the optimisation, delete the vars from the table one by one (just like in TclDeleteNamespaceVars; can also merge big parts of these two funcs)
(2) somehow signal to UnsetVarStruct or CleanupVar that the var should NOT be deleted.

(2) seems hacky to me, and the value of the opt is iffy anyway (in my prejudice): it is low impact, and only concerns procs with runtime-created locals

msofer added on 2010-07-31 21:25:33:
Supporting evidence for the diagnose:

% proc foo {} { [catch {upvar 0 dummy \$index; set dummy 1}]}; foo
invalid command name "0"
% proc foo {} { [catch {upvar 0 dummy \$index; set x 1}]}; foo
Segmentation fault

msofer added on 2010-07-31 20:58:09:
Description of the early-freeing scenario:
  * two locals a and b defined at runtime - ie, in the hashtable and not the locals array
  * upvar 0 a b
  * the ONLY reference to a is by the upvar, and it remains undefined
  * the hashtable traversal reaches b before a

In this case, when b is freed a is eliminated from the hashtable and is freed too: there are only two refcounts to this var, one from the hash entry and one from the upvar. When the upvar goes away, the undef'ed var is seen as being kept alive only by the entry, and the lot is freed to eliminate the "circular ref".

Later, when Tcl_NextHashEntry tries to read a's entry we access freed memory.

msofer added on 2010-07-31 20:40:26:
OOPS ... forgot to credit tclguy and aku for detecting the problem, and reducing it to this oneliner.

Attachments: