Tcl Source Code

Ticket UUID: 3001438
Title: [info frame -1] sigsegv under strange circumstances
Type: Bug
Version: obsolete: 8.6b1.1
Submitter: coldstore
Created on: 2010-05-14 03:52:25
Subsystem: 35. TclOO Package
Assigned To: dkf
Priority: 9 Immediate
Severity:
Status: Closed
Last Modified: 2011-01-18 23:12:18
Resolution: Fixed
Closed By: dkf
Closed on: 2011-01-18 16:12:18
Description:
Linux, HEAD as at 2pm 14Mar10, get a sigsegv:

0x001741e3 in TclInfoFrame (interp=0x8055fc8, framePtr=0x805e148) at /home/colin/Desktop/packages/tcl/generic/tclCmdIL.c:1374

Of note:

(gdb) p *namePtr
$15 = {nextPtr = 0x805e280, tablePtr = 0x0, hash = 0x805e310, clientData = 0x24be10, key = {oneWordValue = 0x80ff6f8 "\002", objPtr = 0x80ff6f8, words = {135263992}, string = "\370\366\017\b"}}
(gdb) p namePtr->tablePtr
$16 = (Tcl_HashTable *) 0x0

I haven't a short way to reproduce this error, but it seems it may be related to namePtr->tablePtr being NULL.

Colin.
User Comments:

dkf added on 2011-01-18 23:12:18:

Backported

dkf added on 2011-01-18 20:53:09:
BTW, [info frame] is a sewer inhabited by alligators.

dkf added on 2011-01-18 20:51:09:
Fixed on HEAD. Will need to see if it should be backported before closing this...

dkf added on 2011-01-18 20:41:08:
Here's a shorter triggering script:


oo::class create test {
    method test {{x 1}} {
        if {$x} {my test 0}
        lsort {q w e r t y u i o p}
        info frame 0
    }
}
puts [[test new] test]

msofer added on 2011-01-14 05:39:28:

File Deleted - 398704:

msofer added on 2011-01-14 05:38:53:

File Added - 398706: CRASH5

msofer added on 2011-01-14 05:37:26:

File Added - 398704: CRASH5

msofer added on 2011-01-14 05:36:32:
The attached CRASH5 script may provide a clue to dkf? It looks as if the nested call to [my fixup] is causing some corruption in the outer call's CallFrame (or is it CmdFrame?)

msofer added on 2011-01-06 18:49:08:
dkf writes "The issue seems to be that there is confusion about what the evaluation stack has as top and bottom so it calculates the space required as negative. It then allocates space that happens to overlap with something else (i.e., there is a 4 byte overlap between the returned block and the temporary Command allocated for the execution of the method)."

This suggests a missing update to esPtr->tosPtr somewhere, maybe a missing DECACHE_STACK_INFO in TEBC. Will look for that.
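
To make the suspected failure mode concrete, here is a toy model of a bump-allocated stack whose top is cached in a local variable and must be written back before anyone else allocates. It is only a sketch of the general hazard: `ToyStack`, `toy_alloc` and `block_overlaps_pushes` are hypothetical names, and nothing here is Tcl's actual `ExecStack` or `DECACHE_STACK_INFO` code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_WORDS 64

/* Toy bump-allocated stack; a stand-in, NOT Tcl's ExecStack. */
typedef struct {
    int words[TOY_WORDS];
    size_t tos;               /* published index of the next free word */
} ToyStack;

/* Reserve n words at the published top of stack; return the first index. */
static size_t toy_alloc(ToyStack *s, size_t n) {
    assert(s->tos + n <= TOY_WORDS);
    size_t start = s->tos;
    s->tos += n;
    return start;
}

/*
 * Push `pushes` words through a locally cached top of stack, then let
 * someone else allocate a 4-word block. With write_back == 0 the cached
 * top is never published, which models the suspected missing update.
 * Returns 1 if the new block overlaps the words that were pushed.
 */
static int block_overlaps_pushes(size_t pushes, int write_back) {
    ToyStack s = { {0}, 0 };
    size_t cached = s.tos;        /* local copy, as an inner loop would keep */
    size_t first_push = cached;

    for (size_t i = 0; i < pushes; i++) {
        s.words[cached++] = 1;    /* the push is visible only via `cached` */
    }
    if (write_back) {
        s.tos = cached;           /* publish before calling out */
    }
    size_t block = toy_alloc(&s, 4);
    return block < first_push + pushes;
}
```

In this model `block_overlaps_pushes(3, 0)` reports an overlap and `block_overlaps_pushes(3, 1)` does not, which is the shape of bug the comment above goes looking for.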

msofer added on 2010-08-21 22:34:57:
Traced back the problem to: InfoFrame somehow has a bad procPtr->cmdPtr. The stack trace is

#0  __strlen_sse2 () at ../sysdeps/i386/i686/multiarch/strlen.S:87
#1  0x0808e7ab in Tcl_AppendLimitedToObj (objPtr=0x8bf28b8, 
    bytes=0x31207765 <Address 0x31207765 out of bounds>, length=-1, limit=2147483647, ellipsis=0x0)
    at /home/CVS/tcl/generic/tclStringObj.c:1078
#2  0x0808e928 in Tcl_AppendToObj (objPtr=0x8bf28b8, bytes=0x31207765 <Address 0x31207765 out of bounds>, 
    length=-1) at /home/CVS/tcl/generic/tclStringObj.c:1146
#3  0x080db15d in Tcl_GetCommandFullName (interp=0x8bd4a28, command=0x8bd583c, objPtr=0x8bf28b8)
    at /home/CVS/tcl/generic/tclBasic.c:2833
#4  0x080f3700 in TclInfoFrame (interp=0x8bd4a28, framePtr=0x8bd569c) at /home/CVS/tcl/generic/tclCmdIL.c:1381


Looking at the core file with gdb we see that procPtr->cmdPtr has some rotten entries (hPtr, nsPtr, ...). The proc being inspected is method fixup, as can be seen in procPtr->bodyPtr->bytes 


(gdb) f 4
#4  0x080f3700 in TclInfoFrame (interp=0x8bd4a28, framePtr=0x8bd569c) at /home/CVS/tcl/generic/tclCmdIL.c:1381
1381            Tcl_GetCommandFullName(interp, (Tcl_Command) procPtr->cmdPtr,
(gdb) p *procPtr
$10 = {iPtr = 0x8bd4a28, refCount = 2, cmdPtr = 0x8bd583c, bodyPtr = 0x8c03578, numArgs = 1, 
  numCompiledLocals = 1, firstLocalPtr = 0x8c03620, lastLocalPtr = 0x8c03620}
(gdb) p *procPtr->bodyPtr
$11 = {refCount = 1, bytes = 0x8c03598 "\n\tif {$tuple == 0} {\n\t    my new 1\n\t}\n    ", length = 42, 
  typePtr = 0x81af4a8, internalRep = {longValue = 146746440, doubleValue = 6.4384897475493207e-314, 
    otherValuePtr = 0x8bf2c48, wideValue = 13031648328, twoPtrValue = {ptr1 = 0x8bf2c48, ptr2 = 0x3}, 
    ptrAndLongRep = {ptr = 0x8bf2c48, value = 3}}}
(gdb) p *procPtr->cmdPtr
$12 = {hPtr = 0x8bf2cc1, nsPtr = 0x8c035b2, refCount = 8, cmdEpoch = 1, compileProc = 0x8bf0e60, objProc = 0, 
  objClientData = 0x0, proc = 0, clientData = 0x8bd5878, deleteProc = 0, deleteData = 0x0, flags = 0, 
  importRefPtr = 0x0, tracePtr = 0x0, nreProc = 0}
(gdb) p *procPtr->cmdPtr->hPtr
$13 = {nextPtr = 0x4220306, tablePtr = 0x98000401, hash = 0xb808c030, clientData = 0xa008bd51, key = {
    oneWordValue = 0x8808c02f <Address 0x8808c02f out of bounds>, objPtr = 0x8808c02f, words = {-2012692433}, 
    string = "/\300\b\210"}}
(gdb) p *procPtr->cmdPtr->nsPtr
$14 = {name = 0x6e20796d <Address 0x6e20796d out of bounds>, 
  fullName = 0x31207765 <Address 0x31207765 out of bounds>, clientData = 0xa7d090a, deleteProc = 0x20202020, 
  parentPtr = 0x290800, childTable = {buckets = 0x4a280000, staticBuckets = {0x208bd, 0x583c0000, 0x357808bd, 
      0x108c0}, numBuckets = 65536, numEntries = 908066816, rebuildSize = 908069056, downShift = 2240, 
    mask = 1638400, keyType = 0, findProc = 0x1bb00000, createProc = 0x8bac08c0, typePtr = 0x3648000b}, 
  nsId = 1055393984, interp = 0x1908bf, flags = 0, activationCount = 464519168, refCount = 651299008, 
  cmdTable = {buckets = 0x43500000, staticBuckets = {0x3ff808bf, 0x2908bf, 0x0, 0x50000}, numBuckets = 0, 
    numEntries = 16777216, rebuildSize = 0, downShift = 0, mask = 1970536448, keyType = 6646896, 
    findProc = 0x120ffff, createProc = 0x210000, typePtr = 0xdf000000}, varTable = {table = {
      buckets = 0x2081a, staticBuckets = {0x35480000, 0x3ee808c0, 0x8bf, 0x1b680000}, numBuckets = 67776, 
      numEntries = 2162688, rebuildSize = -1825570816, downShift = 1437075467, mask = 811600061, 
      keyType = 2240, findProc = 0, createProc = 0x35280000, typePtr = 0x369808c0}, nsPtr = 0x2108c0}, 
  exportArrayPtr = 0xfd80000, numExportPatterns = 2239, maxExportPatterns = -65536, cmdRefEpoch = 65535, 
  resolverEpoch = 0, cmdResProc = 0x10000, varResProc = 0x3d700000, compiledVarResProc = 0x3108c0, 
  exportLookupEpoch = 0, ensembles = 0x42c00000, unknownHandlerPtr = 0x8bf, commandPathLength = 131072, 
  commandPathArray = 0x0, commandPathSourceList = 0x0, earlyDeleteProc = 0}

msofer added on 2010-06-04 01:30:21:
Simplifying a bit: we do not need [new] to be on the stack twice:

oo::class create Tuple {
    method fixup {tuple} {
        if {$tuple == 0} {
            my new 1
        }
    }

    method new {{arg 0}} {
        my fixup $arg
        puts stderr [info frame -1]
    }
}
::Tuple
% set x [Tuple new]
::oo::Obj4
% $x new 1
type eval line -1 cmd {$x new 1} level 1
% $x fixup 0
Segmentation fault (core dumped)

dkf added on 2010-06-03 03:35:29:
That was with a breakpoint on TclInfoFrame and a (conditional on 'move') breakpoint on GrowEvaluationStack. There were a number of things set to display too. I've elided some stretches (marked with [...]) where not much is going on.

The issue seems to be that there is confusion about what the evaluation stack has as top and bottom so it calculates the space required as negative. It then allocates space that happens to overlap with something else (i.e., there is a 4 byte overlap between the returned block and the temporary Command allocated for the execution of the method). That then gets overwritten a little bit down, corrupting the hPtr field (the first in the structure) and causing the crash later. Which explains why my original attitude was one of "but this can't happen".
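
A small sketch of why a 4 byte overlap is enough to do this damage. The addresses and the 60-byte size used in the usage note are made up for illustration, and `ToyCommand` is a hypothetical stand-in for the real Command structure; the only properties relied on are that hPtr is the first field and that the scratch block is 48 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in; only the position of hPtr matters here. */
typedef struct {
    void *hPtr;      /* first member, so it sits at offset 0 */
    void *nsPtr;
    int refCount;
} ToyCommand;

/* Number of bytes shared by [a, a + a_len) and [b, b + b_len). */
static size_t overlap_bytes(uintptr_t a, size_t a_len,
                            uintptr_t b, size_t b_len) {
    uintptr_t lo = a > b ? a : b;
    uintptr_t a_end = a + a_len;
    uintptr_t b_end = b + b_len;
    uintptr_t hi = a_end < b_end ? a_end : b_end;
    return hi > lo ? (size_t)(hi - lo) : 0;
}
```

For example, a 48-byte block at the made-up address 0x1000 and a structure starting at 0x102c share `overlap_bytes(0x1000, 48, 0x102c, 60)` = 4 bytes. Since `offsetof(ToyCommand, hPtr)` is 0, those are exactly the bytes of a 4-byte hPtr on a 32-bit build, so writing to the tail of the block rewrites the pointer.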

I don't know that I've done good probes for finding out what's going wrong in the memory handling; don't know that code at all.

dkf added on 2010-06-03 03:23:42:
(gdb) run
Starting program: /Users/dkf/Documents/software/tcl8.6-commits/unix/tclsh 
% oo::class create Tuple {
method fixup {tuple} {
if {$tuple == 0} {
my new 1
}
}

method new {{arg 0}} {
my fixup $arg
info frame -1
}
}

Breakpoint 4, 0x0a08cbff in GrowEvaluationStack (eePtr=0x104000, growth=2, move=1) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c:1000
1000    {
7: *eePtr->execStackPtr = {
  prevPtr = 0x0, 
  nextPtr = 0x0, 
  markerPtr = 0x808288, 
  endPtr = 0x809f70, 
  tosPtr = 0x8082e4, 
  stackWords = {0x0}
}
(gdb) cont
Continuing.

Breakpoint 4, 0x0a08cbff in GrowEvaluationStack (eePtr=0x104000, growth=2, move=1) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c:1000
1000    {
7: *eePtr->execStackPtr = {
  prevPtr = 0x0, 
  nextPtr = 0x0, 
  markerPtr = 0x8082a8, 
  endPtr = 0x809f70, 
  tosPtr = 0x808300, 
  stackWords = {0x0}
}
(gdb) 
Continuing.
::Tuple
% [Tuple new] new

Breakpoint 4, 0x0a08cbff in GrowEvaluationStack (eePtr=0x104000, growth=2, move=1) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c:1000
1000    {
7: *eePtr->execStackPtr = {
  prevPtr = 0x0, 
  nextPtr = 0x0, 
  markerPtr = 0x808088, 
  endPtr = 0x809f70, 
  tosPtr = 0x8080e4, 
  stackWords = {0x0}
}
(gdb) cont
Continuing.

Breakpoint 4, 0x0a08cbff in GrowEvaluationStack (eePtr=0x104000, growth=2, move=1) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c:1000
1000    {
7: *eePtr->execStackPtr = {
  prevPtr = 0x0, 
  nextPtr = 0x0, 
  markerPtr = 0x8080e8, 
  endPtr = 0x809f70, 
  tosPtr = 0x808140, 
  stackWords = {0x0}
}
(gdb) 
Continuing.

Breakpoint 1, TclInfoFrame (interp=0x807620, framePtr=0x8082dc) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclCmdIL.c:1235
1235    Interp *iPtr = (Interp *) interp;
6: framePtr->framePtr->procPtr->cmdPtr = (struct Command *) 0x80848c
5: framePtr->framePtr->procPtr = (Proc *) 0x18be80
4: *framePtr->framePtr->procPtr->cmdPtr = {
  hPtr = 0x0, 
  nsPtr = 0x176d60, 
  refCount = 0, 
  cmdEpoch = 0, 
  compileProc = 0, 
  objProc = 0, 
  objClientData = 0x0, 
  proc = 0, 
  clientData = 0x8084c8, 
  deleteProc = 0, 
  deleteData = 0x0, 
  flags = 0, 
  importRefPtr = 0x0, 
  tracePtr = 0x0, 
  nreProc = 0
}
(gdb) n  
[...]
(gdb) n
1304    CmdFrame *fPtr = TclStackAlloc(interp, sizeof(CmdFrame));
6: framePtr->framePtr->procPtr->cmdPtr = (struct Command *) 0x80848c
5: framePtr->framePtr->procPtr = (Proc *) 0x18be80
4: *framePtr->framePtr->procPtr->cmdPtr = {
  hPtr = 0x0, 
  nsPtr = 0x176d60, 
  refCount = 0, 
  cmdEpoch = 0, 
  compileProc = 0, 
  objProc = 0, 
  objClientData = 0x0, 
  proc = 0, 
  clientData = 0x8084c8, 
  deleteProc = 0, 
  deleteData = 0x0, 
  flags = 0, 
  importRefPtr = 0x0, 
  tracePtr = 0x0, 
  nreProc = 0
}
(gdb) s
[...]
(gdb) s
GrowEvaluationStack (eePtr=0x104000, growth=12, move=0) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c:1001
1001    ExecStack *esPtr = eePtr->execStackPtr, *oldPtr = NULL;
7: *eePtr->execStackPtr = {
  prevPtr = 0x0, 
  nextPtr = 0x0, 
  markerPtr = 0x8083f8, 
  endPtr = 0x809f70, 
  tosPtr = 0x808454, 
  stackWords = {0x0}
}
(gdb) n
1003    int needed = growth - (esPtr->endPtr - esPtr->tosPtr);
7: *eePtr->execStackPtr = {
  prevPtr = 0x0, 
  nextPtr = 0x0, 
  markerPtr = 0x8083f8, 
  endPtr = 0x809f70, 
  tosPtr = 0x808454, 
  stackWords = {0x0}
}
(gdb) 
1004    Tcl_Obj **markerPtr = esPtr->markerPtr, **memStart;
7: *eePtr->execStackPtr = {
  prevPtr = 0x0, 
  nextPtr = 0x0, 
  markerPtr = 0x8083f8, 
  endPtr = 0x809f70, 
  tosPtr = 0x808454, 
  stackWords = {0x0}
}
(gdb) print needed
$18 = -1723
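
The printed value can be reproduced from the fields displayed above. This is a worked check only, assuming 4-byte stack words (a 32-bit build, which matches the addresses in the session); `needed_words` is a hypothetical helper that redoes the subtraction from line 1003 on raw addresses, not a Tcl function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Recompute `needed = growth - (endPtr - tosPtr)` from raw addresses.
 * The pointer difference in the original is counted in stack words, so
 * the byte distance is divided by word_size (assumed to be 4 here).
 */
static long needed_words(long growth, uintptr_t end_addr,
                         uintptr_t tos_addr, unsigned word_size) {
    assert(word_size > 0 && end_addr >= tos_addr);
    long free_words = (long)((end_addr - tos_addr) / word_size);
    return growth - free_words;
}
```

With growth = 12, endPtr = 0x809f70 and tosPtr = 0x808454, the distance is 0x1b1c = 6940 bytes, or 1735 words, so `needed_words(12, 0x809f70, 0x808454, 4)` is 12 - 1735 = -1723, matching `$18`. As the expression is written, a negative result just means more words are free than were requested.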

msofer added on 2010-06-02 23:07:47:
Renaming stuff around to make sure we know what's what: still crashing

oo::class create Tuple {
    method fixup {tuple} {
        if {$tuple == 0} {
            my foo 1
        }
    }

    method foo {{arg 0}} {
        my fixup $arg
        info frame -1
    }
}

set x [Tuple new]
$x foo

dkf added on 2010-06-02 22:39:50:
Verified that HEAD is still crashing (with --enable-symbols=all) with both Colin's and my "minimal" scripts, and that my workaround fix (use a local buffer instead of something returned by TclStackAlloc in TclInfoFrame) makes the crash stop.

coldstore added on 2010-06-02 21:30:59:
Here's a more minimal script that causes the crash:

oo::class create Tuple {
    method fixup {tuple} {
        if {$tuple == 0} {
            my new 1
        }
    }

    method new {{arg 0}} {
        my fixup $arg
        info frame -1
    }
}

[Tuple new] new

msofer added on 2010-06-01 23:58:52:
Ouch, this involves both oo and [info frame] :(

Working through the core dumped by dkf's small crashing script:

#0  0x400e604b in strlen () from /lib/tls/i686/cmov/libc.so.6
#1  0x0808e50d in Tcl_AppendLimitedToObj (objPtr=0x87ed050, 
    bytes=0x74207765 <Address 0x74207765 out of bounds>, length=-1, limit=2147483647, ellipsis=0x0)
    at /home/CVS/tcl/generic/tclStringObj.c:1078
#2  0x0808e68a in Tcl_AppendToObj (objPtr=0x87ed050, 
    bytes=0x74207765 <Address 0x74207765 out of bounds>, length=-1)
    at /home/CVS/tcl/generic/tclStringObj.c:1146
#3  0x080dbf58 in Tcl_GetCommandFullName (interp=0x87903b0, command=0x8797994, objPtr=0x87ed050)
    at /home/CVS/tcl/generic/tclBasic.c:2816
#4  0x080f4c2e in TclInfoFrame (interp=0x87903b0, framePtr=0x87977f4)
    at /home/CVS/tcl/generic/tclCmdIL.c:1381
#5  0x080f41a1 in InfoFrameCmd (dummy=0x0, interp=0x87903b0, objc=2, objv=0x87f0fd0)
    at /home/CVS/tcl/generic/tclCmdIL.c:1210

At frame 4, I see:
  *framePtr looks fine
  *framePtr->framePtr looks fine
  *framePtr->framePtr->procPtr looks fine (body seems ok too)
  *framePtr->framePtr->procPtr->cmdPtr looks NOT OK

In particular: hPtr, nsPtr contain bad data, no proc/objProc/nreProc at all, ...

Now: both *framePtr and *framePtr->framePtr are TclStackAlloc'ed ... but *procPtr is not.

dkf: can you please check that *framePtr->framePtr->procPtr looks ok? Details require some knowledge about oo inner workings, I guess.

dkf added on 2010-05-20 21:52:41:
Assigning to Miguel because this looks like some kind of weird problem in TclStackAlloc.

dkf added on 2010-05-19 21:59:35:
Bug appears to be caused by the TclStackAlloc-ated space for the method frame somehow overlapping with the TclStackAlloc-ated space for the CmdFrame that is allocated inside of TclInfoFrame. No idea how that comes to pass!

If I change TclInfoFrame to use a local variable (should be OK for 48 bytes that don't persist) then everything works.

The following code is the test case (and isn't far off minimal):

  oo::class create Tuple {
    method fixup {x} {
      if {$x eq "conversion"} {my New type}
    }
    method New {x} {
      my fixup $x
      puts >[info frame -1]
    }
    constructor {args} {my New conversion}
  }
  set ts [Tuple new]

coldstore added on 2010-05-19 20:48:36:

File Added - 374454: test.tcl

coldstore added on 2010-05-19 20:48:08:

File Deleted - 374448:

coldstore added on 2010-05-19 20:30:48:
I have attached a 50 line program which reliably reproduces the crash here.

coldstore added on 2010-05-19 20:30:13:

File Added - 374448: test.tcl

dkf added on 2010-05-17 16:48:29:
I've changed the code to now use Tcl_GetCommandFullName. This *should* only move the location of the crash into that function. The problem – a hash entry with a null pointer to its table – is a Can't Happen case so far as I can see. :-(

ferrieux added on 2010-05-17 05:08:35:
OK, but [info frame] is not really my backyard :}
Reassigning to dkf based on ChangeLog comments.
(Might also be aku, that's TIP 280 stuff -- feel free to reassign again).

dkf added on 2010-05-14 15:32:55:
I suspect that it is trying to inspect a method or lambda. Neither of those have a conventional command table reference, though they should both have NULL procPtr->cmdPtr or procPtr->cmdPtr->hPtr fields. Hope this helps tracking this down.
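
For illustration, the kind of guard this implies might look like the sketch below. The struct types are hypothetical miniatures that keep only the pointers named in this ticket (procPtr->cmdPtr, cmdPtr->hPtr, and the hash entry's tablePtr), and `has_command_table_entry` is an invented name. Whether the real fix takes this form is not stated here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniatures of the structures discussed in this ticket. */
typedef struct ToyHashEntry { void *tablePtr; } ToyHashEntry;
typedef struct ToyCommand   { ToyHashEntry *hPtr; } ToyCommand;
typedef struct ToyProc      { ToyCommand *cmdPtr; } ToyProc;

/*
 * Returns 1 only when the proc is linked to a command that actually sits
 * in a hash table, i.e. when looking up its name through the table is
 * safe. Methods and lambdas are expected to fail one of these checks.
 */
static int has_command_table_entry(const ToyProc *procPtr) {
    return procPtr != NULL
        && procPtr->cmdPtr != NULL
        && procPtr->cmdPtr->hPtr != NULL
        && procPtr->cmdPtr->hPtr->tablePtr != NULL;
}
```

A check like this only protects against the NULL cases described above; it cannot detect a cmdPtr whose contents have been overwritten with garbage.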
