View Ticket
Ticket UUID: 0b874c344dd5331a5bc28b24f723ef8905903129
Title: Tcl SEGV
Type: Bug
Version: 8.6b1
Submitter: cmcc
Created on: 2013-11-23 18:33:55
Subsystem: 04. Async Events
Assigned To: aku
Priority: 8
Severity: Critical
Status: Closed
Last Modified: 2013-12-18 18:23:29
Resolution: Fixed
Closed By: dgp
Closed on: 2013-12-18 18:23:29
Description:
Attached is a 2000-line script that invariably generates a SEGV on [info frame].

The SEGV occurs in InfoFrameCmd() at these lines:

	/* TODO - deal with overflow */
	topLevel += corPtr->caller.cmdFramePtr->level;

Where

(gdb) p *corPtr
$11 = {cmdPtr = 0xb6a1a78, eePtr = 0x9a2fab0, callerEEPtr = 0x9743b68, caller = {framePtr = 0x8055d70, varFramePtr = 0x8055d70, cmdFramePtr = 0x0, lineLABCPtr = 0x9a20110}, running = {framePtr = 0xb68f668, varFramePtr = 0xb68f668, cmdFramePtr = 0xb68f860, lineLABCPtr = 0x9a205d0}, lineLABCPtr = 0x9a205d0, stackLevel = 0xbffff05c, auxNumLevels = 3, nargs = -1}

(gdb) p corPtr->caller
$12 = {framePtr = 0x8055d70, varFramePtr = 0x8055d70, cmdFramePtr = 0x0, lineLABCPtr = 0x9a20110}

(gdb) p corPtr->caller.cmdFramePtr
$13 = (struct CmdFrame *) 0x0
User Comments:

dgp added on 2013-12-18 18:23:29:
Much improved fix now merged to trunk.

dgp added on 2013-12-05 20:47:00:
Draft fix checked into branch bug-0b874c344d.

dgp added on 2013-12-04 14:37:30:
A variant of that:

coroutine X coroutine Y info frame 0

demonstrates that the proposed fix is
not enough to solve the problem.

dgp added on 2013-12-04 12:55:52:
Final entry:

coroutine X coroutine Y info frame

aku added on 2013-12-03 20:11:04:
Thank you Don.
That is very short and sweet.
Will work on making that a test ASAP.

dgp added on 2013-12-03 20:07:13:
Simpler still; no event loop required:

proc a {} {yield; coroutine C c}
proc b {} {yield; info frame}
proc c {} {tailcall B}
coroutine A a
coroutine B b
A

dgp added on 2013-12-03 19:58:52:
Here's a much shorter demo script:

    proc a {} {
        after 0 [info coroutine]
        yield
        coroutine C c
    }

    proc b {} {
        yield
        info frame
    }

    proc c {} {
        tailcall B
    }

    after 0 {
        coroutine A a
        coroutine B b
    }
    vwait forever

aku added on 2013-12-03 18:33:05:
> Perspective
Yes. There was actually code in the function for dealing with a possible NULL in the main iPtr->cmdFramePtr already. Given this it was considered to be possible for such a NULL to wind up in the corPtr->caller.cmdFramePtr as well.

> Merging
Ok.

> Test(s)
While accepting the possibility of a NULL winding up in the coro cmdFrame, I am not sure yet how it actually happens. So I see no prospect of crafting the necessary test myself.

And yes, I do wish that Colin had taken the time to reduce his script as much as possible.

dgp added on 2013-12-03 15:10:22:
So the fix takes the perspective that the
implementation of [info frame] failed to
account for an unusual, but normal condition.

This is in contrast to a perspective that
(corPtr->caller.cmdFramePtr == NULL) was
an invalid state and some bug was wrongly
allowing it to occur.

Accepting that, we should merge the fix to
the trunk.

Any prospects for crafting a test for the
test suite that doesn't require a 2100 line
script?
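
The first perspective above, that [info frame] should tolerate a NULL caller frame, can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: the struct and function names (CmdFrameModel, CoroutineModel, AdjustTopLevel) are hypothetical stand-ins and not the Tcl sources, and another comment in this ticket reports that the proposed fix was not enough on its own. It shows nothing more than the shape of the guard around the crashing line.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for Tcl's internal structs. */
typedef struct CmdFrameModel {
    int level;                      /* nesting level of this frame */
} CmdFrameModel;

typedef struct CoroutineModel {
    struct {
        CmdFrameModel *cmdFramePtr; /* may be NULL, as seen in the gdb dump */
    } caller;
} CoroutineModel;

/*
 * Model of the level adjustment that crashed. The original statement
 * dereferenced caller.cmdFramePtr unconditionally; here the caller's
 * level is added only when a caller frame was actually recorded.
 */
static int AdjustTopLevel(int topLevel, const CoroutineModel *corPtr)
{
    assert(corPtr != NULL);
    if (corPtr->caller.cmdFramePtr != NULL) {
        topLevel += corPtr->caller.cmdFramePtr->level;
    }
    return topLevel;
}
```

With a recorded caller frame the level is added as before; with a NULL caller frame the level is returned unchanged instead of faulting.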

aku added on 2013-11-29 18:16:43:
Possible fix on branch "bug-0b874c344d-ak-info-frame-coro".
Revision [f8164c896c].
Committed.
Pushed.

aku added on 2013-11-25 20:51:14:
Interestingly enough, the other branch of the condition is where CORO_ACTIVATE_YIELDM is handled.

dgp added on 2013-11-25 20:23:35:
Part of the sequence of events leading to the segfault
is the 

SAVE_CONTEXT(corPtr->caller);

around line 8740 or so in tclBasic.c (in the TIP 396 commit,
in case the file has changed a lot since then).

This is called at a time when iPtr->cmdFramePtr is NULL,
which stores the NULL value that [info frame] later
crashes on when it attempts to dereference it.
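
The sequence described above can be modelled in a few lines of C. This is a sketch under assumptions, not the code in tclBasic.c: InterpModel, ContextModel, SaveContextModel and CallerFrameIsUsable are hypothetical names, and the real SAVE_CONTEXT saves more fields than the single pointer shown here. The point is only that a snapshot taken while the interpreter's frame pointer is NULL keeps that NULL, whatever the interpreter does afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real Tcl structures. */
typedef struct CmdFrameModel {
    int level;
} CmdFrameModel;

typedef struct InterpModel {
    CmdFrameModel *cmdFramePtr;     /* current command frame, may be NULL */
} InterpModel;

typedef struct ContextModel {
    CmdFrameModel *cmdFramePtr;     /* snapshot of the interp's frame pointer */
} ContextModel;

/* Model of saving the context: copy whatever the interp holds right now. */
static void SaveContextModel(ContextModel *ctx, const InterpModel *iPtr)
{
    assert(ctx != NULL && iPtr != NULL);
    ctx->cmdFramePtr = iPtr->cmdFramePtr;
}

/* A later consumer can only dereference the snapshot if it is non-NULL. */
static int CallerFrameIsUsable(const ContextModel *ctx)
{
    assert(ctx != NULL);
    return ctx->cmdFramePtr != NULL;
}
```

If the snapshot is taken while the interpreter has no current command frame, it stays NULL even after the interpreter gains a frame, so any later consumer has to either check it or be guaranteed that this state cannot occur.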

aku added on 2013-11-25 18:20:48:
First, I can confirm dgp's finding that revision [1d6747e53fa1fc82] is where things go bad, with the previous revision being ok.

aku added on 2013-11-25 18:09:05:
While my checkouts continue to update I looked over the changes made by [1d6747e53fa1fc82].

My current primary suspect is the call to "TclNRYieldObjCmd" at the end of "TclNRYieldToObjCmd", going from "clientData" to "INT2PTR(CORO_ACTIVATE_YIELDM)".

... Update complete, I can now build and actually start trialing.

aku added on 2013-11-25 17:55:39:
Thanks for bisecting. I just opened this bug and started to update my checkouts to investigate. So yes, having the bad commit should help.

dgp added on 2013-11-25 17:32:50:
In case it's helpful, the demo script generates
a SEGV starting with checkin

2012-04-02 13:13:11 1d6747e53fa1fc82 BAD CURRENT

That's the implementation of TIP 396.

miguel (claiming to be miguel.sofer@gmail.com) added on 2013-11-23 23:46:35:
I believe corPtr->caller.cmdFramePtr should never be NULL. How did this happen?

cmcc (claiming to be BackTrace) added on 2013-11-23 18:46:14:
(gdb) bt
#0  0xb7e8cde6 in InfoFrameCmd (dummy=0x0, interp=0x8055770, objc=1, objv=0xb68f8b4) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclCmdIL.c:1174
#1  0xb7e734ef in Dispatch (data=0xada4b5c, interp=0x8055770, result=0) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclBasic.c:4335
#2  0xb7e78b4f in TclNRRunCallbacks (interp=interp@entry=0x8055770, result=0, rootPtr=rootPtr@entry=0x810d108) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclBasic.c:4368
#3  0xb7e7c68f in TclEvalObjEx (interp=interp@entry=0x8055770, objPtr=objPtr@entry=0xb688bc0, flags=flags@entry=131072, invoker=invoker@entry=0x0, word=word@entry=0) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclBasic.c:5934
#4  0xb7e7c6db in Tcl_EvalObjEx (interp=interp@entry=0x8055770, objPtr=0xb688bc0, flags=flags@entry=131072) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclBasic.c:5915
#5  0xb7f54508 in AfterProc (clientData=0x97e49b0) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclTimer.c:1191
#6  0xb7f54854 in TimerHandlerEventProc (evPtr=evPtr@entry=0x9a0ef40, flags=flags@entry=-3) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclTimer.c:593
#7  0xb7f31da3 in Tcl_ServiceEvent (flags=flags@entry=-3) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclNotify.c:670
#8  0xb7f3203e in Tcl_DoOneEvent (flags=flags@entry=-3) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclNotify.c:907
#9  0xb7ef12cc in Tcl_VwaitObjCmd (clientData=0x0, interp=0x8055770, objc=2, objv=0x805ca30) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclEvent.c:1408
#10 0xb7e734ef in Dispatch (data=0xac8d824, interp=0x8055770, result=0) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclBasic.c:4335
#11 0xb7e78b4f in TclNRRunCallbacks (interp=interp@entry=0x8055770, result=0, rootPtr=rootPtr@entry=0x0) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclBasic.c:4368
#12 0xb7e78c7f in Tcl_EvalObjv (interp=interp@entry=0x8055770, objc=objc@entry=3, objv=objv@entry=0x805c8e0, flags=flags@entry=2097168) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclBasic.c:4099
#13 0xb7e7a168 in TclEvalEx (interp=interp@entry=0x8055770, script=0x80b0808 "# H.tcl - light Httpd 1.1\nif {[info exists argv0] && ($argv0 eq [info script])} {\n    apply {{} {\n\tset home [file dirname [file normalize [info script]]]\n\tlappend ::auto_path $home [file join [file di"..., numBytes=5472, flags=flags@entry=0, line=1876, line@entry=1, clNextOuter=clNextOuter@entry=0x0, outerScript=0x80b0808 "# H.tcl - light Httpd 1.1\nif {[info exists argv0] && ($argv0 eq [info script])} {\n    apply {{} {\n\tset home [file dirname [file normalize [info script]]]\n\tlappend ::auto_path $home [file join [file di"...) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclBasic.c:5237
#14 0xb7f25221 in Tcl_FSEvalFileEx (interp=interp@entry=0x8055770, pathPtr=pathPtr@entry=0x8082030, encodingName=0x0) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclIOUtil.c:1809
#15 0xb7f2beab in Tcl_MainEx (argc=<optimised out>, argc@entry=2, argv=<optimised out>, argv@entry=0xbffff674, appInitProc=appInitProc@entry=0x8048790 <Tcl_AppInit>, interp=0x8055770) at /home/colin/Desktop/packages/tcl8.6.1/generic/tclMain.c:417
#16 0x0804868a in main (argc=2, argv=0xbffff674) at /home/colin/Desktop/packages/tcl8.6.1/unix/tclAppInit.c:84
(gdb)
