Tcl Source Code

View Ticket
Login
Ticket UUID: 219393
Title: [trace] should use Tcl_Preserve()
Type: Support Version: None
Submitter: nobody Created on: 2000-10-26 05:12:17
Subsystem: None Assigned To: dkf
Priority: 9 Immediate Severity:
Status: Closed Last Modified: 2009-07-29 19:40:20
Resolution: Closed By: dkf
    Closed on: 2001-11-19 15:25:24
Description:
OriginalBugID: 6172 Bug
Version: 8.3.2
SubmitDate: '2000-08-25'
LastModified: '2000-09-05'
Severity: CRIT
Status: UnAssn
Submitter: techsupp
ChangedBy: ericm
OS: Linux-Red Hat
FixedDate: '2000-10-25'
ClosedDate: '2000-10-25'


Name:

Don Porter



Comments:

The same problem exists in the Tcl 8.4 branch,

    and should be corrected there too.



ReproducibleScript:

# FILE: demo.tcl

proc Hare {old args} {

    trace vdelete ::bunny w [list Hare $old]

    trace variable ::bunny w [list Hare $::bunny]

    error "FOO"

}

set bunny Bugs

trace variable bunny w [list Hare $bunny]

while {1} {catch { set bunny Easter}}



ObservedBehavior:

$ tclsh demo.tcl

Segmentation fault (core dumped)



DesiredBehavior:

No crash.



Patch:

*** tcl8.3.2/generic/tclCmdMZ.c Mon Apr 10 17:08:26 2000

--- tcl8.3.2-patch/generic/tclCmdMZ.c   Thu Aug 24 21:03:51 2000

***************

*** 2626,2633 ****

                                TraceVarProc, clientData);

                        if (tvarPtr->errMsg != NULL) {

                            ckfree(tvarPtr->errMsg);

                        }

!                       ckfree((char *) tvarPtr);

                        break;

                    }

                }

--- 2626,2634 ----

                                TraceVarProc, clientData);

                        if (tvarPtr->errMsg != NULL) {

                            ckfree(tvarPtr->errMsg);

+                           tvarPtr->errMsg = NULL;

                        }

!                       Tcl_EventuallyFree((ClientData) tvarPtr, TCL_DYNAMIC);

                        break;

                    }

                }

***************

*** 2729,2734 ****

--- 2730,2745 ----

      int code;

      Tcl_DString cmd;

  

+     /* 

+      * We might call Tcl_Eval() below, and that might evaluate

+      * [trace vdelete] which might try to free tvarPtr.  We want

+      * to use tvarPtr until the end of this function, so we use

+      * Tcl_Preserve() and Tcl_Release() to be sure it is not 

+      * freed while we still need it.

+      */

+ 

+     Tcl_Preserve((ClientData) tvarPtr);

+ 

      result = NULL;

      if (tvarPtr->errMsg != NULL) {

        ckfree(tvarPtr->errMsg);

***************

*** 2784,2792 ****

        result = NULL;

        if (tvarPtr->errMsg != NULL) {

            ckfree(tvarPtr->errMsg);

        }

!       ckfree((char *) tvarPtr);

      }

      return result;

  }

  ^L

--- 2795,2810 ----

        result = NULL;

        if (tvarPtr->errMsg != NULL) {

            ckfree(tvarPtr->errMsg);

+           tvarPtr->errMsg = NULL;

        }

!       Tcl_EventuallyFree((ClientData) tvarPtr, TCL_DYNAMIC);

      }

+ 

+     /*

+      * Match for Tcl_Preserve() near start of this function.

+      */

+ 

+     Tcl_Release((ClientData) tvarPtr);

      return result;

  }

  ^L



PatchFiles:

tclCmdMZ.c
User Comments: dkf added on 2009-07-29 19:40:20:

IP - Comment Removed: 130.88.1.31

dkf added on 2008-11-21 21:30:00:

data_type - 210894

dgp added on 2001-11-06 06:21:14:

File Added - 12904: tip68.patch

Logged In: YES 
user_id=80530

Attached is a corrected patch that stops the
segfaulting.  I added a check for a NULL value
of result near the end of TraceVarProc() in
tclCmdMZ.c

Now there are just the errors in the test suite:

incr-old-2.6
incr-1.28
incr-2.28
set-2.4
set-4.4
trace-8.{1,2,3,5}
var-9.{10,12}

All due to a failure to transfer the error mesage
from the trace script to the final result.  This
appears to be because the TCL_TRACE_RESULT_* flag
values are being masked away by Tcl_TraceVar2().
I'm punting back to DKF on how to resolve that problem.

dgp added on 2001-11-06 05:28:14:
Logged In: YES 
user_id=80530

TIP 68 looks good, but the patch causes multiple
segfaults during 'make test'.  I had them all
listed, but then SF hiccupped and lost them,
and I'm not gonna generate them again.

dkf added on 2001-11-05 18:23:24:
Logged In: YES 
user_id=79902

See TIP#68 http://purl.org/tcl/tip/68 and a possible
implementation patch at:
http://www.cs.man.ac.uk/~fellowsd/tcl/patches.tip68.patch

dkf added on 2001-03-23 18:44:41:
Logged In: YES 
user_id=79902

I've been looking at this in a bit more detail, and it seems
that the problem stems from the fact that the standard C
string returned by any variable trace function has to be
static (i.e. not deallocated before it has a chance to be
copied into a suitable error string for the interpreter) and
yet error results spat out by traces are not at all
guaranteed to be static. Currently, the work-around for this
involves storing a pointer to a copy of the error string in
the trace structure itself, but this is obviously
susceptable to the sorts of problems described in this bug. 
Command traces do not have this problem at all.

There are three possible workarounds:
  1) Use Tcl_Preserve() and Tcl_Release() to keep hold of
the structure; the only problem with this is that to keep
the interface static, these operations have to be applied in
tclVar.c:CallTraces (i.e. applied to the clientData prior to
calling the trace itself) in order to ensure that there is
sufficient guaranteed time for a temporary string to be
allocated to keep a copy of the string before it vanishes.
This is nasty, and requires a doc update to mention this
non-obvious behaviour (which could theoretically cause all
sorts of troubles...)
  2) Introduce an extra flag to be passed to Tcl_TraceVar
(and friends) to indicate that non-NULL results returned by
the callback are actually dynamic strings that need to be
released with ckfree()/Tcl_Free() when finished with.
  3) Introduce an extra flag to be passed to Tcl_TraceVar
(and friends) to indicate that non-NULL results returned by
the callback are actually pointers to Tcl_Objs (cast to
char*) that contain the error message, and can be handled by
the usual lifetime management mechanisms.

Solutions 2) and 3) can coexist with each other and will not
cause any backward incompatabilities, but will need a short
TIP to introduce.  But I suppose that's true of 1) as well,
and that's an alternative I don't like (it causes
non-trivial leakage of behaviour that could cause problems
for extension writers.)

dgp added on 2000-11-25 00:54:31:
Correction: The memory leak shows up only after the
patch has been applied.

dgp added on 2000-11-25 00:02:09:
Evaluating the ReproducibleScript shows a
memory leak, both before and after the patch
is applied.  There may still be more work to
do here to get the proper bug fix.

dgp added on 2000-11-24 23:47:45:
Re-opened.  I just re-tested and running Tcl 8.4a2
on Red Hat Linux 6.2 on a Compaq/DEC Alpha XLT-300,
I see a segmentation fault almost immediately.
Running my own patched version of Tcl 8.4, which
includes the patch in this report, there is no fault.

I re-categorized this to "Other".  It's certainly not
something as innocuous as "Code Cleanup".  The point is
not to make the code pretty, but to make it work.  A
better cateogry might be "Memory Management" to cover
pointer problems like this one.

Triggering bugs like this depends on the particulars
of how the system's memory allocation re-uses freed
memory.  Just because it works on one system, doesn't
mean it works on another.

I've registered Patch 102501 that fixes this bug in
the Patch Manager for public review.  I've assigned
the bug to myself, and I'll pass it on to the appropriate
maintainer when he is selected.

dkf added on 2000-11-24 21:49:54:
I don't see this in 8.4a2, even with several minutes of CPU time.

dkf added on 2000-11-10 18:18:32:
All failures that can lead to a crash are of the highest priority.

Attachments: