Tcl Source Code

Ticket Change Details
Overview

Artifact ID: b2fa274250f1acdf6d99e00b8f5428ac6be0433c
Ticket: 21000b5d407f330efaf8e6207fe8d025b0633ebb
Problem with Tcl_DeleteInterp() in 8.4.9.1_Activetcl
User & Date: dkf 2014-02-23 18:16:08
Changes

  1. icomment:
    I've looked through the log and I don't see anything particularly problematic; there are simply a <i>lot</i> of things being freed. Since an interpreter is the major context object in Tcl, it's not surprising that deleting it will cause a great many things to cease to be. Many of these things are allocated during the running of scripts (e.g., cached objects, cached strings, cached compilations) so we would not in general expect that the amount of work done in the two cases would be the same. It looks like there are <i>actually</i> only 6540 calls to free in there; plenty, but not really very excessive.
    <p>
    Looking through the trace, some of the operations appear to be taking a long time, but that's probably just the OS preempting to allow some other thread or process to execute. (I'm guessing that mutex lock acquisition is done in a macro inside libc, but that's nothing to do with Tcl.) I don't see any double-frees; well, not without a malloc in between (I can't tell why we're allocating during the release of the interpreter, but it does seem to be a very small number of allocations). I'm also puzzled as to why so many of Tcl's internal calls are not showing up in the trace; I'm guessing that's an inadequacy in the tooling.
    <p>
    Maybe the OS is pulling pages back in from a stressed disk (“swapping”)? That can have mysterious symptoms if you're not expecting it. If that's the case (I don't know how to diagnose for sure; back when I used Solaris, I just put my ear to the machine and listened to the disk head working very hard! Low tech, I know!) there's absolutely nothing we can do about it from a Tcl perspective, as all that software can ever do is assume that it is adequately provisioned. Tuning how much physical memory to allow a system to use for a particular set of tasks is a bit of a black art, but over-allocating is typically cheaper than the developer effort to create a better estimate. The amount of space required appears to be <i>at least</i> 4MB (on the basis of the addresses in the log) and is probably much more than that.
    <blockquote style='background:#f0f0f0'>
    <h2>General Tips</h2>
    In general, we don't advise building Tcl with <tt>TCL_MEM_DEBUG</tt> enabled as that's much slower. Tcl has its own built-in layer on top of the system memory allocator that avoids most of the locking (making things <i>much</i> faster) but it is disabled when doing mem-debugging as it confuses the heck out of some third-party tools. We also <i>do not advise deallocating on exit</i> (known by us as “finalization”) as the cheapest way of cleaning up the memory held by the process is to just let the OS do it for us by throwing the process away directly. This is very different to how most C++ code works, but we've measured this and know it is definitely true. We have separate hooks to handle deletion of things that need special treatment on exit (e.g., DB connection handles) but they're really rather rare.
    <p>
    In short, I advise trying to avoid deleting the interpreter unless you want to throw away the context and yet keep the process going.
    </blockquote>
    In short, there's nothing obviously wrong <i>in Tcl</i> as far as I can see. There probably is something horribly wrong elsewhere. My current guess is that there's too much instrumentation (yes, it has a major impact!), insufficient hardware resources, or that the hardware is undergoing some kind of nasty slow failure. I really hope for your sake that it isn't the last of those options, but I've had them happen to me in the past.
    
  2. login: "dkf"
  3. mimetype: "text/html"
  4. username: "dkf"