Ticket UUID: | 1036064 | |||
Title: | TCL crashes if the application runs for a long time | |||
Type: | Bug | Version: | final: 8.3.5 | |
Submitter: | nobody | Created on: | 2004-09-28 10:21:45 | |
Subsystem: | 01. Notifier | Assigned To: | kennykb | |
Priority: | 5 Medium | Severity: | ||
Status: | Closed | Last Modified: | 2004-09-28 21:08:26 | |
Resolution: | Invalid | Closed By: | kennykb | |
Closed on: | 2004-09-28 14:08:26 | |||
Description: |
OS Platform and Version: W2K

Problem Behaviour: We have an application built using Tcl/Tk 8.3.5. This application is typically used for long periods (say 12 to 60 hours). During such prolonged use the application crashes and a Dr. Watson dump is generated (the log is attached). From a first analysis of the log, I concluded that the crash occurred in the function TclpStrftime, which would typically be invoked through the Tcl command [clock clicks -milliseconds]. The crash does not happen when the tool is used for a short time.

Expected Behaviour: The application should not crash.

Contact email id: [email protected] | |||
User Comments: |
kennykb added on 2004-09-28 21:08:26:
There are multiple sources of confusion in this bug report; let me try to untangle a few of them, or else the explanation will appear wholly unrelated.

First, despite the indications in DrWatson.log, the crash did *not* occur in or around TclpStrftime. Rather, TclpStrftime was the last exported name before the code in question. (This fact is not surprising; it's the last exported name in the Tcl library.) The code that faulted was, in actuality, a bit of generated code, in another segment, that handles probing the large activation record of TclRegExec (the 'exec' function in generic/regexec.c). The stack probes went below the base of the stack segment at 0x34000 and faulted. This is the usual behaviour of most software confronted with a stack overflow.

Tcl contains logic to make stack overflows more benign, in the function TclpCheckStackSpace in tclWin32Dll.c. Unfortunately, in the release you're using, the stack commitment that TclpCheckStackSpace imposes is not enough to handle the demands of TclRegExec (whose activation record is extremely large). This problem is fixed in release 8.4.7; see http://sourceforge.net/tracker/index.php?func=detail&aid=947070&group_id=10894&atid=110894 and http://cvs.sourceforge.net/viewcvs.py/tcl/tcl/win/tclWinInt.h?r1=1.20.2.2&r2=1.20.2.3 for the details.

The bad news is that, judging from the log I'm seeing, this change will apparently just convert a crash into a Tcl error; the stack will still have overflowed. Unlike the case with most stack overflows, I'm not seeing tremendously deep recursive invocations of Tcl code. Rather, I'm observing that there's an unusually large amount of stack in use prior to a call to Tcl_DoOneEvent in or near a procedure named Q_Init (which is not part of Tcl, so I can't comment on it).
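The general idea of turning a stack overflow into a recoverable error, as described above, can be sketched in C. This is only an illustration under stated assumptions, not the code of TclpCheckStackSpace: the names run_guarded, descend, stack_used, STACK_BUDGET, HARD_DEPTH_CAP and DEMO_MAIN are hypothetical, and the sketch assumes a contiguous stack whose usage can be estimated by comparing the addresses of local variables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define STACK_BUDGET   (64u * 1024u)  /* hypothetical budget in bytes */
#define HARD_DEPTH_CAP 1000           /* safety net so the demo always terminates */

static uintptr_t stack_base;          /* address of a local in the outermost frame */

/* Estimate the stack in use as the distance from stack_base (an assumption:
 * the stack is contiguous, so address distance approximates usage). */
static size_t stack_used(void)
{
    volatile char here = 0;
    uintptr_t addr = (uintptr_t)&here;
    return (size_t)(addr > stack_base ? addr - stack_base : stack_base - addr);
}

/* Recurse with a deliberately large frame until the budget check refuses.
 * Returns -1 once the check refuses to go deeper, instead of faulting. */
int descend(int depth, int *deepest)
{
    volatile char frame[1024];
    frame[0] = (char)depth;
    if (stack_used() >= STACK_BUDGET || depth >= HARD_DEPTH_CAP) {
        return -1;                    /* report an error rather than overflow */
    }
    *deepest = depth;
    int rc = descend(depth + 1, deepest);
    /* Reading the frame after the call keeps this from becoming a tail call. */
    return frame[0] == (char)depth ? rc : -2;
}

/* Record the base address, then run the recursion under the budget. */
int run_guarded(int *deepest)
{
    volatile char marker = 0;
    stack_base = (uintptr_t)&marker;
    *deepest = 0;
    return descend(1, deepest);
}

#ifdef DEMO_MAIN                      /* hypothetical switch: build with -DDEMO_MAIN */
int main(void)
{
    int deepest = 0;
    int rc = run_guarded(&deepest);
    printf("rc=%d, deepest frame reached=%d\n", rc, deepest);
    return 0;
}
#endif
```

The check only helps if it runs before every large frame is entered and if the budget leaves room for the largest activation record; a budget that is too small for one callee reproduces the situation described for TclRegExec.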
I see several possibilities here:

(1) It's possible that Q_Init (or something called from it) is leaking memory that is allocated with the 'alloca' library call; 'alloca' allocates memory by expanding the activation record. Eventually, there isn't enough stack space left to run the event handler, and the process crashes.

(2) Another possibility is that Q_Init calls a deeply recursive nest of functions, each of which is compiled with frame pointers omitted. Since the DLL in question has no symbol information, DrWtsn32 can't trace calls through it.

(3) Yet another possibility is that an event handler in C (again, compiled with frame pointers omitted) is invoking Tcl_DoOneEvent (or invoking Tcl code that calls [update] or [vwait]) and Tcl_DoOneEvent finds another event pending. The second event in turn also does Tcl_DoOneEvent in its event handler, and so on. Eventually, there are enough unfinished event handlers stacked that the process crashes. If this is the case, the most likely cause is that something does [after idle] or Tcl_DoWhenIdle from an idle handler; the documentation remarks that doing so is not safe.

Since the stack exhaustion appears to be the result of Tcl_DoOneEvent being entered with inadequate stack space remaining, rather than any inherent fault in the Tcl library itself, I'm closing this bug. If you need further help tracking things down, I'd suggest visiting http://mini.net/cgi-bin/chat.cgi and talking to the Tcl developers there.

nobody added on 2004-09-28 17:21:46:
File Added - 103016: DrWtsnLog_sim19.txt |
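Possibility (3) in kennykb's comment can be illustrated with a toy event loop in C. This is a sketch under stated assumptions and does not use the Tcl API: do_one_event, reentrant_handler, plain_handler, drain, QUEUE_CAP and DEMO_MAIN are hypothetical names, and a nesting counter stands in for the stack that each unfinished handler would occupy in a real process.

```c
#include <assert.h>
#include <stdio.h>

#define QUEUE_CAP 64                  /* hypothetical queue size for the demo */

typedef void (*Handler)(void);

static Handler queue[QUEUE_CAP];
static int head, tail;                /* pending events live in queue[head..tail) */
static int depth, max_depth;          /* current and deepest handler nesting */

static void enqueue(Handler h)
{
    assert(tail < QUEUE_CAP);
    queue[tail++] = h;
}

/* Service one pending event; returns 0 if the queue is empty. */
static int do_one_event(void)
{
    if (head == tail) {
        return 0;
    }
    Handler h = queue[head++];
    depth++;
    if (depth > max_depth) {
        max_depth = depth;
    }
    h();                              /* the handler may re-enter do_one_event */
    depth--;
    return 1;
}

/* A handler that re-enters the event loop before returning,
 * comparable in spirit to a handler that calls [update]. */
void reentrant_handler(void)
{
    do_one_event();
}

/* A handler that simply returns to the loop. */
void plain_handler(void)
{
}

/* Queue n copies of h, drain the queue, and report the deepest nesting seen. */
int drain(Handler h, int n)
{
    head = tail = depth = max_depth = 0;
    for (int i = 0; i < n; i++) {
        enqueue(h);
    }
    while (do_one_event()) {
        /* keep servicing events until none are pending */
    }
    return max_depth;
}

#ifdef DEMO_MAIN                      /* hypothetical switch: build with -DDEMO_MAIN */
int main(void)
{
    printf("re-entrant handlers: nesting %d\n", drain(reentrant_handler, 50));
    printf("plain handlers:      nesting %d\n", drain(plain_handler, 50));
    return 0;
}
#endif
```

With handlers that return to the loop, nesting stays at one level no matter how many events are queued. With handlers that re-enter the loop, every pending event adds another unfinished handler, so the nesting (and, in a real process, the stack in use) grows with the length of the queue.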
Attachments:
- DrWtsnLog_sim19.txt added by nobody on 2004-09-28 17:21:46.