Tcl Source Code

Ticket UUID: 781746
Title: Memory leak with arrays whose elements are lists
Type: Bug Version: obsolete: 8.4.0
Submitter: jbeltz Created on: 2003-08-01 21:05:59
Subsystem: 42. Memory Preservation Assigned To: msofer
Priority: 5 Medium Severity:
Status: Closed Last Modified: 2003-08-07 09:03:42
Resolution: Accepted Closed By: jbeltz
    Closed on: 2003-08-07 02:03:42
Description:
This is my first submitted bug report so please forgive 
any newbie type mistakes.

OS:  HP-UX 11.0
Problem behavior:
We noticed a memory problem with a tcl script which 
occurs when accessing a list from an array.  It is 
a very bizarre problem and we have narrowed it down to a 
pretty simple script (See attached script memarrlist).

It seems like whenever you use the list from the array 
directly it causes memory to grow.  However if you make 
a copy of it first using [concat $a($elem)] then it is fine?

Run top -s1 in one window, and run the script in 
another.  The script waits in three places.

The first is after everything has been initialized.  Top 
should say the size of the program is about 9M.

The second is after running through the first loop.  This 
loop makes the copy of the list from the array then uses 
the copy in the following command.
set size [llength $recordList]
This doesn't cause the program to grow so top should 
still say about 9M.

The third wait you may never get to before running out 
of memory.  This loop does one of two things both of 
which cause the program to run out of memory.

    if { 1 } {
        set recordLine $a($index)
        set size [llength $recordLine]
    } else {
        set size [llength $a($index)]
    }
User Comments: jbeltz added on 2003-08-07 09:03:42:
Logged In: YES 
user_id=835726

Sorry I haven't responded, I got distracted on other things.  
You were correct the problem wasn't an errant memory 
problem.  The script you received was a self contained, 
narrowed down view of the problem we thought we were 
seeing.    

The original problem was noticed when we sourced a huge file 
containing an array of lists generated for a report.  The 
problem was that, after a source, any use of the array's list caused 
the memory to grow.  After your explanation, it made perfect 
sense.  The problem was the array's list wasn't really a list 
but a string.  If we used the string in one of the list 
procedures it needed to create a list struct to give us the 
answer.  

Thanks and I am very glad it isn't a bug, but user error.

Jeff Beltz

msofer added on 2003-08-02 05:46:12:
Logged In: YES 
user_id=148712

Err ... sorry, this little window is tough for re-reading
before submitting, isn't it?
The reference counts are wrong :( You'll get 20000 Tcl_Obj:
   . objs 1 to 100 will have a refCount of 20001: obj i appears
     in each $a($count), and again in $a(i)
   . all others have a refCount of 1, they only appear in $a(i)

msofer added on 2003-08-02 05:37:44:
Logged In: YES 
user_id=148712

This may not really be a bug in tcl, but rather pilot error? Or
insufficient docs? Or one of those cases where knowledge of
the internals makes a difference (performance, memory).

First, note that  your script runs much better if you replace 
    set a($count) [concat $element $count]
in the creation loop with the two lines
    set a($count) $element
    lappend a($count) $count
or else with
    set a($count) [concat $element [list $count]]
Note that these are (in this case) fully equivalent, except
for the memory requirements.
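
A minimal sketch of that equivalence, using made-up stand-in values
(the three-element list and the count 7 are assumptions, not taken
from the attached script):

```tcl
# Hypothetical stand-ins for the script's variables.
set element [list a b c]
set count 7

# Form used in the original creation loop.
set v1 [concat $element $count]

# First suggested replacement: copy the list, then append to it.
set v2 $element
lappend v2 $count

# Second suggested replacement: concat of two lists.
set v3 [concat $element [list $count]]

# All three hold the same value when compared as strings.
puts [expr {$v1 eq $v2 && $v2 eq $v3}]
```

The values compare equal; only how the result is stored internally
should differ.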

I'll try to explain the difference in memory requirements,
and let you decide where the bug is.

The problem is caused by the fact that [concat] returns a
new Tcl_Obj; if any of the lists to be concatenated is *not*
a pure list (which is your case), a string representation is
returned.  Your script hence initially defines all array
elements as strings, with no list internal representation. 

The first loop then converts a copy (recordLine) of each
$a($count) to a list when requesting its llength; but
recordLine's value is promptly discarded at the next iteration.
So you have 20000 string reps and one list with 100 elements.
The second loop converts every array element to list
representation; you have then 20000 Tcl_Obj for the
$a($count), each with both a string and a list
representation, plus 2000000 Tcl_Obj for the elements of
$a($count) ... boom.
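
On a newer interpreter the conversion can be watched directly. The
sketch below assumes Tcl 8.6 or later, where the unsupported command
::tcl::unsupported::representation is available (it is not in 8.4, so
the sketch skips the inspection when the command is missing). The
values are made-up stand-ins, and the exact text printed varies
between versions:

```tcl
# Hypothetical stand-ins for the script's variables.
set element [list a b c]
set count 7
set s [concat $element $count]

# The inspection command exists only in Tcl 8.6+, so check for it first.
set haveRep [llength [info commands ::tcl::unsupported::representation]]

if {$haveRep} { puts "before: [::tcl::unsupported::representation $s]" }

# Using the value as a list makes the core build a list representation.
set n [llength $s]

if {$haveRep} { puts "after:  [::tcl::unsupported::representation $s]" }
puts "llength = $n"
```

Comparing the "before" and "after" lines should show whether the
value picked up a list representation next to its string.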

So: why doesn't my little change boom too? Tcl's internals
are clever about sharing Tcl_Obj's - in the second and third
versions, the core recognizes that many of those 2000000
elements are repeated, and manages with 20000 objects (each
with a reference count of 100) for a total of 40000.

But the core is not clever enough to notice that the 2000000
elements can be represented so sparingly in your script,
which creates them afresh by splitting the string values of
the $a($count). My versions never destroyed the relationship
to begin with, yours "went out of its way" to hide it. 
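
As a rough, scaled-down illustration (the sizes 5 and 10 and the
array names a and b are assumptions chosen to keep it small; this is
not the attached script), the two creation styles can be put side by
side:

```tcl
# Build a small shared list, standing in for the 100-element one.
set element {}
for {set i 1} {$i <= 5} {incr i} { lappend element $i }

for {set count 1} {$count <= 10} {incr count} {
    # Original style: the result goes through a string value.
    set a($count) [concat $element $count]

    # Suggested style: the value stays a list throughout.
    set b($count) $element
    lappend b($count) $count
}

# The two arrays hold identical values, element for element.
set same 1
foreach k [array names a] {
    if {$a($k) ne $b($k) || [llength $a($k)] != [llength $b($k)]} {
        set same 0
    }
}
puts "values identical: $same"
```

The values agree, so any difference between the two styles lies in
memory use, not in results.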

I do hope the explanation is understandable ...

jbeltz added on 2003-08-02 04:05:59:

File Added - 57583: memarrlist

Attachments: