Tcl Source Code

Ticket UUID: 3382401
Title: Thread pool leaks memory
Type: Bug
Version: None
Submitter: auriocus
Created on: 2011-07-29 19:34:15
Subsystem: 80. Thread Package
Assigned To: vasiljevic
Priority: 7 High
Severity:
Status: Open
Last Modified: 2012-11-27 00:40:29
Resolution: Accepted
Closed By:
Closed on:
Description:
While using a thread pool to run a computation in parallel with many thousands of tasks, I noticed that tpool leaks memory. The attached script demonstrates the leak both in ActiveTcl 8.5.10.1 on Windows/64 and in tcl8.6 HEAD on Mac OSX. My real computation ran for several hours on 6 cores, and the leaked memory added up to several hundred MB. I could not narrow the leak down any further. At first I thought the culprit was the tpool::wait command with the list result, but replacing it with a simple "after" in the test case makes no difference. Also, tpool::release cannot free the leaked memory.
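For readers without the attachment, the usage pattern described above can be sketched roughly as follows. This is not the attached demo script, which is not reproduced here; the proc name `run_batch`, the worker counts, and the trivial `expr` job are assumptions made only for illustration. Results are deliberately never fetched, mirroring a workload that only waits for completion.

```tcl
package require Thread

# Hypothetical helper (not from the attached script): post njobs trivial
# jobs to a fresh pool, wait until all have completed, then release the pool.
proc run_batch {njobs} {
    # Worker limits are arbitrary illustration values.
    set pool [tpool::create -minworkers 2 -maxworkers 4]
    set jobs {}
    for {set i 0} {$i < $njobs} {incr i} {
        # Each job is a small script evaluated in a worker thread.
        lappend jobs [tpool::post -nowait $pool [list expr $i * $i]]
    }
    # tpool::wait blocks until at least one listed job is done and stores
    # the jobs that are still pending in the named variable.
    set pending $jobs
    while {[llength $pending] > 0} {
        tpool::wait $pool $pending pending
    }
    tpool::release $pool
    return [llength $jobs]
}
```

Calling `run_batch` repeatedly in a loop while watching the process in a system monitor is one way to look for the kind of growth reported here.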
User Comments:
techentin added on 2012-11-27 00:40:29:
I see the same memory leak with Tcl 8.5.13 and Thread 2.6.7 on Linux.

It looks to me like the resource leak is related to the thread pool maintaining information on completed jobs.  

I inserted periodic calls to tpool::wait and tpool::get inside the loop that posts work to the thread pool, and observed that memory growth dropped to about half. If I call tpool::get twice on the same job, I get "no such job," so I presume that the first call cleans up some resources. But I don't see anything explicit in the API to discard results from completed jobs.

When I changed the post option from -nowait to -detached, the memory leak went away.


Bob
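The two workarounds mentioned in this comment can be sketched as below. This is a minimal illustration, not the commenter's actual code: the pool size, the job count of 20, and the `expr` job body are assumptions. Variant 1 fetches every result with tpool::get as soon as tpool::wait reports the job as completed; variant 2 posts with -detached, so no job ID is returned and no result is kept for later retrieval.

```tcl
package require Thread

# Pool size is an arbitrary illustration value.
set pool [tpool::create -maxworkers 4]

# Variant 1: drain results with tpool::get as jobs complete.
set pending {}
set results {}
for {set i 0} {$i < 20} {incr i} {
    lappend pending [tpool::post -nowait $pool [list expr $i + 1]]
}
while {[llength $pending] > 0} {
    # tpool::wait returns the completed jobs and updates the pending list.
    foreach job [tpool::wait $pool $pending pending] {
        lappend results [tpool::get $pool $job]
    }
}

# Variant 2: detached jobs; their results cannot be fetched afterwards.
for {set i 0} {$i < 20} {incr i} {
    tpool::post -detached -nowait $pool [list expr $i + 1]
}

# Assumption: detached jobs still queued at this point may not get to run
# once the pool is released, so real code would need its own completion signal.
tpool::release $pool
```

Whether either variant fully avoids the growth reported in this ticket is not established by the sketch; it only shows the API calls involved.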

auriocus added on 2011-07-30 12:50:35:
If I change the tpool::wait command, which is supposed to return the still-running jobs in the list, the leak becomes a bit slower (I checked after a fixed number of iterations). See the attached demo script. So the job list leaks, but it is not the only leak here.

auriocus added on 2011-07-30 12:48:32:

File Added - 419751: threadpool_leakdemo84_slow.tcl

auriocus added on 2011-07-30 12:35:00:
I found the leak originally within ActiveState 8.5.10.1 on Windows. I can reproduce it using Tcl8.6 HEAD:
bin/tclsh8.6
% info patchlevel
8.6b1.2
% package require Thread
2.6.7
% 

This Tcl8.6 is 64-bit; the leak seems to be twice as fast as with the 32-bit version.

hobbs added on 2011-07-30 05:18:28:
I am able to repro the leak on Windows with ActiveTcl 8.5.9.2 and 8.4.19.5 using Thread 2.6.7 (not final).

andreas_kupries added on 2011-07-30 04:59:55:
Oh, the question about '... did you try 2.6.6 release' was for Christian ...

vasiljevic added on 2011-07-30 04:58:09:
yes. there are no significant core-code related changes there.
if 2.6.6 does not leak, the 2.6.7 will not either.

andreas_kupries added on 2011-07-30 04:54:42:
Have you seen that Don Porter has an RC for 2.6.7 up?
See the tcl-core mailing list.

vasiljevic added on 2011-07-30 04:51:54:
Hey, thanks Andreas! I think I will need to do some reading and learning
new stuff...

Apropos the reported leak... did you try the 2.6.6 release? I see that I have
already corrected some things related to tpool...

andreas_kupries added on 2011-07-30 04:48:39:
Thread sources.

See http://core.tcl.tk/ in general,
and http://core.tcl.tk/thread/timeline?y=ci
specifically.

Further http://wiki.tcl.tk/28127
and  http://wiki.tcl.tk/28126

vasiljevic added on 2011-07-30 04:45:09:
I think I would need to release a new version. My own private sandbox
seems to work fine. The 2.6.5 that gets distributed with Mac OSX seems
to leak. But I need to figure out where the sources are! I heard SF is not
hosting the sources any more...

vasiljevic added on 2011-07-30 04:27:01:
yes. this eats up memory somewhere. will double-check and fix.
thanks for reporting this!

auriocus added on 2011-07-30 03:55:57:
OK, so try the third version. I shortened the inner loop and made more iterations. Now you'll see that virtual memory grows as well.

auriocus added on 2011-07-30 03:54:47:

File Added - 419733: threadpool_leakdemo84_fast.tcl

vasiljevic added on 2011-07-30 03:46:43:
The virtual memory is what counts. The real memory is
not important here. If virtual memory does not increase,
you do not have a leak.

The OS versioning is weird on Mac, yes.
I also have Snow Leopard, but 10.6.7. This is no problem.

auriocus added on 2011-07-30 03:17:33:
My platform:

91-65-210-59-dynip:Programmieren chris$ tclsh8.4
% parray tcl_platform
tcl_platform(byteOrder) = littleEndian
tcl_platform(machine)   = i386
tcl_platform(os)        = Darwin
tcl_platform(osVersion) = 10.8.0
tcl_platform(platform)  = unix
tcl_platform(threaded)  = 1
tcl_platform(tip,268)   = 1
tcl_platform(tip,280)   = 1
tcl_platform(user)      = chris
tcl_platform(wordSize)  = 4
% 

PS: Funny version. My OS is Snow Leopard 10.6.8, why does it say 10.8.0?

auriocus added on 2011-07-30 03:14:33:
I've updated the script for 8.4 without max. If I run it with
tclsh8.4 threadpool_leakdemo84.tcl
then the "physical memory" value in Activity Monitor climbs by about 0.1 MB every ten seconds. It starts at 3.0 MB and goes up to 6 MB. You are right that "virtual memory" does not change; I wonder why.

auriocus added on 2011-07-30 03:05:45:

File Added - 419729: threadpool_leakdemo84.tcl

vasiljevic added on 2011-07-30 02:47:44:
I cannot confirm this on Mac OSX using Tcl8.4

% array get tcl_platform
osVersion 10.7.0 byteOrder littleEndian tip,268 1 threaded 1 machine i386 platform unix os Darwin tip,280 1 user zoran wordSize 4
% set tcl_version
8.4

When I start your script it complains about the math max function.
I removed that from the code (it is not really related to anything
thread-wise) and ran the script. It allocates some 30MB of virtual
memory and stays there no matter how many times I run your
leak proc.

I guess this has something to do with new(er) Tcl cores. I have
not tested anything later than 8.4.

auriocus added on 2011-07-30 02:34:15:

File Added - 419726: threadpool_leakdemo.tcl

Attachments: