Tcl Source Code

View Ticket
Login
Ticket UUID: f00009f7ce4b58c8b6caedf976e37ea78be6d3c0
Title: file exists, mtime... procedures leaks
Type: Bug Version: 8.6.4
Submitter: RP. Created on: 2015-07-29 08:37:45
Subsystem: 37. File System Assigned To: sebres
Priority: 5 Medium Severity: Important
Status: Closed Last Modified: 2015-07-31 11:30:44
Resolution: Fixed Closed By: sebres
    Closed on: 2015-07-31 11:30:44
Description:
When I use command "file mtime" or "file exists" with parameter provided as result of "file join" command it leaks memory.
Example:

set path1 C:/example.txt
set path2 [file join C:/ example.txt]
file exists $path1 ;# this not leaks
file exists $path2 ;# this leaks

Note that there is difference between representations of both paths:

% ::tcl::unsupported::representation $path1
value is a pure string with a refcount of 4, object pointer at 006587A0, string representation "C:/example.txt"
% ::tcl::unsupported::representation $path2
value is a string with a refcount of 2, object pointer at 00658338, internal representation 00665318:00000000, string representation "C:/example.txt"
User Comments: sebres added on 2015-07-31 11:30:44:
I made a review for 8.5 in relation to this issue, commited in [86b6e7b98a8d1551] as branch [sebres_clean_core_8_5]. The code is more similar to 8.6 now.

BTW. I have misleadingly commited a diff from my private branch, and because I don't know something like "commit -amend" or "delete branch", I have just closed my misleadingly commit (if someone knows how it can be deleted, then please do it).

sebres added on 2015-07-30 19:17:24:
> Is this behaviour documented
yes, but ... if you pay attention about it, and in this area there are more than 2 functions.

> Should we audit all call sites of these two functions for similar ref-count trouble ?
Would prefer if we can rewrite it uniform, but this functions are public, so such changes would be backwards incompatible in external modules.

Not explainable to me are the troubles to reproduce it sometimes, resp. different behavour like executing of the same code without 'source' and within. So this function executes sometimes the scope with TranslatedPath, and sometimes the scope with NormalizedPath, but code is the same. Additionally this strange different execution time.
And if you have executed it in "source" once, that is no difference hereafter - then you can use it in console with different paths - all will go over scope with TranslatedPath.

aku added on 2015-07-30 18:46:51:
> - Tcl_FSGetTranslatedPath - returns a tclObj with incremented refCount, it should be decremented;
> - Tcl_FSGetNormalizedPath - returns a tclObj owned by Tcl, refCount should be not decremented;

Is this behaviour documented in the '--' comments at the location of the function definition ?

Should we audit all call sites of these two functions for similar ref-count trouble ?

sebres added on 2015-07-30 18:23:19:
This time fixed for both cases, but I believe it was not the last time, because of very confusing differently behavior Tcl_FSGetTranslatedPath vs Tcl_FSGetNormalizedPath:
- Tcl_FSGetTranslatedPath - returns a tclObj with incremented refCount, it should be decremented;
- Tcl_FSGetNormalizedPath - returns a tclObj owned by Tcl, refCount should be not decremented;

sebres added on 2015-07-30 15:39:04:
Not so fast, because seems to be not so easy to fix...

That fixes a leak only if I give this line direct in stdin inside tclsh:
% time {file exists [string trim "C:/example.txt "]} 100000

If I do the same in a file (for example "test"), and execute it via `source test`, it leaks again.
Possible another constellation regarding of objects, or refcounts (history involved) or literals, etc.

Very strange also, that the tests have different execution times (constantly  reproducible):
- with leaks (% source ...): 16.03923 microseconds per iteration
- w/o leaks (% time direct): 90.1751 microseconds per iteration

But a 1 year older branch does the same without leaks within 15ms also!

sebres added on 2015-07-30 13:06:51:
Fixed with [4199a280129370f2] in trunk, erroneous Tcl_IncrRefCount removed.
Will merge this in core_8_5_branch, but there it is a little bit larger.

sebres added on 2015-07-30 10:29:40:
New findings here: it's not a "file join" direct (may be still "Tcl_FSJoinToPath" despite called) - is enough to give to "file exists" an object, that will be destroed hereafter, because this code leaks exact like another:
```
time {file exists [string trim "C:/example.txt "]} 100000
```

May be it is Tcl_FSConvertToPathType.

sebres added on 2015-07-29 17:42:52:
I believe that it is - somewhere in 8.6 the Tcl_FSJoinToPath will be executed with object that refCount is 1 (or 0), so hereafter it will be destroyed, but it leaks on pathPtr... 
Very confused code here, need to think about that.

sebres added on 2015-07-29 17:21:54:
have found a fix on my branches

and another ticket (in the same time) von Don, http://core.tcl.tk/tcl/tktview/3479689fffffffffffff

My solution was a diff below, I don't know anymore why it was not so good for tcl comunity (but I not have leaks in my private branches with it):
```
 generic/tclFileName.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/generic/tclFileName.c b/generic/tclFileName.c
index a8360fc..d416eeb 100644
--- a/generic/tclFileName.c
+++ b/generic/tclFileName.c
@@ -823,6 +823,11 @@ Tcl_FSJoinToPath(
      */
 
     Tcl_IncrRefCount(ret);
+    // [SB] if will be deleted - prevent leaks on pathPtr (refCount) by ObjDeletePending (delayed free) :
+    if (!Tcl_IsShared(lobj)) {
+	// free list now :
+	TclFreeIntRep(lobj);
+    };
     Tcl_DecrRefCount(lobj);
     ret->refCount--;
     return ret;
```

Will try later to merge it in 8.6 for testing.

sebres added on 2015-07-29 17:05:55:
after all I've reproduced on current activestate build 8.6.4 x86 under windows with following code:
```
  time {file exists [file join C:/ example.txt]} 100000
```
It leaks constantly...

I can remember me that we've ever already fixed similar bug in 8.4 (and 8.5), see a719f5eddfcf81520a3e

sebres added on 2015-07-29 16:15:46:
I not see any leak under windows; Can you provide exact version you used (active state build or self compiled, possible commit point)?

BTW: refcounts are normal:
   * path1: 4 = call stack + in path1 + code + history
   * path2: 2 = call stack + in path2

dgp added on 2015-07-29 15:38:13:
No sign of a leak here.  Perhaps a Windows-only problem?

RP. added on 2015-07-29 13:43:15:
When I run script below, I observe fast and monotonous increase of memory usage in task manager:
while {1} {
   set path [file join C:/ example.txt]
   file exists $path
}

Same script, but with "file join" before "while" does not leak:
set path [file join C:/ example.txt]
while {1} {
   file exists $path
}

I've noticed that after "file" operation variable "path" changes it's representation (my guess is that this conversion could be source of the leak):
% set path [file join C:/ example.txt]
C:/example.txt
% ::tcl::unsupported::representation $path
value is a string with a refcount of 3, object pointer at 000000000021BBA0, internal representation 00000000002159B0:0000000000000000, string representation "C:/example.txt"
% file exists $path
0
% ::tcl::unsupported::representation $path
value is a path with a refcount of 3, object pointer at 000000000021BBA0, internal representation 0000000000246340:0000000000000000, string representation "C:/example.txt"

dgp added on 2015-07-29 12:22:17:
What's the evidence of a leak?