Tcl Source Code

View Ticket
Login
Ticket UUID: 2918610
Title: Wrong file opened
Type: Bug Version: obsolete: 8.5.8
Submitter: magv Created on: 2009-12-21 11:28:03
Subsystem: 36. Pathname Management Assigned To: dgp
Priority: 9 Immediate Severity:
Status: Closed Last Modified: 2010-01-06 02:00:08
Resolution: Fixed Closed By: dgp
    Closed on: 2010-01-05 19:00:08
Description:
Here's a stripped down code snippet:

    set root somedir
    exec rm -rf $root
    exec mkdir -p $root
    exec touch $root/x.y

    file normalize anything

    foreach fn [glob $root/x.y] {
        set ofn [file rootname $fn]
        set fl [open $ofn {WRONLY CREAT}]
        puts $fl "something"
        close $fl
    }

    puts [exec ls $root]

I expect this code to put "something" into somedir/x, and then print x and x.y. Instead it puts "something" into somedir/x.y, and prints x.y; somedir/x is not created, even though $ofn is somedir/x.

I tried this code in the following environments:

* FreeBSD/i386 7-STABLE, Tcl 8.4.19 -- works as expected
* FreeBSD/i386 7-STABLE, Tcl 8.5.8 -- fails as described
* FreeBSD/i386 7-STABLE, Tcl 8.6-b1 -- fails as described
* FreeBSD/amd64 8.0-RC2, Tcl 8.5.8 -- fails as described
* Windows XP, ActiveTcl 8.5.8 -- works as expected

Alexandre Ferrieux was kind enough to test in another environment [1]:

* Windows/MinGW, Tcl 8.5-HEAD -- works as expected

Note that removing the void [file normalize], replacing glob with it's value, or removing foreach loop also makes it work as expected.

I will try to run the code on other systems to see if it's specific to FreeBSD (maybe FreeBSD port patches are to blame).

[1] http://groups.google.com/group/comp.lang.tcl/browse_thread/thread/859e950831fa3e9b
User Comments: dgp added on 2010-01-06 02:00:08:

allow_comments - 1

Attached patch committed for 8.5.9 and 8.6b2.

dgp added on 2010-01-06 01:51:09:

File Added - 357792: 2918610.patch

dgp added on 2010-01-06 01:13:53:
The pattern for this bug almost perfectly matches
2710920.

dgp added on 2009-12-23 04:55:39:
I also have a testing script I am using
to probe this, which I will convert to a
test before commit (dkf is right!)

dgp added on 2009-12-23 04:54:19:
This appears to be an exposure of an
existing bug, rather than a new one.
Appears the PATHFLAGS != 0 and
portion == TCL_PATH_ROOT branch
has always been wrong wrong wrong
yet somehow that combination never
happened until the changes for 8.5.3.

Alex's patch is one sensible way to fix
the brokenness.  I want to chase down
a slightly different fix before commit.

dgp added on 2009-12-23 03:56:24:
Likely more fallout from the 
1972879 fix.

dgp added on 2009-12-23 03:32:30:
Bug introduced between 8.5.2 and 8.5.3

magv added on 2009-12-23 01:02:50:
Thanks, Alexandre. Your patch does wonders (even if it may not be a permanent solution).

dkf added on 2009-12-22 23:35:12:
It passes the test suite? That's good. The other thing to check is what impact it has on Tcl's startup time with a large auto_path or auto_path of large directories, notably including /usr/lib which is why there's been a problem in the past.

(Trying to remember why we've got all this complex infrastructure here in the first place... :-))

Before commit, the issue identified must be clearly tested for in a self-contained test and you need to check that the test fails before application of the patch and succeeds afterwards. I say this because failing to do that sort of thing has got me in trouble in the past, and VFS path objects are (clearly) very tricky.

ferrieux added on 2009-12-22 17:52:47:
Of course it fixes the local bug and passes the test suite...

ferrieux added on 2009-12-22 17:52:18:
Rather than chasing the whereabouts of the various flags (which have a right to exist anyway), I prefer to fix the broken code.
Attached patch does this by invalidating both the translated and native paths after the duplication.
I am uneasy about the freeing of the native path though. Currently it is freed only if the fsRecPtr *and* associated freeIntRepProc are not-null, but in any case the nativepathPtr is reset to NULL. So there's a potential leak if either of the above fs-related pointers is null. But I didn't find any example of freeing code doing more. Is it correct ?

ferrieux added on 2009-12-22 17:47:29:

File Added - 356110: rootname.patch

ferrieux added on 2009-12-22 17:15:47:
Also, it looks like another facet of what happened in 2710920: a brittle PATHFLAGS!=0 code branch.

ferrieux added on 2009-12-22 17:05:01:
Continuimg at a snail's pace... the offending flag is TCLPATH_APPENDED.
This flags is set by default when creating a path object with TclNewFSPathObj, so the question becomes, Why on Windows (and on earth) is it reset afterwards ?

ferrieux added on 2009-12-22 04:52:03:
Not much progress to report.. except that the dichotomy between the failing and non-failing cases is due to the following test in TclPathPart:

if (TclFSEpochOk(fsPathPtr->filesystemEpoch)
&& (PATHFLAGS(pathPtr) != 0)) {

with the dangerous field-sharing branch exposed when the "if" succeeds.
I have yet to grok why this happens only on unix. TBC.

dkf added on 2009-12-22 02:13:07:
Just noting that there have been persistent nagging bugs in this area, so something is definitely wrong and nobody's fixed it yet. (Also noting that I've never seriously poked my nose into this area, and don't propose to start now.)

ferrieux added on 2009-12-22 00:12:41:
I am an absolute ignorant of these things, but single-stepping shows that in the TclPathPart routine, case TCL_PATH_ROOT, when the result object is created, it is built by half-hearted duplication: the PATHOBJ itself is properly duplicated, but its fields are only lazily separated from the original.

Indeed:  normPathPtr is actively split off, because it holds the prime result; similarly the string rep is proprely separated (by T_DupObj) and updated. However, the other fields of the fsDupPtr are left untouched, and they are shared with the original path (argument of [file rootname]): translatedPathPtr and nativePathPtr.

I assume that these should be simply invalidated, though I don't know how (not much time to dive now).

cc_benny added on 2009-12-21 21:36:37:
I can confirm the behaviour with these:

  Mac OS X 10.4.11, Tcl 8.4.7 -- works
  Mac OS X 10.4.11, Tcl 8.5.7 (MacPorts) -- fails
  Mac OS X 10.4.11, Tcl 8.5.8 (CVS current) -- fails
  Mac OS X 10.4.11, Tcl 8.6b1.1 (CVS current) -- fails
  Linux Ubuntu 9.10 (PPC), Tcl 8.4.19 -- works
  Linux Ubuntu 9.10 (PPC), Tcl 8.5.8 (Source) -- fails
  Linux Ubuntu 9.10 (PPC), Tcl 8.6b1.1 (CVS current) -- fails

Looks to me like the breakage is in the Unix code, introduced
somewhere in the early 8.5 versions.

Attachments: