Tcl Source Code

View Ticket
Login
Ticket UUID: 800106
Title: Incorrect treatment of mounted file by glob.
Type: Bug Version: obsolete: 8.4.5
Submitter: andreas_kupries Created on: 2003-09-03 23:06:22
Subsystem: 37. File System Assigned To: andreas_kupries
Priority: 8 Severity:
Status: Closed Last Modified: 2003-10-23 05:59:05
Resolution: Fixed Closed By: andreas_kupries
    Closed on: 2003-10-22 22:59:05
Description:
Session:

[andreask@pliers trans]$ ~/TDK/bin/tclsh
% file type hackomatic.kit
file
% package require vfs::mk4
1.9
% vfs::mk4::Mount hackomatic.kit hackomatic.kit
mk4vfs1
% file type hackomatic.kit
directory
% glob -types d *
2print __attic__ as docbook splint tdk-specials teapot 
snit papers-ii papers tcl_2003 tcllib AnnArbor bp save-
check
% info patchlevel
8.4.4
%

What is happening is simple. A file is mounted over itself, 
therefore a directory. [file type] returns the correct 
information. [glob] however does _not_ list the file as 
directory, as it should. At this point it is unclear if this is 
a problem in the tcl core, or in tclvfs.
User Comments: andreas_kupries added on 2003-10-23 05:59:05:
Logged In: YES 
user_id=75003

The patch has approved by me and Jeff and is now committed 
to the 8.4 head.

vincentdarley added on 2003-10-14 00:16:08:

File Added - 64216: core84.patch

Logged In: YES 
user_id=32170

If we do wish to apply this to Tcl 8.4.5, here is a patch. 
I'll leave that decision up to others...

vincentdarley added on 2003-10-13 23:52:47:
Logged In: YES 
user_id=32170

I realised there's a much better way of fixing this. 
Entirely backwards compatible at source and binary level. 
I've checked this into Tcl 8.5a0.  It simply adds a new flag
value to the Tcl_GlobTypeData.type field.

Filesystems need to be updated to be aware of this flag if
they are to return results which are correct at filesystem
boundaries.

An interesting question is whether this should also go in
Tcl 8.4.5.

vincentdarley added on 2003-10-10 18:19:46:
Logged In: YES 
user_id=32170

I have a fix for this problem which I need to clean up and
then work out how best to apply to Tcl. It does require (as
noted below) a change to the Tcl_Filesystem structure, but
in a backwards compatible way through use of the
TCL_FILESYSTEM_VERSION field (which will hereafter have
value 0x2).

I don't know whether such a change requires a TIP or not. 
It is a change to a public interface of Tcl, but is only
there to fix a bug, and does not cause any backwards
compatibility problems with extensions like tclvfs.

Any comments?

vincentdarley added on 2003-09-09 01:01:16:
Logged In: YES 
user_id=32170

A few notes...

The current core-vfs layer goes with the idea that one is
actually plugging in a filesystem protocol implementation.
One or more actual operational filesystems can then be made
available through that plug-in. 

If we wish to switch to a system where Tcl keeps track of
mount points, we'll need to think about moving code like
Vfs_AddMount, Vfs_RemoveMount from tclvfs/vfs.c to Tcl's core.

One semi-problem with such an approach is that native
filesystems tend to assume that they alone are responsible
for mounting/unmounting new volumes. This means the native
filesystem would have to inform Tcl's core when volumes
appear/disappear. Avoiding this issue is one reason why the
current approach of Tcl's core always asking the filesystems
for their volumes is perhaps cleaner. I don't know if this
argument generalizes from 'volumes' to 'all mount points'. 
If so, it suggests we should go with my original suggestion
of extending the Tcl_FSListVolumes idea.

vincentdarley added on 2003-09-05 14:34:41:
Logged In: YES 
user_id=32170

I don't see why this process is quadratic. To find the
filesystem for the directory is linear (or constant time if
it has been cached). We then need to ask every other
filesystem for the list of mountpoints in that directory,
this is again linear.  In any case the number of registered
filesystems is likely to be 2 or perhaps 3, so I don't think
we need be too concerned.

Of course, given we have to modify the filesystem interface,
a different way to modify it would be to introduce core
structures for mountpoints. This would be a larger change,
but not that dramatic, really. I have yet to think through
the negative consequences of such a change...

andreas_kupries added on 2003-09-05 02:46:18:
Logged In: YES 
user_id=75003

This interface means IIUC that each element has to ask each 
filesystem to find where it is, and we also know the fs of the 
dir we are in, this latter can be done once. If they don't 
match we have a mountpoint. This process however can be 
quadratic.

vincentdarley added on 2003-09-04 23:30:49:
Logged In: YES 
user_id=32170

There are no core structures recording mountpoints...  There
is only a proc for each filesystem called
'TclPathInFilesystemProc' which either says 'yes' or 'no'
when it is called with a path.

I forget why this particular design decision was made.

andreas_kupries added on 2003-09-04 23:06:15:
Logged In: YES 
user_id=75003

Regarding: glob -types d hackomatic.kit

% package require vfs::mk4
1.9
% vfs::mk4::Mount hackomatic.kit hackomatic.kit -readonly
mk4vfs1
% glob -types d hackomatic.kit
@@handler: mk4vfs1 - matchindirectory -
 /nfs/crimper/home/andreask/trans/hackomatic.kit -  - 
hackomatic.kit - {} 4
@@handler: mk4vfs1 - stat -
 /nfs/crimper/home/andreask/trans/hackomatic.kit -  - 
hackomatic.kit -
hackomatic.kit
%

The @@ lines are on stderr, tracing. So yes, this does access 
the vfs, and it does report the correct thing.

As you said below, the solution I envisioned was to determine 
if there are mountpoints in a directory currently searched. 
These have to be reported as directories, and excluded from 
reporting as files, and have to be used for recursive matching, 
i.e. patterns like foo/*/*. Secondly, as you rightly note, a file 
in some directory which is the origin of a mount has to be 
surpressed from its directory too, so we have to ask during 
matching if something is the origin of a mount.

both queries should be possible, and available on the level of 
the API to the whole FS code. The filesystems themselves 
require no changes. This query operates purely on the core 
structures (hashtables I presume) recording mountpoints and 
their origins.

vincentdarley added on 2003-09-04 21:05:43:
Logged In: YES 
user_id=32170

Here's another simple example of the same basic issue. 
Start up tclkit.exe:

cd [file dirname [info nameof]]
glob */*

will not return anything inside tclkit.exe.

vincentdarley added on 2003-09-04 16:55:58:
Logged In: YES 
user_id=32170

Quick question: in your example above, what does:

glob -types d hackomatic.kit

return?  Does this access the vfs?

vincentdarley added on 2003-09-04 16:54:26:
Logged In: YES 
user_id=32170

Another complication -- what if we mount something not over
itself, but simply into a previously non-existent file, then
'glob' won't even list it at all.

Given this problem, the best (only?) solution I can think of
is that the *generic* function Tcl_FSMatchInDirectory in
tclIOUtil.c should check whether any vfs's are registered
*inside* the directory being listed, and if this is true,
modify the result appropriately. This would also have an
advantage of the performance impact being small, provided we
can do that test relatively easily.  It also has the benefit
of being a reasonably elegant solution.

Unfortunately, there is no current API by which a filesystem
can be asked "Do you have any mounts in directory 'foo'?",
although it is in some sense a generalisation of the
Tcl_FSListVolumes function. This means to solve the problem,
we will need to do something like increment
TCL_FILESYSTEM_VERSION_1, and overload Tcl_FSListVolumes as
a new function (Tcl_FSListMounts) which takes as a parameter
a Tcl_Obj containing a directory. If that directory is null,
the function behaves like the old Tcl_FSListVolumes, and
otherwise it asks the filesystem for a list of all matching
mounts in the given directory.

So, this fix is only really possible in Tcl 8.5 and will
require a TIP. While we're at it, therefore, we might wish
to consider other filesystem enhancements.

At this point I should say I have other priorities which
mean I do not expect to see myself as the primary author of
any such tip...

Vince.

vincentdarley added on 2003-09-04 14:38:35:
Logged In: YES 
user_id=32170

Yup, this is a core bug, but one that is nasty to fix
without a significant performance hit. The issue is that the
TclpMatchInDirectory native fs functions (in mac/unix/win)
call direct to the native fs to determine the type of each
file returned from glob (and do so in a relatively efficient
way, since they already have the file's native
representation, etc).

What is needed is an examination of this code and how it can
be made to be vfs aware. In particular, we should be aware
there *will* be a performance penalty when -types in used.

Note: this actually raises the possibility that there might
be another bug in the existing code, if the results
generated by glob are appropriately cached, since it might
be that TclpMatchInDirectory generates Tcl_Objs which have
cached the native filesystem and native representation of
the file (hence claiming the objects for the native fs,
which could be wrong). I can't remember if I got around to
making TclpMatchInDirectory that clever or not...

andreas_kupries added on 2003-09-04 06:16:28:
Logged In: YES 
user_id=75003

After instrumenting the metakit fs in tclvfs a bit I get

[andreask@pliers trans]$ ~/TDK/bin/tclsh
% info patchlevel
8.4.4
% package require vfs::mk4
1.9
% vfs::mk4::Mount hackomatic.kit hackomatic.kit
mk4vfs1
% file type hackomatic.kit
%%mk%%|sta|mk4vfs1|
sb(atime) = 0
sb(csize) = 0
sb(ctime) = 0
sb(gid)   = 0
sb(ino)   = mk4vfs1.dirs!0
sb(mode)  = 0777
sb(mtime) = 0
sb(nlink) = 2
sb(size)  = 0
sb(type)  = directory
sb(uid)   = 0
sb(view)  = mk4vfs1.dirs
directory
% glob -types d *
2print __attic__ as docbook splint tdk-specials teapot snit 
papers-ii papers tcl_2003 tcllib AnnArbor bp save-check
%


The %%... and sb(...) are output from the instrumentation. 
file type goes through the [stat] implementation of mkfs. glob 
otoh does not call into any of the mkfs functions. Because of 
this I conclude that this is a core issue where the glob 
command incorrectly determines the type of a path ...

This means for me that I have to actually loop through all 
matches to find directories instead of relying on glob.

Attachments: