Ticket UUID: | 800106 | |||
Title: | Incorrect treatment of mounted file by glob. | |||
Type: | Bug | Version: | obsolete: 8.4.5 | |
Submitter: | andreas_kupries | Created on: | 2003-09-03 23:06:22 | |
Subsystem: | 37. File System | Assigned To: | andreas_kupries | |
Priority: | 8 | Severity: | ||
Status: | Closed | Last Modified: | 2003-10-23 05:59:05 | |
Resolution: | Fixed | Closed By: | andreas_kupries | |
Closed on: | 2003-10-22 22:59:05 | |||
Description: |
Session: [andreask@pliers trans]$ ~/TDK/bin/tclsh % file type hackomatic.kit file % package require vfs::mk4 1.9 % vfs::mk4::Mount hackomatic.kit hackomatic.kit mk4vfs1 % file type hackomatic.kit directory % glob -types d * 2print __attic__ as docbook splint tdk-specials teapot snit papers-ii papers tcl_2003 tcllib AnnArbor bp save- check % info patchlevel 8.4.4 % What is happening is simple. A file is mounted over itself, therefore a directory. [file type] returns the correct information. [glob] however does _not_ list the file as directory, as it should. At this point it is unclear if this is a problem in the tcl core, or in tclvfs. | |||
User Comments: |
andreas_kupries added on 2003-10-23 05:59:05:
Logged In: YES user_id=75003 The patch has approved by me and Jeff and is now committed to the 8.4 head. vincentdarley added on 2003-10-14 00:16:08: File Added - 64216: core84.patch Logged In: YES user_id=32170 If we do wish to apply this to Tcl 8.4.5, here is a patch. I'll leave that decision up to others... vincentdarley added on 2003-10-13 23:52:47: Logged In: YES user_id=32170 I realised there's a much better way of fixing this. Entirely backwards compatible at source and binary level. I've checked this into Tcl 8.5a0. It simply adds a new flag value to the Tcl_GlobTypeData.type field. Filesystems need to be updated to be aware of this flag if they are to return results which are correct at filesystem boundaries. An interesting question is whether this should also go in Tcl 8.4.5. vincentdarley added on 2003-10-10 18:19:46: Logged In: YES user_id=32170 I have a fix for this problem which I need to clean up and then work out how best to apply to Tcl. It does require (as noted below) a change to the Tcl_Filesystem structure, but in a backwards compatible way through use of the TCL_FILESYSTEM_VERSION field (which will hereafter have value 0x2). I don't know whether such a change requires a TIP or not. It is a change to a public interface of Tcl, but is only there to fix a bug, and does not cause any backwards compatibility problems with extensions like tclvfs. Any comments? vincentdarley added on 2003-09-09 01:01:16: Logged In: YES user_id=32170 A few notes... The current core-vfs layer goes with the idea that one is actually plugging in a filesystem protocol implementation. One or more actual operational filesystems can then be made available through that plug-in. If we wish to switch to a system where Tcl keeps track of mount points, we'll need to think about moving code like Vfs_AddMount, Vfs_RemoveMount from tclvfs/vfs.c to Tcl's core. One semi-problem with such an approach is that native filesystems tend to assume that they alone are responsible for mounting/unmounting new volumes. This means the native filesystem would have to inform Tcl's core when volumes appear/disappear. Avoiding this issue is one reason why the current approach of Tcl's core always asking the filesystems for their volumes is perhaps cleaner. I don't know if this argument generalizes from 'volumes' to 'all mount points'. If so, it suggests we should go with my original suggestion of extending the Tcl_FSListVolumes idea. vincentdarley added on 2003-09-05 14:34:41: Logged In: YES user_id=32170 I don't see why this process is quadratic. To find the filesystem for the directory is linear (or constant time if it has been cached). We then need to ask every other filesystem for the list of mountpoints in that directory, this is again linear. In any case the number of registered filesystems is likely to be 2 or perhaps 3, so I don't think we need be too concerned. Of course, given we have to modify the filesystem interface, a different way to modify it would be to introduce core structures for mountpoints. This would be a larger change, but not that dramatic, really. I have yet to think through the negative consequences of such a change... andreas_kupries added on 2003-09-05 02:46:18: Logged In: YES user_id=75003 This interface means IIUC that each element has to ask each filesystem to find where it is, and we also know the fs of the dir we are in, this latter can be done once. If they don't match we have a mountpoint. This process however can be quadratic. vincentdarley added on 2003-09-04 23:30:49: Logged In: YES user_id=32170 There are no core structures recording mountpoints... There is only a proc for each filesystem called 'TclPathInFilesystemProc' which either says 'yes' or 'no' when it is called with a path. I forget why this particular design decision was made. andreas_kupries added on 2003-09-04 23:06:15: Logged In: YES user_id=75003 Regarding: glob -types d hackomatic.kit % package require vfs::mk4 1.9 % vfs::mk4::Mount hackomatic.kit hackomatic.kit -readonly mk4vfs1 % glob -types d hackomatic.kit @@handler: mk4vfs1 - matchindirectory - /nfs/crimper/home/andreask/trans/hackomatic.kit - - hackomatic.kit - {} 4 @@handler: mk4vfs1 - stat - /nfs/crimper/home/andreask/trans/hackomatic.kit - - hackomatic.kit - hackomatic.kit % The @@ lines are on stderr, tracing. So yes, this does access the vfs, and it does report the correct thing. As you said below, the solution I envisioned was to determine if there are mountpoints in a directory currently searched. These have to be reported as directories, and excluded from reporting as files, and have to be used for recursive matching, i.e. patterns like foo/*/*. Secondly, as you rightly note, a file in some directory which is the origin of a mount has to be surpressed from its directory too, so we have to ask during matching if something is the origin of a mount. both queries should be possible, and available on the level of the API to the whole FS code. The filesystems themselves require no changes. This query operates purely on the core structures (hashtables I presume) recording mountpoints and their origins. vincentdarley added on 2003-09-04 21:05:43: Logged In: YES user_id=32170 Here's another simple example of the same basic issue. Start up tclkit.exe: cd [file dirname [info nameof]] glob */* will not return anything inside tclkit.exe. vincentdarley added on 2003-09-04 16:55:58: Logged In: YES user_id=32170 Quick question: in your example above, what does: glob -types d hackomatic.kit return? Does this access the vfs? vincentdarley added on 2003-09-04 16:54:26: Logged In: YES user_id=32170 Another complication -- what if we mount something not over itself, but simply into a previously non-existent file, then 'glob' won't even list it at all. Given this problem, the best (only?) solution I can think of is that the *generic* function Tcl_FSMatchInDirectory in tclIOUtil.c should check whether any vfs's are registered *inside* the directory being listed, and if this is true, modify the result appropriately. This would also have an advantage of the performance impact being small, provided we can do that test relatively easily. It also has the benefit of being a reasonably elegant solution. Unfortunately, there is no current API by which a filesystem can be asked "Do you have any mounts in directory 'foo'?", although it is in some sense a generalisation of the Tcl_FSListVolumes function. This means to solve the problem, we will need to do something like increment TCL_FILESYSTEM_VERSION_1, and overload Tcl_FSListVolumes as a new function (Tcl_FSListMounts) which takes as a parameter a Tcl_Obj containing a directory. If that directory is null, the function behaves like the old Tcl_FSListVolumes, and otherwise it asks the filesystem for a list of all matching mounts in the given directory. So, this fix is only really possible in Tcl 8.5 and will require a TIP. While we're at it, therefore, we might wish to consider other filesystem enhancements. At this point I should say I have other priorities which mean I do not expect to see myself as the primary author of any such tip... Vince. vincentdarley added on 2003-09-04 14:38:35: Logged In: YES user_id=32170 Yup, this is a core bug, but one that is nasty to fix without a significant performance hit. The issue is that the TclpMatchInDirectory native fs functions (in mac/unix/win) call direct to the native fs to determine the type of each file returned from glob (and do so in a relatively efficient way, since they already have the file's native representation, etc). What is needed is an examination of this code and how it can be made to be vfs aware. In particular, we should be aware there *will* be a performance penalty when -types in used. Note: this actually raises the possibility that there might be another bug in the existing code, if the results generated by glob are appropriately cached, since it might be that TclpMatchInDirectory generates Tcl_Objs which have cached the native filesystem and native representation of the file (hence claiming the objects for the native fs, which could be wrong). I can't remember if I got around to making TclpMatchInDirectory that clever or not... andreas_kupries added on 2003-09-04 06:16:28: Logged In: YES user_id=75003 After instrumenting the metakit fs in tclvfs a bit I get [andreask@pliers trans]$ ~/TDK/bin/tclsh % info patchlevel 8.4.4 % package require vfs::mk4 1.9 % vfs::mk4::Mount hackomatic.kit hackomatic.kit mk4vfs1 % file type hackomatic.kit %%mk%%|sta|mk4vfs1| sb(atime) = 0 sb(csize) = 0 sb(ctime) = 0 sb(gid) = 0 sb(ino) = mk4vfs1.dirs!0 sb(mode) = 0777 sb(mtime) = 0 sb(nlink) = 2 sb(size) = 0 sb(type) = directory sb(uid) = 0 sb(view) = mk4vfs1.dirs directory % glob -types d * 2print __attic__ as docbook splint tdk-specials teapot snit papers-ii papers tcl_2003 tcllib AnnArbor bp save-check % The %%... and sb(...) are output from the instrumentation. file type goes through the [stat] implementation of mkfs. glob otoh does not call into any of the mkfs functions. Because of this I conclude that this is a core issue where the glob command incorrectly determines the type of a path ... This means for me that I have to actually loop through all matches to find directories instead of relying on glob. |
Attachments:
- core84.patch [download] added by vincentdarley on 2003-10-14 00:16:08. [details]