Ticket UUID: | 1325803 | |||
Title: | stat ino, nlink fields not set | |||
Type: | Bug | Version: | obsolete: 8.5a4 | |
Submitter: | nobody | Created on: | 2005-10-13 13:29:41 | |
Subsystem: | 37. File System | Assigned To: | vincentdarley | |
Priority: | 5 Medium | Severity: | ||
Status: | Closed | Last Modified: | 2005-10-24 01:51:45 | |
Resolution: | Fixed | Closed By: | vincentdarley | |
Closed on: | 2005-10-23 18:51:45 | |||
Description: |
When support for file links was added to Tcl, the 'stat' command was not updated on Windows to add unix-like support for the ino and nlink fields. Here's a relevant email transcript:

Jeff Hobbs to Oscar, me, Sep 28

Hi Oscar,

You may well be right that the 'file stat' stuff was not updated to properly reflect the status of hard links. I am cc'ing Vince Darley, author of that TIP to comment on whether that was an oversight or intentional.

Regards,
Jeff

Oscar Bonilla wrote:
> On Sep 27, 2005, at 6:16 PM, Jeff Hobbs wrote:
>
> > Oscar Bonilla wrote:
> >
> >> I was talking with Larry about hard links on Windows and how we don't
> >> support them even though NTFS supports them, and he brought up that
> >> ActivePython supports them.
> >>
> >> I was looking at the ftp site of activestate, and it seems the newest
> >> source is for version 2.3.1 from Nov 2003. Do you know where I could
> >> get the latest source?
> >>
> >> Or better yet, do you know how they implemented the stat(2) syscall
> >> (in terms of which win32 APIs)? Does Tcl handle this?
> >>
> >
> > Tcl supports hard links as well:
> >
> > http://aspn.activestate.com/ASPN/docs/ActiveTcl/tcl/TclCmd/file.htm#M20
>
> Well, unless I'm reading
> http://cvs.sourceforge.net/viewcvs.py/tcl/tcl/win/tclWinFile.c?rev=1.77&view=auto
> incorrectly, 'file stat' always returns st_ino = 0 and st_nlink = 1
> on Windows.
>
> So even though you can create hard links, you can't tell if a file is
> hard linked to another or if it has more than one link. Right?
>
> Cygwin seems to get away with using nFileIndexHigh | nFileIndexLow
> for the inode and you can get the real hard link count from
> GetFileInformationByHandle() in the nNumberOfLinks field.
>
> > You would get the latest python from SourceForge.
>
> I couldn't find the hard link stuff there... Tcl source seems more
> organized ;-)
>
> Regards,
>
> -Oscar

Vince Darley to Jeff, Oscar, Sep 28

Oscar,

Indeed 'ino' and 'nlink' were not updated when I added support for hard links to Tcl. This was simply an oversight due to my lack of use of these fields (and it would seem most people's lack of use of them, given yours is the first bug report on this!).

It would be good to make these as similar as possible to their Unix interpretations. Can you perhaps provide a patch and/or new tests for the test suite? | |||
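For comparison, the Unix behaviour that the description asks 'file stat' to match can be observed with a small POSIX program. This is a minimal sketch and not Tcl's implementation: `link_count`, `same_file` and `print_stat` are hypothetical helper names, and the code assumes a POSIX system whose filesystem supports hard links.

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: a POSIX.1-2008 environment */
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: hard link count of `path`, or -1 if stat() fails. */
long link_count(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) != 0) return -1;
    return (long)sb.st_nlink;
}

/*
 * Hypothetical helper: 1 if both paths name the same file (same device
 * and same inode), 0 if they do not, -1 if either stat() call fails.
 */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0) return -1;
    return (sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino) ? 1 : 0;
}

/* Hypothetical helper: print the two fields this ticket is about. */
void print_stat(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) != 0) {
        perror(path);
        return;
    }
    printf("%s: ino=%llu nlink=%ld\n", path,
           (unsigned long long)sb.st_ino, (long)sb.st_nlink);
}
```

On such a system, after `link("a", "b")` succeeds for an existing regular file `a`, `link_count("a")` should report one more link than before and `same_file("a", "b")` should return 1. Those are the answers that a constant st_ino = 0 and st_nlink = 1 cannot provide.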
User Comments: |
vincentdarley added on 2005-10-24 01:51:45:
Logged In: YES user_id=32170

Committed fix to cvs head.

obonilla added on 2005-10-15 00:36:11:
Logged In: YES user_id=219610

There should be a warning in the release notes or somewhere that st_ino can have collisions on Windows and should then not be used for determining whether a file is the same as another.

A typical idiom (in C) for determining if two files are links to the same file is:

    if (stat(filea, &sa) || stat(fileb, &sb)) return (error);
    if ((sa.st_ino == sb.st_ino) && (sa.st_dev == sb.st_dev) &&
        (sa.st_nlink >= 2)) return (same);

In Tcl it would be more verbose, but it would follow the same pattern. This is dangerous because on Windows that code could very well say yeah, they're the same when in fact they are not. Obviously, the current code just returns 0 for st_ino, so it would fail that test...

In Unix, the '.' and '..' directories are links to self and parent respectively. So if you create an empty directory, it will have 2 links (self, and the link from the parent directory). If the directory doesn't have any subdirectories, it will have 2 links (same as before). If you create a subdirectory, that subdirectory also has . (self) and .. (to directory), so now, our directory will have 3 links.

    $ mkdir foobar
    $ ls -lad foobar
    drwxr-xr-x 2 ob ob 68 Oct 14 10:34 foobar
    $ mkdir foobar/baz
    $ ls -lad foobar
    drwxr-xr-x 3 ob ob 102 Oct 14 10:34 foobar
    $

A common optimization for determining whether you have to recurse into a directory or not is to use the hard link count to see if the directory has any subdirectories or not... but I have to admit it's kind of a special case... maybe not worth implementing.

vincentdarley added on 2005-10-15 00:08:51:
Logged In: YES user_id=32170

'st_ino' will come from whatever the compile environment sys/stat.h or sys/types.h says, so, as you correctly point out, it's "unsigned short". Nothing Tcl can do about that, I believe.

I don't know how 'nlink' is defined on unix, but no 'man stat' that I've found talks about nlink for directories as the number of files within it, so I'd be interested in a clear definition (does it include subdirs, symlinks?).

obonilla added on 2005-10-14 23:43:21:
Logged In: YES user_id=219610

Make sure st_ino is at least defined as an unsigned. I don't know where you're getting your definition of struct stat from, but the one from MS's headers and the one in MSYS default st_ino to short, and both nFileIndexHigh and nFileIndexLow are DWORDs. You can imagine all the collisions you can get by casting a quad word to a short ;-)

The other thing you guys might want to consider (although it might be too much work and you might just decide to punt on it) is that you can actually fake the st_nlink for directories and have it behave just like Unix. I wrote a little proc that goes something like:

    int
    dirEntries(const char *dir)
    {
        WIN32_FIND_DATA f;
        HANDLE h;
        char buf[MAXPATH];
        int len, count = 0;

        strcpy(buf, dir);
        len = strlen(buf);
        /* Append a wildcard so the whole directory is enumerated. */
        if (len == 0 || buf[len - 1] == '\\' || buf[len - 1] == '/') {
            strcat(buf, "*");
        } else {
            strcat(buf, "/*");
        }
        if ((h = FindFirstFileA(buf, &f)) == INVALID_HANDLE_VALUE) return (0);
        count++;    /* the first entry returned */
        while (FindNextFileA(h, &f)) {
            /* Every further directory entry adds one link. */
            if (f.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) count++;
        }
        FindClose(h);
        return (count);
    }

You might want to tweak it to conform to Tcl-style and not use the *A API which is just for ASCII and doesn't handle Unicode.

As I said, I don't know how useful this would be... it allows some optimizations when walking directory structures, but I don't know if anyone else uses those...

vincentdarley added on 2005-10-14 05:09:51:
File Deleted - 152404:
File Added - 152452: nlink.diff

vincentdarley added on 2005-10-14 05:09:50:
Logged In: YES user_id=32170

And here's a better patch with tests.
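The directory link-count emulation described in the comments above can also be sketched portably. The version below is a POSIX analogue of the idea, not the Win32 code from the comment and not what was committed to Tcl: it counts the entries of a directory that are themselves directories, including "." and "..", so an empty directory yields 2 and each subdirectory adds one. `emulated_dir_nlink` is a hypothetical name, and the sketch assumes that readdir() reports "." and ".." on the target system.

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: a POSIX.1-2008 environment */
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: emulate a Unix-style st_nlink for a directory by
 * counting the entries that are directories ("." and ".." included).
 * Returns -1 if the directory cannot be opened.
 */
int emulated_dir_nlink(const char *dir)
{
    DIR *d;
    struct dirent *e;
    struct stat sb;
    char path[PATH_MAX];
    int count = 0;

    if ((d = opendir(dir)) == NULL) return -1;
    while ((e = readdir(d)) != NULL) {
        int n = snprintf(path, sizeof(path), "%s/%s", dir, e->d_name);

        /* Skip names that do not fit in the buffer. */
        if (n < 0 || (size_t)n >= sizeof(path)) continue;
        /* lstat() so that a symlink to a directory is not counted. */
        if (lstat(path, &sb) == 0 && S_ISDIR(sb.st_mode)) count++;
    }
    closedir(d);
    return count;
}
```

A directory walker could then treat a result of 2 as "no subdirectories, no need to descend", which is the optimization mentioned in the thread. Regular files and symbolic links do not change the count.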
vincentdarley added on 2005-10-14 00:00:56:
File Added - 152404: nlink.diff

vincentdarley added on 2005-10-14 00:00:55:
Logged In: YES user_id=32170

Added first attempt at a patch.

Vince. |
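The truncation concern raised in the comments above can be illustrated with plain integer arithmetic. This is a sketch under assumptions: the two 32-bit halves are joined by shift-and-OR (the thread only says that the halves are combined), the narrow st_ino field is modelled as a 16-bit unsigned integer, and both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: join the high and low 32-bit halves of a file
 * index into a single 64-bit identifier.
 */
uint64_t file_index_64(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | (uint64_t)low;
}

/*
 * Hypothetical helper: the value that survives when the 64-bit
 * identifier is stored in a 16-bit field. Only the low 16 bits remain.
 */
uint16_t truncate_to_short(uint64_t id)
{
    return (uint16_t)id;
}
```

Two different identifiers such as `file_index_64(0x00010000, 0x00001234)` and `file_index_64(0x00020000, 0xABCD1234)` share their low 16 bits, so both truncate to 0x1234. An equality test on the truncated value alone would wrongly report them as the same file, which is why the thread asks for a warning about using st_ino for identity checks on Windows.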
Attachments:
- nlink.diff added by vincentdarley on 2005-10-14 05:09:51.