Tcl Source Code

Ticket UUID: 1325803
Title: stat ino, nlink fields not set
Type: Bug
Version: obsolete: 8.5a4
Submitter: nobody
Created on: 2005-10-13 13:29:41
Subsystem: 37. File System
Assigned To: vincentdarley
Priority: 5 Medium
Severity:
Status: Closed
Last Modified: 2005-10-24 01:51:45
Resolution: Fixed
Closed By: vincentdarley
Closed on: 2005-10-23 18:51:45
Description:
When support for file links was added to Tcl, the
'stat' command was not updated on Windows to add
unix-like support for the ino and nlink fields.

Here's a relevant email transcript:

 Jeff Hobbs 
to Oscar, me
 Sep 28
Hi Oscar,

You may well be right that the 'file stat' stuff was not
updated to properly reflect the status of hard links.  I
am cc'ing Vince Darley, author of that TIP to comment on
whether that was an oversight or intentional.

Regards,

Jeff

Oscar Bonilla wrote:
> On Sep 27, 2005, at 6:16 PM, Jeff Hobbs wrote:
>
> > Oscar Bonilla wrote:
> >
> >> I was talking with Larry about hard links on
> >> Windows and how we don't support them even though NTFS supports
> >> them, and he brought up that ActivePython supports them.
> >>
> >> I was looking at the ftp site of activestate, and it seems the
> >> newest source is for version 2.3.1 from Nov 2003. Do you know
> >> where I could get the latest source?
> >>
> >> Or better yet, do you know how they implemented the stat(2)
> >> syscall (in terms of which win32 APIs)? Does Tcl handle this?
> >>
> >
> > Tcl supports hard links as well:
> >
> >
> > http://aspn.activestate.com/ASPN/docs/ActiveTcl/tcl/TclCmd/file.htm#M20
>
> Well, unless I'm reading
>
> http://cvs.sourceforge.net/viewcvs.py/tcl/tcl/win/tclWinFile.c?rev=1.77&view=auto
> incorrectly, 'file stat' always returns st_ino = 0 and st_nlink = 1
> on Windows.
>
> So even though you can create hard links, you can't tell if a file is
> hard linked to another or if it has more than one link. Right?
>
> Cygwin seems to get away with using nFileIndexHigh | nFileIndexLow
> for the inode and you can get the real hard link count from
> GetFileInformationByHandle() in the nNumberOfLinks field.
>
> > You would get the latest python from SourceForge.
>
> I couldn't find the hard link stuff there... Tcl
> source seems more organized ;-)
>
> Regards,
>
> -Oscar




Vince Darley 
to Jeff, Oscar
 Sep 28
Oscar,

Indeed 'ino' and 'nlink' were not updated when I added support for
hard links to Tcl. This was simply an oversight due to my lack of use
of these fields (and it would seem most people's lack of use of them,
given yours is the first bug report on this!). It would be good to
make these as similar as possible to their Unix interpretations.

Can you perhaps provide a patch and/or new tests for the test suite?

User Comments:

vincentdarley added on 2005-10-24 01:51:45:
Logged In: YES 
user_id=32170

Committed fix to cvs head.

obonilla added on 2005-10-15 00:36:11:
Logged In: YES 
user_id=219610

There should be a warning in the release notes or somewhere that st_ino 
can have collisions on Windows and therefore should not be used for 
determining whether a file is the same as another. A typical idiom (in C) 
for determining if two files are links to the same file is:

  if (stat(filea, &sa) || stat(fileb, &sb)) return (error);
  if ((sa.st_ino == sb.st_ino) && (sa.st_dev == sb.st_dev) &&
      (sa.st_nlink >= 2)) return (same);

In Tcl it would be more verbose, but it would follow the same pattern. 
This is dangerous because on Windows that code could very well say 
"yeah, they're the same" when in fact they are not. Obviously, the current 
code just returns 0 for st_ino, so it would fail that test...
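
For reference, the idiom above can be written as a small POSIX helper. This is only a sketch: sameFile is a hypothetical name, not a Tcl or libc function, and it assumes a platform where the (st_dev, st_ino) pair identifies a file uniquely. That is exactly the assumption that breaks when st_ino is truncated on Windows. It leaves out the st_nlink >= 2 check so that a path compared with itself also counts as the same file.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper (not part of Tcl or libc).
 * Returns 1 if both paths name the same file, 0 if they do not,
 * and -1 if either stat() call fails.
 * Assumption: (st_dev, st_ino) is unique per file, as on POSIX systems.
 */
int
sameFile(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0) {
        return -1;
    }
    return (sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino) ? 1 : 0;
}
```

With a 16-bit st_ino, two unrelated files on the same volume can compare equal here, which is the false positive described above.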

In Unix, the '.' and '..' entries are links to the directory itself and to its 
parent, respectively. So if you create an empty directory, it will have 2 links 
(self, and the link from the parent directory). Regular files do not change 
that count, so a directory without subdirectories always has 2 links. If you 
create a subdirectory, that subdirectory also has . (self) and .. (to our 
directory), so now our directory will have 3 links. 

$ mkdir foobar
$ ls -lad foobar
drwxr-xr-x   2 ob  ob  68 Oct 14 10:34 foobar
$ mkdir foobar/baz
$ ls -lad foobar
drwxr-xr-x   3 ob  ob  102 Oct 14 10:34 foobar
$ 

A common optimization for determining whether you have to recurse 
into a directory or not is to use the hard link count to see if the directory 
has any subdirectories or not... but I have to admit it's kind of a special 
case... maybe not worth implementing.
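
A rough sketch of that optimization, for illustration only. The names subdirsFromNlink and mustDescend are hypothetical, and the code assumes the traditional Unix convention shown above (link count = 2 + number of subdirectories). Some file systems report a link count of 1 for every directory, so the sketch treats anything below 2 as "unknown" and descends anyway.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: number of subdirectories implied by a directory's
 * link count, or -1 when the count cannot be trusted.
 * Assumption: nlink == 2 + number of subdirectories.
 */
long
subdirsFromNlink(unsigned long nlink)
{
    return (nlink >= 2) ? (long)(nlink - 2) : -1;
}

/*
 * Hypothetical helper: returns 1 if a directory walker needs to look
 * for subdirectories inside dir, 0 if it can skip that step (dir is
 * missing, is not a directory, or has a link count of exactly 2).
 */
int
mustDescend(const char *dir)
{
    struct stat sb;

    if (stat(dir, &sb) != 0 || !S_ISDIR(sb.st_mode)) {
        return 0;
    }
    /* -1 (unknown) is non-zero too, so we err on the side of descending. */
    return subdirsFromNlink((unsigned long)sb.st_nlink) != 0;
}
```

For the foobar example above, a link count of 2 gives 0 subdirectories and a link count of 3 gives 1.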

vincentdarley added on 2005-10-15 00:08:51:
Logged In: YES 
user_id=32170

'st_ino' will come from whatever the compile environment's
sys/stat.h or sys/types.h says, so, as you correctly point
out, it's "unsigned short". Nothing Tcl can do about that, I
believe.

I don't know how 'nlink' is defined on unix, but no 'man
stat' that I've found talks about nlink for directories as
the number of files within it, so I'd be interested in a
clear definition (does it include subdirs, symlinks?).

obonilla added on 2005-10-14 23:43:21:
Logged In: YES 
user_id=219610

Make sure st_ino is at least defined as an unsigned. I don't know where 
you're getting your definition of struct stat from, but the one from MS's 
headers and the one in MSYS default st_ino to short, and both 
nFileIndexHigh and nFileIndexLow are DWORDs. You can imagine all 
the collisions you can get by casting a quad word to a short ;-)
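
To make the collision concrete, here is a small portable sketch. It assumes the two DWORD halves are concatenated into one 64-bit index (high half in the upper 32 bits) and that st_ino is 16 bits wide. fileIndex64 and truncatedIno are hypothetical names used only for illustration, not Win32 or Tcl functions.

```c
#include <stdint.h>

/*
 * Hypothetical helper: combine the two 32-bit halves of a file index
 * (as in nFileIndexHigh / nFileIndexLow) into one 64-bit value.
 */
uint64_t
fileIndex64(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}

/*
 * Hypothetical helper: what is left of the index once it is stored in
 * a 16-bit st_ino. Only the low 16 bits survive the conversion.
 */
uint16_t
truncatedIno(uint64_t index)
{
    return (uint16_t)index;
}
```

Any two indexes that agree in their low 16 bits end up with the same truncated value, for example (high = 1, low = 0) and (high = 2, low = 0).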

The other thing you guys might want to consider (although it might be 
too much work and you might just decide to punt on it) is that you can 
actually fake the st_nlink for directories and have it behave just like 
Unix. I wrote a little proc that goes something like:

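#include <windows.h>
#include <string.h>

/*
 * Fake a Unix-style st_nlink for a directory: one for the first entry
 * found (normally "."), plus one for ".." and for each subdirectory.
 * Returns 0 on failure.
 */
int
dirEntries(const char *dir)
{
    WIN32_FIND_DATAA   f;
    HANDLE             h;
    char               buf[MAX_PATH];
    int                len, count = 0;

    /* Leave room for the wildcard suffix and the terminating NUL. */
    len = (int)strlen(dir);
    if (len + 3 > MAX_PATH) return (0);
    strcpy(buf, dir);
    if (len == 0 || buf[len - 1] == '\\' || buf[len - 1] == '/') {
        strcat(buf, "*");
    } else {
        strcat(buf, "/*");
    }

    if ((h = FindFirstFileA(buf, &f)) == INVALID_HANDLE_VALUE) return (0);
    count++;
    while (FindNextFileA(h, &f)) {
        /* Only directory entries add to the link count. */
        if (f.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) count++;
    }
    FindClose(h);
    return (count);
}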

You might want to tweak it to conform to Tcl-style and not use the *A 
API, which only handles the ANSI code page and not Unicode. As I said, I 
don't know how useful this would be... it allows some optimizations 
when walking directory structures, but I don't know if anyone else uses 
those...

vincentdarley added on 2005-10-14 05:09:51:

File Deleted - 152404: 



File Added - 152452: nlink.diff

vincentdarley added on 2005-10-14 05:09:50:
Logged In: YES 
user_id=32170

And here's a better patch with tests.

vincentdarley added on 2005-10-14 00:00:56:

File Added - 152404: nlink.diff

vincentdarley added on 2005-10-14 00:00:55:
Logged In: YES 
user_id=32170

Added first attempt at a patch.

Vince.
