Tcl Source Code

Ticket UUID: 1479814
Title: Tcl should support Unicode versions of Win32 file APIs
Type: Bug
Version: obsolete: 8.4.13
Submitter: offline
Created on: 2006-05-01 15:44:05
Subsystem: 36. Pathname Management
Assigned To: vincentdarley
Priority: 7 High
Severity:
Status: Closed
Last Modified: 2007-02-21 03:38:04
Resolution: Fixed
Closed By: patthoyts
Closed on: 2007-02-20 20:38:04
Description:
The unicode filename format as described in
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/naming_a_file.asp
is not properly handled in Tcl.

The file normalize routines should allow the following
to work:

file exists {\\?\D:\existingfile.txt}

which is equivalent to

file exists {D:\existingfile.txt}

The only difference is that the former form uses the
Unicode file APIs in Windows, allowing the creation of
long pathnames (longer than 260 characters) which is
extremely useful when dealing with certain Java tools'
output.
User Comments:

patthoyts added on 2007-02-21 03:38:04:
Logged In: YES 
user_id=202636
Originator: NO

At least the additional hackery has a fairly small footprint.
Patch applied to 8.5. I don't see a reason to apply this to 8.4.

dgp added on 2007-02-21 00:35:46:
Logged In: YES 
user_id=80530
Originator: NO


I suppose there's a case to be
made that a bad solution is better
than no solution at all, but it
still seems seriously wrong to me
that we multiply the hacks in
the core generic code instead of
addressing this via the Tcl_Filesystem
interfaces.

That said, I obviously don't
care enough to implement a
better answer (at least not
today) so HOSOGOTP rules.

still,.... Ick!

hobbs added on 2007-02-20 07:07:35:
Logged In: YES 
user_id=72656
Originator: NO

I give a +1 to adding it.

patthoyts added on 2007-01-09 05:18:06:

File Added - 210378: 1479814-extpath.patch

Logged In: YES 
user_id=202636
Originator: NO

I'm attaching a patch that supports the extended path prefix in file normalize and in the creation of the native form of the path. I have successfully handled local files using the \\?\ syntax. MSDN says that UNC paths can be extended the same way, using \\?\UNC\ as the prefix, but I've not yet found a Microsoft application that supports this.
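
For readers following along, the two prefix forms mentioned above can be illustrated with a small sketch. This is not the code in 1479814-extpath.patch (which is C inside the Tcl core); it is a hypothetical Python helper, and the name to_extended_path is an assumption made for illustration only.

```python
# Illustrative sketch only: maps an absolute Windows path to its
# extended-length form. The function name is hypothetical.

EXT_PREFIX = "\\\\?\\"            # the literal text \\?\
EXT_UNC_PREFIX = "\\\\?\\UNC\\"   # the literal text \\?\UNC\
DEVICE_PREFIX = "\\\\.\\"         # the literal text \\.\


def to_extended_path(path: str) -> str:
    """Return the extended-length spelling of an absolute Windows path."""
    path = path.replace("/", "\\")
    if path.startswith(EXT_PREFIX) or path.startswith(DEVICE_PREFIX):
        # Already extended, or a device path: leave it alone.
        return path
    if path.startswith("\\\\"):
        # \\server\share\file -> \\?\UNC\server\share\file
        return EXT_UNC_PREFIX + path[2:]
    if len(path) >= 3 and path[0].isalpha() and path[1:3] == ":\\":
        # D:\file -> \\?\D:\file
        return EXT_PREFIX + path
    raise ValueError("expected an absolute drive or UNC path: %r" % path)


print(to_extended_path(r"D:\existingfile.txt"))
print(to_extended_path(r"\\server\share\file.txt"))
```

The sketch refuses relative paths because, as far as the MSDN naming page describes it, the extended form is only defined for fully qualified paths.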

dgp added on 2006-05-15 01:05:17:
Logged In: YES 
user_id=80530


If folks have the idea I'm against
expanding the paths recognized by
Tcl to match those recognized by
the OS, that's incorrect.

My only opinion on the subject is
that if the set of pathnames recognized
by Tcl is to be expanded, the correct
way to do it is by adding on an
additional Tcl_Filesystem, and not
by adding in more hackery within
the core center of the VFS system.

If we had Tcl_Filesystems back in 
the days when UNC support was added,
I'd have said the same thing about
them.

offline added on 2006-05-14 21:23:16:
Logged In: YES 
user_id=32671

The gist of discussion here is, I think, the right direction
to go.  Basically, remove the restriction that *prevents*
this functionality working, and ensure that the underlying
API calls are ones that support both forms (a case which I
believe to be true for all standard file APIs).

There's really no sense in adding an artificial restriction.
 The risk of problems with temp files is one best handled by
the application developer.

cc_benny added on 2006-05-14 17:14:47:
Logged In: YES 
user_id=143885

Vince:
> Are there really _no_ programs that could
> manipulate these files if Tcl created them?

I never looked for them so I can't say for sure.
I doubt that the average user would have any
though.

> Does Windows 'explorer' handle them?

Not as of W2K.

CMD.EXE (again in W2K) can use the explicit syntax
\\?\... in some circumstances but not in others,
e.g. in the experiment I did for the discussion on
tcl-core MKDIR worked but RMDIR didn't :-((.  See
<http://sourceforge.net/mailarchive/forum.php?thread_id=9833535&forum_id=3854>.

As I understand Microsoft, if you use "normal"
paths you get normal behaviour for all your
programs and all guarantees and functionality that
Microsoft always gave for such paths.  If you use
\\?\ you can do what you want, but you are
responsible yourself for every incompatibility and
for informing the user about potential problems.

I don't want that responsibility wherever I can
avoid it.  And I find it difficult to see how Tcl
can use this transparently without encouraging
programs that do what I consider bad things.

Of course explicit (and documented) support for
\\?\ or //?/ has my vote.  I consider it a bug
that this doesn't just work in Tcl.

benny

nobody added on 2006-05-13 15:37:45:
Logged In: NO 

Are there really _no_ programs that could manipulate these
files if Tcl created them?  These are standard win32 APIs
which have been around a fairly long time so it would seem
somewhat bizarre if nothing else could access them.  Surely
if someone is asking to be able to create these long paths
there must be some _other_ applications which are going to
be involved in some way?

Does Windows 'explorer' handle them?

Vince.

Note: if we wanted we could have a C variable linked to some
tcl::unsupported::errorOnLongPaths Tcl variable which could
be checked by the core filesystem code and not allow long
paths.  This wouldn't have any significant performance
impact at all and might solve things for Benny.

cc_benny added on 2006-05-12 18:21:25:
Logged In: YES 
user_id=143885

vincentdarley:
> of course as dgp says, someone can happily write
> another Tcl_Filesystem to do that

Is that how other UNC paths work?  Basically \\?\ and
\\.\ are just variations of UNC paths AFAICS.

> That, to me, would be the Tcl way -- it would just
> work.

For the programmer.  It would not work for the user,
because files that can only be created with this
syntax can not be manipulated by the user outside
of the program that created the file.  Users would
start to format their disks or re-install their
system to get rid of those files.

I might well be in the minority with this view,
but if this gets implemented transparently,
I'd want to add system-specific code to some of
my programs to check that I don't use it
inadvertently.  It would make life more difficult
for me.

vincentdarley added on 2006-05-10 05:59:05:
Logged In: YES 
user_id=32170

As remarked in other comments here, the definition of 'file
normalize' precludes us from making Tcl itself interpret a
leading '\\?\' at the Tcl level (of course as dgp says,
someone can happily write another Tcl_Filesystem to do
that), but there's nothing to stop us simply checking the
length of paths in the core and putting in the appropriate
prefix before calling any of the native routines.

That, to me, would be the Tcl way -- it would just work.

Vince.
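
A minimal sketch of the length-check idea described above, written in Python for illustration rather than in the core's C. It assumes the input is an absolute, normalized path in Tcl's forward-slash form, and that the threshold is the classic Win32 MAX_PATH of 260 (which counts the terminating NUL). The function name native_path is hypothetical and does not correspond to any Tcl routine.

```python
# Illustrative sketch only: add the extended prefix just before a
# path would be handed to a native routine, and only when needed.

MAX_PATH = 260  # classic Win32 limit, including the terminating NUL


def native_path(normalized: str) -> str:
    """Convert a normalized absolute path to a native Windows string."""
    native = normalized.replace("/", "\\")
    if native.startswith("\\\\?\\") or len(native) < MAX_PATH:
        # Short paths (and paths already prefixed) pass through unchanged.
        return native
    if native.startswith("\\\\"):
        # Long UNC path: \\server\share\... -> \\?\UNC\server\share\...
        return "\\\\?\\UNC\\" + native[2:]
    # Long drive path: C:\... -> \\?\C:\...
    return "\\\\?\\" + native


print(native_path("C:/tmp/a.txt"))
print(len(native_path("C:/" + "d" * 300)))
```

Under this approach the Tcl-level path never shows the prefix, so the result of file normalize is unchanged for scripts; only the string passed to the OS differs.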

dgp added on 2006-05-04 18:53:03:
Logged In: YES 
user_id=80530


Perhaps I'm completely off the
mark, but shouldn't a set of
pathnames marked by a completely
separate prefix get handled by
an additional Tcl_Filesystem ?

hobbs added on 2006-05-02 06:14:14:
Logged In: YES 
user_id=72656

Tcl uses the W (wide) Win32 file APIs throughout, if you are
on an NT-based system.  I believe there is a bit more to it
than that to support this feature.

offline added on 2006-05-02 05:38:10:
Logged In: YES 
user_id=32671

Granted, although the com1 example can be deleted the same
way as the file was created.

Assuming that someone can tell me where the code that
handles that is, I could have a look at special casing it.

cc_benny added on 2006-05-02 05:20:37:
Logged In: YES 
user_id=143885

"Chris R" writes on c.l.t:
> There is one issue with this, though -- what *should*
> the results of a file normalize be,

[file normalize] for normal file names should not
change IMO.

> The two forms (with and without \\?\ prefix) are
> equivalent to the API, even if not all frontend
> applications support them.

No they are not.  Opening "\\?\c:\tmp\com1" will create
a file "com1" which the user will then be unable to
delete.  Opening "c:\tmp\com1" will open the first
serial port.  There are probably other
incompatibilities and Microsoft might introduce more.
Also "\\?\..."  doesn't mean anything to W9x/Me.
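
To make the com1 case concrete, here is a rough Python sketch of the distinction. It assumes the commonly documented list of reserved DOS device names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9) and only checks an exact match on the last path component; the real Windows rules are broader. The function name opens_device is hypothetical.

```python
# Illustrative sketch only: would a plain open of this path reach a
# device instead of a file? Simplified to exact reserved names.

RESERVED_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {"COM%d" % i for i in range(1, 10)}
    | {"LPT%d" % i for i in range(1, 10)}
)


def opens_device(path: str) -> bool:
    """Return True if the last component is a reserved device name."""
    path = path.replace("/", "\\")
    if path.startswith("\\\\?\\"):
        # With the \\?\ prefix the name is taken literally, so "com1"
        # is just a file name.
        return False
    last = path.rsplit("\\", 1)[-1]
    return last.upper() in RESERVED_NAMES


print(opens_device(r"c:\tmp\com1"))
print(opens_device("\\\\?\\c:\\tmp\\com1"))
```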


Tcl handles UNC paths fine right now.  Last time I
tested this, the error result for opening
\\?\c:\tmp\... took some time, which hints to me that
some code probably tried to resolve the host "?".  That
would run into a timeout, because that host doesn't
exist, of course.

From testing it seems that the code already handles
".", which is used for devices.  I think we might just
need a special case in the code that handles UNC paths
for the special hostname "?".
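
The special case suggested above might look roughly like the following. This is a Python sketch for illustration, not Tcl's actual UNC handling; the function name classify_unc and the labels it returns are assumptions.

```python
# Illustrative sketch only: treat "?" and "." as special markers in the
# host position of a UNC-style path instead of resolving them as hosts.

def classify_unc(path: str):
    """Return (kind, host, rest) for a path, where kind is one of
    'not-unc', 'extended', 'device' or 'network'."""
    p = path.replace("/", "\\")
    if not p.startswith("\\\\"):
        return ("not-unc", None, p)
    host, _, rest = p[2:].partition("\\")
    if host == "?":
        # \\?\... : extended-length path, no host lookup needed.
        return ("extended", host, rest)
    if host == ".":
        # \\.\... : device namespace.
        return ("device", host, rest)
    return ("network", host, rest)


print(classify_unc(r"\\?\c:\tmp\file.txt"))
print(classify_unc(r"\\server\share\file.txt"))
```

Recognizing "?" before any name resolution would avoid the timeout described above, since no lookup for a host called "?" is ever attempted.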
