Ticket UUID: 1479814
Title: Tcl should support Unicode versions of Win32 file APIs
Type: Bug
Version: obsolete: 8.4.13
Submitter: offline
Created on: 2006-05-01 15:44:05
Subsystem: 36. Pathname Management
Assigned To: vincentdarley
Priority: 7 High
Severity:
Status: Closed
Last Modified: 2007-02-21 03:38:04
Resolution: Fixed
Closed By: patthoyts
Closed on: 2007-02-20 20:38:04
Description:
The Unicode filename format described in http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/naming_a_file.asp is not properly handled in Tcl. The file normalize routines should allow the following to work:

    file exists {\\?\D:\existingfile.txt}

which is equivalent to

    file exists {D:\existingfile.txt}

The only difference is that the former form uses the Unicode file APIs in Windows, allowing the creation of long pathnames (longer than 260 characters), which is extremely useful when dealing with certain Java tools' output.
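
A minimal sketch of how the two forms in the description relate, written as pure string manipulation so that it touches no files and runs on any platform. The helper name `toExtendedPath` is hypothetical and is not part of Tcl or of the attached patch; it assumes an absolute drive-letter path as input.

```tcl
# Hypothetical helper: build the \\?\ form of an absolute drive-letter path.
proc toExtendedPath {path} {
    # Windows does not translate forward slashes after the \\?\ prefix,
    # so convert them to backslashes first.
    set native [string map {/ \\} $path]
    # Paths that already carry the prefix are returned unchanged.
    if {[string match {\\\\[?]\\*} $native]} {
        return $native
    }
    # Assumption: only absolute drive-letter paths (D:\dir\file) are handled.
    if {![regexp {^[A-Za-z]:\\} $native]} {
        error "not an absolute drive-letter path: $path"
    }
    return "\\\\?\\$native"
}

puts [toExtendedPath {D:\existingfile.txt}]   ;# prints \\?\D:\existingfile.txt
```

On a Windows build that accepts the prefix, `file exists [toExtendedPath $p]` and `file exists $p` would be expected to agree for ordinary short paths.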
User Comments:
patthoyts added on 2007-02-21 03:38:04:
Logged In: YES user_id=202636 Originator: NO

At least the additional hackery has a fairly small footprint. Patch applied to 8.5. I don't see a reason to apply this to 8.4.

dgp added on 2007-02-21 00:35:46:
Logged In: YES user_id=80530 Originator: NO

I suppose there's a case to be made that a bad solution is better than no solution at all, but it still seems seriously wrong to me that we multiply the hacks in the core generic code instead of addressing this via the Tcl_Filesystem interfaces.

That said, I obviously don't care enough to implement a better answer (at least not today) so HOSOGOTP rules.

still,.... Ick!

hobbs added on 2007-02-20 07:07:35:
Logged In: YES user_id=72656 Originator: NO

I give a +1 to adding it.

patthoyts added on 2007-01-09 05:18:06:
File Added - 210378: 1479814-extpath.patch
Logged In: YES user_id=202636 Originator: NO

I'm attaching a patch that supports the extended path prefix in file normalize and in the creation of the native form of the path. I have been successful in handling local files using the \\?\ syntax, but MSDN says that UNC paths can have the same extension using \\?\UNC\ as the prefix. I've not found a Microsoft application that supports this yet.

File Added: 1479814-extpath.patch

dgp added on 2006-05-15 01:05:17:
Logged In: YES user_id=80530

If folks have the idea I'm against expanding the paths recognized by Tcl to match those recognized by the OS, that's incorrect. My only opinion on the subject is that if the set of pathnames recognized by Tcl is to be expanded, the correct way to do it is by adding on an additional Tcl_Filesystem, and not by adding in more hackery within the core center of the VFS system.

If we had Tcl_Filesystems back in the days when UNC support was added, I'd have said the same thing about them.

offline added on 2006-05-14 21:23:16:
Logged In: YES user_id=32671

The gist of discussion here is, I think, the right direction to go.
Basically, remove the restriction that *prevents* this functionality working, and ensure that the underlying API calls are ones that support both forms (a case which I believe to be true for all standard file APIs). There's really no sense in adding an artificial restriction. The risk of problems with temp files is one best handled by the application developer.

cc_benny added on 2006-05-14 17:14:47:
Logged In: YES user_id=143885

Vince:
> Are there really _no_ programs that could
> manipulate these files if Tcl created them?

I never looked for them so I can't say for sure. I doubt that the average user would have any though.

> Does Windows 'explorer' handle them?

Not as of W2K. CMD.EXE (again in W2K) can use the explicit syntax \\?\... in some circumstances but not in others, e.g. in the experiment I did for the discussion on tcl-core MKDIR worked but RMDIR didn't :-((. See <http://sourceforge.net/mailarchive/forum.php?thread_id=9833535&forum_id=3854>.

As I understand Microsoft, if you use "normal" paths you get normal behaviour for all your programs and all guarantees and functionality that Microsoft always gave for such paths. If you use \\?\ you can do what you want, but you are responsible yourself for every incompatibility and for informing the user about potential problems. I don't want that responsibility wherever I can avoid it. And I find it difficult to see how Tcl can use this transparently without encouraging programs that do what I consider bad things.

Of course explicit (and documented) support for \\?\ or //?/ has my vote. I consider it a bug that this doesn't just work in Tcl.

benny

nobody added on 2006-05-13 15:37:45:
Logged In: NO

Are there really _no_ programs that could manipulate these files if Tcl created them? These are standard win32 APIs which have been around a fairly long time so it would seem somewhat bizarre if nothing else could access them.
Surely if someone is asking to be able to create these long paths there must be some _other_ applications which are going to be involved in some way? Does Windows 'explorer' handle them?

Vince.

Note: if we wanted we could have a C variable linked to some tcl::unsupported::errorOnLongPaths Tcl variable which could be checked by the core filesystem code and not allow long paths. This wouldn't have any significant performance impact at all and might solve things for Benny.

cc_benny added on 2006-05-12 18:21:25:
Logged In: YES user_id=143885

vincentdarley:
> of course as dgp says, someone can happily write
> another Tcl_Filesystem to do that

Is that how other UNC paths work? Basically \\?\ and \\.\ are just variations of UNC paths AFAICS.

> That, to me, would be the Tcl way -- it would just
> work.

For the programmer. It would not work for the user, because files that can only be created with this syntax can not be manipulated by the user outside of the program that created the file. Users would start to format their disks or re-install their system to get rid of those files.

I might well be in the minority with this view, but if this gets implemented transparently, I'd want to add system-specific code to some of my programs to check that I don't use it inadvertently. It would make life more difficult for me.

vincentdarley added on 2006-05-10 05:59:05:
Logged In: YES user_id=32170

As remarked in other comments here, the definition of 'file normalize' precludes us from making Tcl itself interpret a leading '\\?\' at the Tcl level (of course as dgp says, someone can happily write another Tcl_Filesystem to do that), but there's nothing to stop us simply checking the length of paths in the core and putting in the appropriate prefix before calling any of the native routines. That, to me, would be the Tcl way -- it would just work.

Vince.
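
The length check Vince describes can be sketched at script level as follows. This is only an illustration under stated assumptions: the real change would live in the C code that builds the native path, the `sketch` namespace and `nativePath` proc are hypothetical names, the input is assumed to be an absolute path already, and the `errorOnLongPaths` variable here merely stands in for the tcl::unsupported::errorOnLongPaths switch suggested in the thread.

```tcl
namespace eval sketch {
    # MAX_PATH on Windows is 260 characters, including the terminating NUL.
    variable maxPath 260
    # Stand-in for the suggested tcl::unsupported::errorOnLongPaths switch.
    variable errorOnLongPaths 0
}

# Hypothetical: return the string that would be handed to the native routines.
proc sketch::nativePath {path} {
    variable maxPath
    variable errorOnLongPaths
    set native [string map {/ \\} $path]
    # Short paths, and paths that already carry the prefix, pass through.
    if {[string length $native] < $maxPath
            || [string match {\\\\[?]\\*} $native]} {
        return $native
    }
    if {$errorOnLongPaths} {
        error "path exceeds [expr {$maxPath - 1}] characters: $path"
    }
    if {[string match {\\\\*} $native]} {
        # UNC path: \\server\share\... becomes \\?\UNC\server\share\...
        return "\\\\?\\UNC\\[string range $native 2 end]"
    }
    return "\\\\?\\$native"
}
```

With the switch left at 0, scripts keep passing ordinary names and only over-long paths are rewritten; setting it to 1 gives the opt-out Benny asked for, turning an over-long path into an error instead.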

dgp added on 2006-05-04 18:53:03:
Logged In: YES user_id=80530

Perhaps I'm completely off the mark, but shouldn't a set of pathnames marked by a completely separate prefix get handled by an additional Tcl_Filesystem?

hobbs added on 2006-05-02 06:14:14:
Logged In: YES user_id=72656

Tcl uses the W (wide) Win32 file APIs throughout, if you are on an NT-based system. I believe there is a bit more to it than that to support this feature.

offline added on 2006-05-02 05:38:10:
Logged In: YES user_id=32671

Granted, although the com1 example can be deleted the same way as the file was created.

Assuming that someone can tell me where the code that handles that is, I could have a look at special casing it.

cc_benny added on 2006-05-02 05:20:37:
Logged In: YES user_id=143885

"Chris R" writes on c.l.t:
> There is one issue with this, though -- what *should*
> the results of a file normalize be,

[file normalize] for normal file names should not change IMO.

> The two forms (with and without \\?\ prefix) are
> equivalent to the API, even if not all frontend
> applications support them.

No they are not. Opening "\\?\c:\tmp\com1" will create a file "com1" which the user will then be unable to delete. Opening "c:\tmp\com1" will open the first serial port. There are probably other incompatibilities and Microsoft might introduce more. Also "\\?\..." doesn't mean anything to W9x/Me.

Tcl handles UNC paths fine right now. Last time I tested this, the error result for opening \\?\c:\tmp\... took some time, which hints to me that some code probably tried to resolve the host "?". That would run into a timeout, because that host doesn't exist, of course. From testing it seems that the code already handles ".", which is used for devices. I think we might just need a special case in the code that handles UNC paths for the special hostname "?".
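
The special case suggested in the last comment, treating "?" like "." rather than as a server name to resolve, might look roughly like this. It is a script-level sketch only: `uncHostKind` is a hypothetical name, and the actual UNC handling lives in Tcl's C sources, which this does not reproduce.

```tcl
# Hypothetical: classify the "host" component of a \\host\... style path.
proc uncHostKind {path} {
    set p [string map {/ \\} $path]
    # Anything that does not start with \\host\ is not a UNC-style path.
    if {![regexp {^\\\\([^\\]+)\\} $p -> host]} {
        return none
    }
    switch -exact -- $host {
        "."     { return device }
        "?"     { return extended }
        default { return server }
    }
}

puts [uncHostKind {\\?\c:\tmp\com1}]          ;# prints extended
puts [uncHostKind {\\.\COM1}]                 ;# prints device
puts [uncHostKind {\\server\share\file.txt}]  ;# prints server
```

Only the `server` result would then go on to name resolution, which is the step the comment suspects of causing the timeout.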
Attachments:
- 1479814-extpath.patch added by patthoyts on 2007-01-09 05:18:06.