Tcl Source Code

Ticket UUID: baf43873720f5e92ae1142b1fc8d343b3648b54d
Title: file join trashes valid network paths
Type: Bug
Version: >= 8.5 (*nix only)
Submitter: bll
Created on: 2018-07-27 01:18:54
Subsystem: 16. Commands A-H
Assigned To: dgp
Priority: 5 Medium
Severity: Severe
Status: Open
Last Modified: 2018-07-27 20:55:35
Resolution: None
Closed By: nobody
Closed on: 2018-07-27 15:18:10
Description:
Reference: https://groups.google.com/d/msg/comp.lang.tcl/SqXhSGqGEWc/UdwXhExCBwAJ

Reference: (from the wiki): AMG: In Tcl 8.6.8, [file join //a/b] returns //a/b,
 but in Tcl 8.7a1, [file join //a/b] returns /a/b. This got me in trouble
 because I was trying to work with Windows UNC paths. In the end I just had to
 concatenate strings and forgo [file join]. [file nativename] worked right, at
 least.

[file join] on a valid Windows network path changes the leading double slash to a single slash.

[file join] should not trash a valid path.
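
The workaround AMG mentions (concatenating strings instead of calling [file join]) could look roughly like the sketch below. The proc name `uncJoin` is hypothetical and not part of Tcl; it assumes the first segment already carries the leading `//` (or `\\`) and that only forward slashes are wanted in the result.

```tcl
# Hypothetical helper (not part of Tcl): join path segments by plain string
# concatenation so that a leading "//" is never rewritten.
proc uncJoin {first args} {
    # Accept either the \\host or the //host spelling for the first segment.
    set path [string map {\\ /} $first]
    foreach seg $args {
        # Trim slashes at the seam so exactly one "/" separates segments.
        set seg [string trim [string map {\\ /} $seg] /]
        if {$seg eq ""} continue
        set path "[string trimright $path /]/$seg"
    }
    return $path
}

puts [uncJoin //myprinter myqueue myfile]
# prints //myprinter/myqueue/myfile
```

Unlike [file join], this sketch never treats a later segment as absolute, so it is only suitable for assembling a path from known relative pieces.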
User Comments:

bll added on 2018-07-27 20:55:35:
Oh, I did understand.  Sorry about the confusion.

I don't think a // on unix should be changed in the path...
but
I also don't think any unix treats // as a special case.

At this point, I think the 8.6.8 implementation is fine.

AMG's note was specific to 8.7a1. I don't know if a bug was
introduced, but the tests should hopefully catch that.

I do think the documentation should be updated to explain
how [file join] treats network paths on Windows, and perhaps
to explain the differences between Windows and unix.

file join //host location file 
vs.
file join //host/location file

sebres added on 2018-07-27 19:51:59: (text/x-fossil-wiki)
> Then again [file join C: /a] on unix also destroys the drive path.

Sure, but you misunderstood my example. I meant that on unix c:/... is not absolute,<br/>
so in the case of [file join foo c:/bar/$tcl_platform(platform)]
it DOES NOT replace foo; it is appended as a relative path. Take another look at my example.

So the behavior is definitely different from Windows (which is an argument against UNC).

As for the link you provided, I already found the same passage in the newest edition (<a href="http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_13">issue 7, 2018 edition</a>):
<code><pre style="padding-left:10pt">
A pathname consisting of a single <slash> shall resolve to the root directory of the process. 
A null pathname shall not be successfully resolved.
If a pathname begins with two successive <slash> characters, the first component following 
the leading <slash> characters <b>may be interpreted in an implementation-defined manner,</b>
although more than two leading <slash> characters shall be treated as a single <slash> character.
</pre></code>

And in your opinion, is "may be interpreted in an implementation-defined manner" the same as "the POSIX specification disagrees"?<br/>
In my opinion this sentence allows an implementation to do what it wants.<br/>
Additionally, it is about pathname resolution (not about joining), and in any case not about the file subsystem, which is where the `file` ensemble belongs.

I could be wrong, but I doubt this will convince the TCT to let us rewrite the handling to support UNC paths.

bll added on 2018-07-27 18:07:17:
Link is here, last paragraph:

http://pubs.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap04.html#tag_04_11

What I don't know, since //something/somepath/somefile is valid on unix,
is whether the specification is like Windows, where //something/somepath is a single
network path, or whether //something is a valid stand-alone value.

I do not know what would be proper there.

I think I would make the same rule as for windows.
//something/somepath would need to be a single unit and the
leading // is kept intact.  This would preserve any UNC path
processing on a unix server.

Then again [file join C: /a]
on unix also destroys the drive path.

sebres added on 2018-07-27 17:15:44: (text/x-fossil-wiki)
Regarding your latest comment (Windows too): this was never possible with this kind of join, even in previous versions, because neither {\\myprinter} nor {//myprinter} is a valid UNC path (nor really a valid network share name), so

<code><pre>
% file join //myprinter/myqueue myfile
//myprinter/myqueue/myfile
</pre></code>

but

<code><pre>
% file join //myprinter myqueue myfile
/myprinter/myqueue/myfile
</pre></code>

Regarding "putting together UNC pathnames to return to windows for processing": yes, but why only Windows? Then let us accept every file system in the world (including virtual ones). The command is called `file join`, and it joins path segments correctly for the *nix platform.

Back to the issue: I'm actually not against this (I work multi-platform myself, and the artificial case [f34cf83dd0] is even less interesting to me; IMHO it was not really a bug).

But the fact is (if I understood the discussion correctly) that no one currently knows what the right behavior is, and I don't think anyone wants to revert the handling to what it was before 8.6.7.

So if you maintain that "the POSIX specification disagrees", please provide a link or a quote that supports this.

And of course I can reopen it, but ATM I do not see good prospects of success.

bll added on 2018-07-27 15:18:10:
Please re-open.
Please read my latest notes.
The problem exists on windows.
The POSIX specification disagrees.

sebres added on 2018-07-27 15:10:46: (text/x-fossil-wiki)
The conclusion: not a bug but a feature.

Tcl will not support UNC paths on the *nix platform, because:
<ul><li>they are not native to the file subsystem of this platform</li>
<li>the other Windows conventions for absolute paths (like c:/) are not supported there either</li>
</ul>

So just compare this for both platforms:
<code><pre style="padding-left:10pt">
% file join foo c:/bar/$tcl_platform(platform)
c:/bar/windows
foo/c:/bar/unix
</pre></code>

The versions concerned are: 8.6.7, 8.6.8, 8.7a1 and above.
Possibly the documentation should still get a note about the handling introduced in [2158eea530].

bll added on 2018-07-27 15:02:15:
Also, some file server running on unix could be
putting together UNC pathnames to return to Windows for processing.
There should not be a unix/Windows split here.

Windows 7-64, Tcl 8.6.8, confirmed issue
% set printer \\\\myprinter
\\myprinter
% set queue myqueue
myqueue
% set file myfile
myfile
% puts [file join $printer $queue $file]
/myprinter/myqueue/myfile
% set printer //myprinter
//myprinter
% puts [file join $printer $queue $file]
/myprinter/myqueue/myfile
% file normalize //myprinter/myqueue/myfile
//myprinter/myqueue/myfile
%

Removing the *nix only.

bll added on 2018-07-27 14:52:28:
I would answer (a) for all three.

The poster on comp.lang.tcl saw this on Windows 7 with 8.6.3:
 > set printer \\\\192.168.1.171
 > puts [file join $printer queue doc.ps]
/192.168.1.171/queue/doc.ps

So windows may have issues also.


From the IEEE Std 1003.1 POSIX Specification:

A pathname that begins with two successive slashes may be interpreted in an implementation-defined manner, although more than two leading slashes shall be treated as a single slash.

( http://pubs.opengroup.org/onlinepubs/000095399/basedefs/xbd_chap04.html#tag_04_11 )
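
Read literally, the quoted sentence fixes only the case of three or more leading slashes; exactly two are left to the implementation. A minimal sketch of that rule alone, with a hypothetical proc name `posixLeadingSlashes` and an assumed choice to leave `//` untouched, might be:

```tcl
# Hypothetical helper: apply only the leading-slash rule quoted above.
# It does not touch slashes elsewhere in the path.
proc posixLeadingSlashes {path} {
    if {[regexp {^///+} $path]} {
        # More than two leading slashes are treated as a single slash.
        return [regsub {^/+} $path /]
    }
    # Zero, one or exactly two leading slashes are returned unchanged; the
    # meaning of a leading "//" is implementation-defined, so keeping it is
    # an assumption made for this sketch.
    return $path
}

puts [posixLeadingSlashes //a/b]    ;# prints //a/b
puts [posixLeadingSlashes ///a/b]   ;# prints /a/b
```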

sebres added on 2018-07-27 11:03:10: (text/x-fossil-wiki)
Well, I found the cause: it belongs to the fix for [f34cf83dd0], first introduced in [2158eea530] (and merged into all major branches).

@Don: are you sure this should be fixed in a way that is so incompatible with UNC paths?

Where is the TIP? (allow me this tiny joke ;)

The simple test case filename-7.19 (added in [f49a421a0d]) is IMHO also questionable:
<code><pre style="padding-left:10pt">
test filename-7.19 {[Bug f34cf83dd0]} {
    file join foo //bar
} /bar
</pre></code>

Either Tcl continues to support UNC platform-independently, or //bar is indeed a simple absolute root path (on the *nix platform), in which case the test is correct.

So I would be interested to know the answers to the following questions about UNC paths and segments:

<code><pre style="padding-left:10pt">
1. file normalize //a/b
a. //a/b
b. /a/b

2. file normalize //a//b
a. //a/b
b. /a/b

3. file join //a/b //c/d
a. //c/d
b. /c/d
</pre></code>
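
Since the answers depend on platform and Tcl version, one way to collect them is to run the three commands on the interpreter at hand. The sketch below does only that; the proc name `probeUncHandling` is hypothetical, and no particular output is assumed.

```tcl
# Hypothetical probe: evaluate the three questions on the running interpreter
# and return a dict mapping each command to its result.
proc probeUncHandling {} {
    set results [dict create]
    foreach cmd {
        {file normalize //a/b}
        {file normalize //a//b}
        {file join //a/b //c/d}
    } {
        dict set results $cmd [uplevel #0 $cmd]
    }
    return $results
}

# Report the version and platform so results from different systems can be compared.
puts "Tcl [info patchlevel] on $::tcl_platform(platform)"
dict for {cmd result} [probeUncHandling] {
    puts [format "%-26s -> %s" $cmd $result]
}
```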

sebres added on 2018-07-27 09:57:24: (text/x-fossil-wiki)
Hmm... even weirder: on previous versions too, where an already "normalized" path is not affected, the slashes are removed (including the first) if an additional slash is present in a later path segment (i.e. the path was not normalized).
<code><pre style="padding-left:10pt;">
$ echo 'puts [file normalize //a/b/$tcl_platform(platform)]' | ./tclsh.sh
//a/b/unix
$ echo 'puts [file normalize //a//b/$tcl_platform(platform)]' | ./tclsh.sh
/a/b/unix
$ echo 'puts [file normalize //a/b//$tcl_platform(platform)]' | ./tclsh.sh
/a/b/unix
</pre></code>

sebres added on 2018-07-27 09:45:57: (text/x-fossil-wiki)
The problem here is not the join but the normalize (invoked internally).
Additionally, it is still correct on Windows (so the issue affects unix only).

<code><pre style="padding-left:10pt">
% file normalize //a/b/$tcl_platform(platform)
//a/b/windows
/a/b/unix
</pre></code>

Additionally, it affects all current versions (on *nix) since 8.5.

I'm not sure the issue is really an issue (because UNC paths are not directly accessible on the *nix platform, so they are not really valid in the sense of a *nix path).
Either one uses something like `mount -t drvfs '\\server\share' /mnt/share`, or differently named conventions like `file://server/share`.

But for platform-independent use (e.g. building a path from segments), the current behavior looks wrong to me.

And indeed 8.6.8 still keeps the first slash; current 8.6 does not (nor does 8.5).

WiP.