Tcl Library Source Code

Check-in [6c65571dfa]
Login

Many hyperlinks are disabled.
Use anonymous login to enable hyperlinks.

Overview
Comment:Module uri. In order to preserve backward compatibility, reverse many of the changes made in the previous commit. The module has a secondary role as a repository of useful regexp patterns, which are not entirely documented. Rearrange pattern definitions in an attempt to segregate the code for the two different roles. Undo deprecation of uri::register. Add comments to explain the purpose of each block of definitions.
Timelines: family | ancestors | descendants | both | trunk
Files: files | file ages | folders
SHA1: 6c65571dfa61456294954a76a458e913a24e8a18
User & Date: kjnash 2017-01-07 15:39:25
Context
2017-01-08
10:06
Correct several tests in the math module - they were failing in Tcl 8.5, seemingly not in Tcl 8.6. After these corrections there are still two tests left that need examination, both concerning test-anova-F in the statistics package. check-in: d535d2f0ea user: arjenmarkus tags: trunk
2017-01-07
15:39
Module uri. In order to preserve backward compatibility, reverse many of the changes made in the previous commit. The module has a secondary role as a repository of useful regexp patterns, which are not entirely documented. Rearrange pattern definitions in an attempt to segregate the code for the two different roles. Undo deprecation of uri::register. Add comments to explain the purpose of each block of definitions. check-in: 6c65571dfa user: kjnash tags: trunk
01:20
Cleanup module uri. There is a large amount of unused code that obfuscates the useful code and deters revision. * Remove non-executed code and unused variables. ** Many regexps were defined but never used. ** Some regexps were even demanded and carefully processed by uri::register, but then never used. ** The function ::uri::GetHostPort was never used. * Deprecate uri::register. ** Registration is of little value: the schemes gopher, wais and prospero, although registered, have no functionality because the corresponding split and join commands are not defined. ** Leave the uri::register function in place for use by third-party code. check-in: 436d5be89b user: kjnash tags: trunk
Changes
Hide Diffs Unified Diffs Ignore Whitespace Patch

Changes to modules/uri/uri.tcl.

22
23
24
25
26
27
28

29



30




31
32
33
34
35



36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59







60
61
62









































63
64
65
66
67
68
69
70
71

72
73
74
75


76

77
78
79
80
81
82
83
84
85
86
87
88
89
90



91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107



















108
109
110
111
112
113
114
    namespace export resolve isrelative
    namespace export geturl
    namespace export canonicalize
    namespace export register

    variable file:counter 0


    # extend these variable in the coming namespaces



    variable schemes       {}





    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    # basic regular expressions used in URL syntax.

    namespace eval basic {



	variable	digit		{[0-9]}
	variable	hex		{[0-9A-Fa-f]}
	variable	alphaDigit	{[A-Za-z0-9]}
	variable	alphaDigitMinus	{[A-Za-z0-9-]}
	variable	escape		"%${hex}${hex}"
	variable	digits		"${digit}+"

	variable	toplabel	\
		"(${alphaDigit}${alphaDigitMinus}*${alphaDigit}\\.?|${alphaDigit}\\.?)"
	variable	domainlabel	\
		"(${alphaDigit}${alphaDigitMinus}*${alphaDigit}|${alphaDigit})"

	variable	hostname	\
		"((${domainlabel}\\.)*${toplabel})"
	variable	hostnumber4	\
		"(?:${digits}\\.${digits}\\.${digits}\\.${digits})"
	variable	hostnumber6	{(?:\[[^]]*\])}
	variable	hostnumber	"(${hostnumber4}|${hostnumber6})"
	variable	hostspec	"${hostname}|${hostnumber}"

	variable	port		$digits

	variable	usrCharN	{[a-zA-Z0-9$_.+!*'(,);?&=-]}
	variable	usrChar		"(${usrCharN}|${escape})"







	variable	user		"${usrChar}*"
	variable	password	$user
	# Used by commands: hostspec port user password









































    } ;# basic {}
}

# ::uri::register (DEPRECATED) --
#
#	Register a scheme (and aliases) in the package. The command
#	creates a namespace below "::uri" with the same name as the
#	scheme and executes the script declaring the pattern variables
#	for this scheme in the new namespace.

#
# REASON FOR DEPRECATION
#	- Registration defines a namespace and pattern variables.
#	- From v1.2.7, the built-in schemes (and the urn scheme) do this without


#	  calling uri::register.

#	- Before v1.2.7, uri::register prevented re-registration of the same
#	  scheme name(s), and maintained variables that were not used for
#	  anything.
#	- The command is retained for use by third-party legacy code.
#
# Arguments:
#	schemeList	Name of the scheme to register, plus aliases
#       script		Script declaring the scheme patterns
#
# Results:
#	None.

proc ::uri::register {schemeList script} {
    variable schemes




    # Check scheme and its aliases for existence.
    foreach scheme $schemeList {
	if {[lsearch -exact $schemes $scheme] >= 0} {
	    return -code error \
		    "trying to register scheme (\"$scheme\") which is already known"
	}
    }

    # Get the main scheme
    set scheme  [lindex $schemeList 0]

    if {[catch {namespace eval $scheme $script} msg]} {
	catch {namespace delete $scheme}
	return -code error \
	    "error while evaluating scheme script: $msg"
    }



















    return
}

# ::uri::split --
#
#	Splits the given <a url> into its constituents.
#







>
|
>
>
>

>
>
>
>





>
>
>


















<
<
<



>
>
>
>
>
>
>


|
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>


|
|




|
>

<
|
|
>
>
|
>
|
<
|
<










>
>
>

















>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>







22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64



65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129

130
131
132
133
134
135
136

137

138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
    namespace export resolve isrelative
    namespace export geturl
    namespace export canonicalize
    namespace export register

    variable file:counter 0

    # --------------------------------------------------------------------------
    # These variables are used by uri::register and are a repository of
    # scheme-related pattern information that may be accessed by external code.
    # None is used by the other commands of this package.
    # --------------------------------------------------------------------------
    variable schemes       {}
    variable schemePattern ""
    variable url           ""
    variable url2part
    array set url2part     {}

    # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    # basic regular expressions used in URL syntax.

    namespace eval basic {
	# ----------------------------------------------------------------------
	# These variables are used to construct the variables used by commands.
	# ----------------------------------------------------------------------
	variable	digit		{[0-9]}
	variable	hex		{[0-9A-Fa-f]}
	variable	alphaDigit	{[A-Za-z0-9]}
	variable	alphaDigitMinus	{[A-Za-z0-9-]}
	variable	escape		"%${hex}${hex}"
	variable	digits		"${digit}+"

	variable	toplabel	\
		"(${alphaDigit}${alphaDigitMinus}*${alphaDigit}\\.?|${alphaDigit}\\.?)"
	variable	domainlabel	\
		"(${alphaDigit}${alphaDigitMinus}*${alphaDigit}|${alphaDigit})"

	variable	hostname	\
		"((${domainlabel}\\.)*${toplabel})"
	variable	hostnumber4	\
		"(?:${digits}\\.${digits}\\.${digits}\\.${digits})"
	variable	hostnumber6	{(?:\[[^]]*\])}
	variable	hostnumber	"(${hostnumber4}|${hostnumber6})"




	variable	usrCharN	{[a-zA-Z0-9$_.+!*'(,);?&=-]}
	variable	usrChar		"(${usrCharN}|${escape})"

	# ----------------------------------------------------------------------
	# >>> THESE VARIABLES ARE THE ONLY ONES USED BY COMMANDS <<<
	# ----------------------------------------------------------------------

	variable	hostspec	"${hostname}|${hostnumber}"
	variable	port		"${digit}*"
	variable	user		"${usrChar}*"
	variable	password	$user

	# ----------------------------------------------------------------------
	# This variable (and escape, hostname, hostnumber, port, user, password
	# from above) are used to construct the variables in the block below.
	# ----------------------------------------------------------------------

	variable	xCharN		{[a-zA-Z0-9$_.+!*'(,);/?:@&=-]}

	# ----------------------------------------------------------------------
	# These variables (and "escape") are used in the patterns defined in the
	# calls to uri::register at the end of the file.  They are not used by
	# any commands.
	# ----------------------------------------------------------------------

	variable	xChar		"(${xCharN}|${escape})"
	variable	host		"(${hostname}|${hostnumber})"
	variable	hostOrPort	"${host}(:${port})?"
	variable	login		"(${user}(:${password})?@)?${hostOrPort}"
	variable	alpha		{[a-zA-Z]}

	# ----------------------------------------------------------------------
	# These variables are not used by anything in this file.
	# ----------------------------------------------------------------------

	variable	loAlpha		{[a-z]}
	variable	hiAlpha		{[A-Z]}
	variable	safe		{[$_.+-]}
	variable	extra		{[!*'(,)]}
	# danger in next pattern, order important for []
	variable	national	{[][|\}\{\^~`]}
	variable	punctuation	{[<>#%"]}	;#" fake emacs hilit
	variable	reserved	{[;/?:@&=]}

	# next is <national | punctuation>
	variable	unsafe		{[][<>"#%\{\}|\\^~`]} ;#" emacs hilit

	#	unreserved	= alpha | digit | safe | extra
	#	xchar		= unreserved | reserved | escape

	variable	unreserved	{[a-zA-Z0-9$_.+!*'(,)-]}
	variable	uChar		"(${unreserved}|${escape})"

    } ;# basic {}
}

# ::uri::register --
#
#	Register a scheme (and aliases) in the package. The command
#	creates a namespace below "::uri" with the same name as the
#	scheme and executes the script declaring the pattern variables
#	for this scheme in the new namespace. At last it updates the
#	uri variables keeping track of overall scheme information.
#

#	The script has to declare at least the variable "schemepart",
#	the pattern for an url of the registered scheme after the
#	scheme declaration. Not declaring this variable is an error.
#
#	Registration provides a number of pattern variables for use by external
#	code.  It is unconnected to the commands provided by the uri package.
#	See the warnings near the end of this file where uri::register is

#	called.

#
# Arguments:
#	schemeList	Name of the scheme to register, plus aliases
#       script		Script declaring the scheme patterns
#
# Results:
#	None.

proc ::uri::register {schemeList script} {
    variable schemes
    variable schemePattern
    variable url
    variable url2part

    # Check scheme and its aliases for existence.
    foreach scheme $schemeList {
	if {[lsearch -exact $schemes $scheme] >= 0} {
	    return -code error \
		    "trying to register scheme (\"$scheme\") which is already known"
	}
    }

    # Get the main scheme
    set scheme  [lindex $schemeList 0]

    if {[catch {namespace eval $scheme $script} msg]} {
	catch {namespace delete $scheme}
	return -code error \
	    "error while evaluating scheme script: $msg"
    }

    if {![info exists ${scheme}::schemepart]} {
	namespace delete $scheme
	return -code error \
	    "Variable \"schemepart\" is missing."
    }

    # Now we can extend the variables which keep track of the registered schemes.

    eval [linsert $schemeList 0 lappend schemes]
    set schemePattern	"([::join $schemes |]):"

    foreach s $schemeList {
	# FRINK: nocheck
	set url2part($s) "${s}:[set ${scheme}::schemepart]"
	# FRINK: nocheck
	append url "(${s}:[set ${scheme}::schemepart])|"
    }
    set url [string trimright $url |]
    return
}

# ::uri::split --
#
#	Splits the given <a url> into its constituents.
#
937
938
939
940
941
942
943




















944




945





946
947
948















































































































































949
950
951
# (RFC 2255)
# ------------------------------------------------
# scheme	basic syntax of scheme specific part
# ------------------------------------------------
# ldap		//<host>:<port>/<dn>?<attrs>?<scope>?<filter>?<extensions>
# ------------------------------------------------





















# FTP




namespace eval ::uri::ftp {





    variable	Type		{[AaDdIi]}
    variable	typepart	";type=(${Type})"
    # Used elsewhere: typepart















































































































































}

package provide uri 1.2.7







>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>

>
>
>
>
|
>
>
>
>
>
|
|

>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>



1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
# (RFC 2255)
# ------------------------------------------------
# scheme	basic syntax of scheme specific part
# ------------------------------------------------
# ldap		//<host>:<port>/<dn>?<attrs>?<scope>?<filter>?<extensions>
# ------------------------------------------------


# ------------------------------------------------------------------------------
#     IMPORTANT WARNINGS
# ------------------------------------------------------------------------------
# (1) THE PATTERNS DEFINED BELOW (with one exception) ARE NOT USED FOR PARSING
#     URLS BY ANY OF THIS PACKAGE'S COMMANDS.
# (2) THAT EXCEPTION IS THE VARIABLE ::uri::ftp::typepart
# (3) AS LONG AS THAT VARIABLE IS ASSIGNED THE CORRECT VALUE, ALL THE
#     uri::register CALLS CAN BE DELETED WITHOUT AFFECTING THE uri::* COMMANDS.
# (2) REGISTRATION OF A SCHEME DOES NOT IMPLEMENT COMMANDS FOR THAT SCHEME.
# (3) REGISTRATION OF A SCHEME IS NOT NECESSARY TO IMPLEMENT COMMANDS FOR THAT
#     SCHEME.
#     Instead:
# (4) THE PATTERNS ARE FOR REFERENCE, AND CAN BE ACCESSED VIA THESE NAMESPACE
#     VARIABLES, OR IN SOME CASES VIA VARIABLES MAINTAINED BY uri::register.
# (5) THE VARIABLES schemepart AND url ARE MENTIONED IN THE DOCUMENTATION.
# (6) UNDOCUMENTED VARIABLES MIGHT BE ACCESSED BY THIRD-PARTY CODE.
# (7) THEREFORE EVERYTHING IS RETAINED FOR BACKWARD COMPATIBILITY.
# ------------------------------------------------------------------------------

# FTP
uri::register ftp {
    # Please read the warnings above.
    variable escape [set [namespace parent [namespace current]]::basic::escape]
    variable login  [set [namespace parent [namespace current]]::basic::login]

    variable	charN	{[a-zA-Z0-9$_.+!*'(,)?:@&=-]}
    variable	char	"(${charN}|${escape})"
    variable	segment	"${char}*"
    variable	path	"${segment}(/${segment})*"

    variable	type		{[AaDdIi]}
    variable	typepart	";type=(${type})"
    # Used elsewhere: typepart

    variable	schemepart	\
		    "//${login}(/${path}(${typepart})?)?"

    variable	url		"ftp:${schemepart}"
}

# FILE
uri::register file {
    # Please read the warnings above.
    variable	host [set [namespace parent [namespace current]]::basic::host]
    variable	path [set [namespace parent [namespace current]]::ftp::path]

    variable	schemepart	"//(${host}|localhost)?/${path}"
    variable	url		"file:${schemepart}"
}

# HTTP
uri::register http {
    # Please read the warnings above.
    variable	escape \
        [set [namespace parent [namespace current]]::basic::escape]
    variable	hostOrPort	\
        [set [namespace parent [namespace current]]::basic::hostOrPort]

    variable	charN		{[a-zA-Z0-9$_.+!*'(,);:@&=-]}
    variable	char		"($charN|${escape})"
    variable	segment		"${char}*"

    variable	path		"${segment}(/${segment})*"
    variable	search		$segment
    variable	schemepart	\
	    "//${hostOrPort}(/${path}(\\?${search})?)?"

    variable	url		"http:${schemepart}"
}

# GOPHER
uri::register gopher {
    # Please read the warnings above.
    variable	xChar \
        [set [namespace parent [namespace current]]::basic::xChar]
    variable	hostOrPort \
        [set [namespace parent [namespace current]]::basic::hostOrPort]
    variable	search \
        [set [namespace parent [namespace current]]::http::search]

    variable	type		$xChar
    variable	selector	"$xChar*"
    variable	string		$selector
    variable	schemepart	\
	    "//${hostOrPort}(/(${type}(${selector}(%09${search}(%09${string})?)?)?)?)?"
    variable	url		"gopher:${schemepart}"
}

# MAILTO
uri::register mailto {
    # Please read the warnings above.
    variable xChar [set [namespace parent [namespace current]]::basic::xChar]
    variable host  [set [namespace parent [namespace current]]::basic::host]

    variable schemepart	"$xChar+(@${host})?"
    variable url	"mailto:${schemepart}"
}

# NEWS
uri::register news {
    # Please read the warnings above.
    variable escape [set [namespace parent [namespace current]]::basic::escape]
    variable alpha  [set [namespace parent [namespace current]]::basic::alpha]
    variable host   [set [namespace parent [namespace current]]::basic::host]

    variable	aCharN		{[a-zA-Z0-9$_.+!*'(,);/?:&=-]}
    variable	aChar		"($aCharN|${escape})"
    variable	gChar		{[a-zA-Z0-9$_.+-]}
    variable	newsgroup-name	"${alpha}${gChar}*"
    variable	message-id	"${aChar}+@${host}"
    variable	schemepart	"\\*|${newsgroup-name}|${message-id}"
    variable	url		"news:${schemepart}"
}

# WAIS
uri::register wais {
    # Please read the warnings above.
    variable	uChar \
        [set [namespace parent [namespace current]]::basic::xChar]
    variable	hostOrPort \
        [set [namespace parent [namespace current]]::basic::hostOrPort]
    variable	search \
        [set [namespace parent [namespace current]]::http::search]

    variable	db		"${uChar}*"
    variable	type		"${uChar}*"
    variable	path		"${uChar}*"

    variable	database	"//${hostOrPort}/${db}"
    variable	index		"//${hostOrPort}/${db}\\?${search}"
    variable	doc		"//${hostOrPort}/${db}/${type}/${path}"

    #variable	schemepart	"${doc}|${index}|${database}"

    variable	schemepart \
	    "//${hostOrPort}/${db}((\\?${search})|(/${type}/${path}))?"

    variable	url		"wais:${schemepart}"
}

# PROSPERO
uri::register prospero {
    # Please read the warnings above.
    variable	escape \
        [set [namespace parent [namespace current]]::basic::escape]
    variable	hostOrPort \
        [set [namespace parent [namespace current]]::basic::hostOrPort]
    variable	path \
        [set [namespace parent [namespace current]]::ftp::path]

    variable	charN		{[a-zA-Z0-9$_.+!*'(,)?:@&-]}
    variable	char		"(${charN}|$escape)"

    variable	fieldname	"${char}*"
    variable	fieldvalue	"${char}*"
    variable	fieldspec	";${fieldname}=${fieldvalue}"

    variable	schemepart	"//${hostOrPort}/${path}(${fieldspec})*"
    variable	url		"prospero:$schemepart"
}

# LDAP
uri::register ldap {
    # Please read the warnings above.
    variable	hostOrPort \
        [set [namespace parent [namespace current]]::basic::hostOrPort]

    # very crude parsing
    variable	dn		{[^?]*}
    variable	attrs		{[^?]*}
    variable	scope		"base|one|sub"
    variable	filter		{[^?]*}
    # extensions are not handled yet

    variable	schemepart	"//${hostOrPort}(/${dn}(\?${attrs}(\?(${scope})(\?${filter})?)?)?)?"
    variable	url		"ldap:$schemepart"
}

package provide uri 1.2.7

Changes to modules/uri/urn-scheme.tcl.

119
120
121
122
123
124
125
126
127
128
129
130
131
132

133
134
135
136
137
138
139
        # Recode the array of utf-8 bytes to the proper internal rep.
        return [encoding convertfrom utf-8 $result]
    }
}

# -------------------------------------------------------------------------

namespace eval ::uri::urn {
	variable NIDpart {[a-zA-Z0-9][a-zA-Z0-9-]{0,31}}
        variable esc {%[0-9a-fA-F]{2}}
        variable trans {a-zA-Z0-9$_.+!*'(,):=@;-}
        variable NSSpart "($esc|\[$trans\])+"
        variable URNpart "($NIDpart):($NSSpart)"
        # Used elsewhere: NIDpart trans URNpart

}

# -------------------------------------------------------------------------

package provide uri::urn 1.0.3

# -------------------------------------------------------------------------







|





|
>







119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
        # Recode the array of utf-8 bytes to the proper internal rep.
        return [encoding convertfrom utf-8 $result]
    }
}

# -------------------------------------------------------------------------

::uri::register {urn URN} {
	variable NIDpart {[a-zA-Z0-9][a-zA-Z0-9-]{0,31}}
        variable esc {%[0-9a-fA-F]{2}}
        variable trans {a-zA-Z0-9$_.+!*'(,):=@;-}
        variable NSSpart "($esc|\[$trans\])+"
        variable URNpart "($NIDpart):($NSSpart)"
        variable schemepart $URNpart
	variable url "urn:$NIDpart:$NSSpart"
}

# -------------------------------------------------------------------------

package provide uri::urn 1.0.3

# -------------------------------------------------------------------------