Ticket UUID: 219233
Title: string match oddity when - is the last char
Type: Bug | Version: obsolete: 8.3
Submitter: nobody | Created on: 2000-10-26 05:03:56
Subsystem: 18. Commands M-Z | Assigned To: dkf
Priority: 7 High | Severity: Minor
Status: Open | Last Modified: 2018-05-07 22:03:22
Resolution: Remind | Closed By: nobody
Closed on:
Description:
OriginalBugID: 4438 Bug
Version: 8.3
SubmitDate: '2000-03-21'
LastModified: '2000-04-03'
Severity: MED
Status: Assigned
Submitter: techsupp
ChangedBy: hobbs
OS: Windows 95
OSVersion: OSR2
FixedDate: '2000-10-25'
ClosedDate: '2000-10-25'

Name: Keith Lea

ReproducibleScript:

    % string match {[a-z0-9_/-]} \\
    1
    % string match {[a-z0-9_/]} \\
    0

It is accidentally interpreting the "/-]" as "/-]]", taking the last ] as both the end of the range and the end of the bracket block.
-- 04/03/2000 hobbs

This needs to be fixed in Tcl_String(Case)Match in tclUtil.c, but wait until 8.4 just in case someone was counting on the previous perverse behavior.
-- 04/03/2000 hobbs
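The diagnosis above can be checked with simple code-point arithmetic. The sketch below assumes the matcher compares range endpoints by character code, which is consistent with the reported results. The helper name `misread_range_contains` is hypothetical and exists only for this illustration.

```python
# Code points of the characters involved in the report.
LO = ord("/")    # 47, start of the misread range
HI = ord("]")    # 93, the "]" that was meant to close the set


def misread_range_contains(ch: str) -> bool:
    """True if ch falls in the range "/".."]" (hypothetical helper)."""
    return LO <= ord(ch) <= HI


# The backslash has code point 92, so it lies inside the misread range.
print(ord("\\"), misread_range_contains("\\"))
# A lowercase letter such as "a" (97) lies outside it.
print(ord("a"), misread_range_contains("a"))
```

Because the backslash sits between "/" and "]", the first pattern in the script matches it even though no backslash appears in the set as written.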
User Comments:

sebres added on 2018-05-07 22:03:22:
If behavior like the following is acceptable, I can back-port it from my own branches (I assume it is relatively easy):

    % string match {[.-B} A
    invalid match pattern: brackets [] not balanced

Otherwise, I would like to know exactly how it should then be "fixed" (after the possible refactoring). For example, another variant of the "fix" could be:

    % string match {[.-B} A
    0
    % string match {[.-B} \[.-A
    0
    % string match {[.-B} \[.-B
    1

That is, with unbalanced brackets the pattern {[.-B} would be equivalent to {\[.-B}.

In any case, the current implementation is definitely wrong:

    % string match {[.-B} A
    1

This should result in either an error or 0.

dgp added on 2018-05-07 16:21:29:
If someone wants to pursue a "refactor first, then fix" strategy, I can accept that.

dgp added on 2018-05-07 16:19:56:
"Wait until 9.0" is no longer a blocker. This should be fixed at least on the trunk, if not on earlier branches too.

gneumann added on 2010-11-18 16:52:04:
I was hit by apparently the same problem.

    % string match {[-.]} -
    1
    % string match {[.-]} -
    0
    # even worse
    % string match {[.-]} A
    1

This is certainly unexpected behavior and should at least be documented.

dkf added on 2007-06-14 21:50:12:
Logged In: YES user_id=79902 Originator: NO
Comments below indicate "wait until 9.0", a strategy that came mainly from Jeff.

matzek added on 2007-06-14 21:00:55:
Logged In: YES user_id=330806 Originator: NO
I forgot to mention that the example below is taken from a recent Tcl interpreter (8.4.14)...

matzek added on 2007-06-14 20:56:20:
Logged In: YES user_id=330806 Originator: NO
Hi! Here is another example to add to the list:

    % string match {*[.-]*} "2.1-Beta"
    0
    % string match {*[-.]*} "2.1-Beta"
    1
    % string match {*[.-]*} "Beta-2.1"
    1
    % string match {*[-.]*} "Beta-2.1"
    1

If this is not going to be fixed soon, I suggest at least documenting this behavior. It could be enough to just add the hyphen to the list of characters that need to be escaped...
kind regards -- Matthias Kraft

dkf added on 2003-11-24 23:17:10:
Logged In: YES user_id=79902
Backslashes aren't special inside the square-bracket term, so the first match doesn't match what you are expecting and the second match isn't looking for what you think it is:

    % set str \\
    \
    % string match {[\[]} $str
    1
    % set str {\]}
    \]
    % string match {[\]]} $str
    1

The glob-matching engine used by [string match] isn't very smart... :^( Consider using regular expressions instead.

lupylucke added on 2003-11-23 15:20:03:
Logged In: YES user_id=915599
I encountered a problem with `string match' too. I suppose it is ultimately the same bug: it is possible to match a [ in a character set by quoting it with \. This should work with ] too, but it doesn't!

    % string match {[\[]} {[}
    1
    % string match {[\]]} {]}
    0

dkf added on 2001-03-17 03:57:56:
Logged In: YES user_id=79902
This behaviour won't be changed before 9.0. Any fixes that *are* done must be applied to code in both tclUtil.c and tclUtf.c.

dkf added on 2001-02-23 17:24:41:
See Patch #103932
https://sourceforge.net/patch/?func=detailpatch&patch_id=103932&group_id=10894

dkf added on 2001-02-15 23:32:37:
Apparently, the way [string match] handles syntactically invalid patterns is by failing to match anything at all. It's not entirely clear to me that this is an optimal strategy...

dkf added on 2000-11-24 20:18:50:
Improved detection of bug:

    % string match \[a-] ]
    1
    % string match \[a-]x ]x
    0

dkf added on 2000-11-24 18:35:30:
Hmm. The problem seems to be that the first pattern is actually malformed by the rules of [string match], but there is no way to indicate this. I suppose the correct way of dealing with this is to decide that we were not really matching a range after all, but that's not very good at all. Either that, or we state that a malformed pattern matches nothing at all.

Hmm. On successful matching of a range, should we really back up a character at the unexpected end of the pattern, or should we fail at that point?
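The results reported in this ticket are all consistent with one simple reading of the matcher: inside brackets a "-" always introduces a range, even when the next character is the "]" meant to close the set, and a set that then finds no closing "]" silently swallows the rest of the pattern. The Python sketch below is a hypothetical model built only from the transcripts in this ticket. It is not the code in tclUtil.c, the name `legacy_match` is invented for illustration, and it covers only `*`, `?`, `[...]` and backslash escapes outside brackets.

```python
def legacy_match(pattern: str, string: str) -> bool:
    """Hypothetical model of the behavior reported in this ticket."""
    p = s = 0
    while True:
        if p == len(pattern):
            return s == len(string)
        c = pattern[p]
        if c == "*":
            rest = pattern[p:].lstrip("*")
            if not rest:
                return True
            # Try the rest of the pattern against every suffix of the string.
            return any(legacy_match(rest, string[i:])
                       for i in range(s, len(string) + 1))
        if s == len(string):
            return False
        if c == "?":
            p += 1
            s += 1
            continue
        if c == "[":
            p += 1
            ch = string[s]
            s += 1
            while True:
                # Reaching "]" or the end of the pattern: no member matched.
                if p == len(pattern) or pattern[p] == "]":
                    return False
                start = pattern[p]
                p += 1
                if p < len(pattern) and pattern[p] == "-":
                    # "-" always starts a range, so a following "]" becomes
                    # the range end instead of closing the set.
                    p += 1
                    if p == len(pattern):
                        return False
                    end = pattern[p]
                    p += 1
                    if start <= ch <= end or end <= ch <= start:
                        break
                elif start == ch:
                    break
            # Skip to the closing "]"; if there is none, stop at pattern end.
            while p < len(pattern) and pattern[p] != "]":
                p += 1
            p = min(p + 1, len(pattern))
            continue
        if c == "\\" and p + 1 < len(pattern):
            p += 1
            c = pattern[p]
        if c != string[s]:
            return False
        p += 1
        s += 1


print(legacy_match("[a-z0-9_/-]", "\\"))     # ticket reports 1
print(legacy_match("[.-]", "-"))             # ticket reports 0
print(legacy_match("*[.-]*", "2.1-Beta"))    # ticket reports 0
```

Under this model `{*[.-]*}` behaves like "any prefix, then one final character between . and ]", which would explain why "Beta-2.1" matches and "2.1-Beta" does not.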