Tcl Source Code

2013-06-18
07:50 Closed ticket [a876646efe]: re_expr character class cntrl: should contain \u0000 - \u001f plus 4 other changes artifact: 5ce43b8ff9 user: jan.nijtmans
07:50 Fix uniClass tool which was the real cause for [a876646efe], and add test-case for it. check-in: 6aa9adc7fc user: jan.nijtmans tags: trunk
07:47 Ticket [a876646efe] re_expr character class cntrl: should contain \u0000 - \u001f status still Open with 3 other changes artifact: a898acac1f user: jan.nijtmans
07:43 Fix uniClass tool which was the real cause for [a876646efe], and add test-case for it. check-in: f604a21bb0 user: jan.nijtmans tags: core-8-5-branch
2013-06-17
05:16 Ticket [a876646efe] re_expr character class cntrl: should contain \u0000 - \u001f status still Open with 3 other changes artifact: 3395c2f732 user: jan.nijtmans
04:54 Fix [a876646efe]: re_expr character class [:cntrl:] should contain \u0000 - \u001f check-in: 89b05343cb user: jan.nijtmans tags: trunk
04:52 Fix [a876646efe]: re_expr character class [:cntrl:] should contain \u0000 - \u001f check-in: cc1a71b4e5 user: jan.nijtmans tags: core-8-5-branch
04:50 Ticket [a876646efe] re_expr character class cntrl: should contain \u0000 - \u001f status still Open with 4 other changes artifact: fa040a5149 user: jan.nijtmans
2013-06-15
13:00 Ticket [a876646efe]: 4 changes artifact: 00b3d27b2a user: oehhar
2013-06-14
16:05 New ticket [a876646efe]. artifact: 543ae5df6c user: anonymous

Ticket UUID: a876646efe2220eb7990d9df248115a9ce6e7471
Title: re_expr character class [:cntrl:] should contain \u0000 - \u001f
Type: Bug
Version: 8.6.0
Submitter: anonymous
Created on: 2013-06-14 16:05:19
Subsystem: 43. Regexp
Assigned To: jan.nijtmans
Priority: 5 Medium
Severity: Minor
Status: Closed
Last Modified: 2013-06-18 07:50:57
Resolution: Fixed
Closed By: jan.nijtmans
Closed on: 2013-06-18 07:50:57
Description:

IMHO the character class [:cntrl:] should contain characters \u0000 to \u001f and \u007f to \u009f

http://en.wikipedia.org/wiki/Unicode_control_characters

As shown by the following script, it does not contain \u0000 to \u001f

% for {set x 0} {$x < 256} {incr x} {
    puts -nonewline "[format %02x $x]:[regexp {^[[:cntrl:]]$} [format %c $x]] "
}
00:0 01:0 02:0 03:0 04:0 05:0 06:0 07:0 08:0 09:0 0a:0 0b:0 0c:0 0d:0 0e:0 0f:0
10:0 11:0 12:0 13:0 14:0 15:0 16:0 17:0 18:0 19:0 1a:0 1b:0 1c:0 1d:0 1e:0 1f:0
20:0 21:0 22:0 23:0 24:0 25:0 26:0 27:0 28:0 29:0 2a:0 2b:0 2c:0 2d:0 2e:0 2f:0
30:0 31:0 32:0 33:0 34:0 35:0 36:0 37:0 38:0 39:0 3a:0 3b:0 3c:0 3d:0 3e:0 3f:0 
40:0 41:0 42:0 43:0 44:0 45:0 46:0 47:0 48:0 49:0 4a:0 4b:0 4c:0 4d:0 4e:0 4f:0 
50:0 51:0 52:0 53:0 54:0 55:0 56:0 57:0 58:0 59:0 5a:0 5b:0 5c:0 5d:0 5e:0 5f:0 
60:0 61:0 62:0 63:0 64:0 65:0 66:0 67:0 68:0 69:0 6a:0 6b:0 6c:0 6d:0 6e:0 6f:0 
70:0 71:0 72:0 73:0 74:0 75:0 76:0 77:0 78:0 79:0 7a:0 7b:0 7c:0 7d:0 7e:0 7f:1 
80:1 81:1 82:1 83:1 84:1 85:1 86:1 87:1 88:1 89:1 8a:1 8b:1 8c:1 8d:1 8e:1 8f:1 
90:1 91:1 92:1 93:1 94:1 95:1 96:1 97:1 98:1 99:1 9a:1 9b:1 9c:1 9d:1 9e:1 9f:1 
a0:0 a1:0 a2:0 a3:0 a4:0 a5:0 a6:0 a7:0 a8:0 a9:0 aa:0 ab:0 ac:0 ad:1 ae:0 af:0 
b0:0 b1:0 b2:0 b3:0 b4:0 b5:0 b6:0 b7:0 b8:0 b9:0 ba:0 bb:0 bc:0 bd:0 be:0 bf:0 
c0:0 c1:0 c2:0 c3:0 c4:0 c5:0 c6:0 c7:0 c8:0 c9:0 ca:0 cb:0 cc:0 cd:0 ce:0 cf:0 
d0:0 d1:0 d2:0 d3:0 d4:0 d5:0 d6:0 d7:0 d8:0 d9:0 da:0 db:0 dc:0 dd:0 de:0 df:0 
e0:0 e1:0 e2:0 e3:0 e4:0 e5:0 e6:0 e7:0 e8:0 e9:0 ea:0 eb:0 ec:0 ed:0 ee:0 ef:0 
f0:0 f1:0 f2:0 f3:0 f4:0 f5:0 f6:0 f7:0 f8:0 f9:0 fa:0 fb:0 fc:0 fd:0 fe:0 ff:0

This is on Windows Vista 32 bit with Tcl 8.4.20, 8.5.14 and 8.6.0.

-Harald (oehhar)

User Comments: jan.nijtmans added on 2013-06-18 07:47:22:

Fixed uniClass tool and added test case in [f604a21bb0]


jan.nijtmans added on 2013-06-17 04:50:43:

Fixed in core-8-5-branch and trunk. Not closing yet because two things remain to be done: 1) add a test case, 2) fix the tool that generated this code.

The bug was introduced with commit [e9a619e9dc3cc5d6]. Before that, the Unicode "control" table was not even used in the code; the two ranges were hard-coded instead.
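
For anyone who wants to check a given build, here is a minimal sketch of a range check. It is not the test case that was actually committed; it only walks the two ranges named in the description (\u0000 - \u001f and \u007f - \u009f) and collects every code point that [:cntrl:] fails to match. The variable name `missing` is made up for this illustration.

```tcl
# Collect code points from the two expected ranges that [:cntrl:] does not match.
# The ranges come from the ticket description; "missing" is an illustrative name.
set missing {}
foreach {lo hi} {0x00 0x1f 0x7f 0x9f} {
    for {set x $lo} {$x <= $hi} {incr x} {
        if {![regexp {^[[:cntrl:]]$} [format %c $x]]} {
            lappend missing [format %02x $x]
        }
    }
}
puts "not matched by \[:cntrl:\]: $missing"
```

On an affected build the list should contain 00 through 1f, matching the output in the description; on a build that includes the fix it would be expected to come out empty.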