Tcl Source Code

Ticket UUID: 3444754
Title: string tolower \u01c5 is wrong
Type: Bug
Version: None
Submitter: msteveb
Created on: 2011-11-29 03:42:34
Subsystem: 44. UTF-8 Strings
Assigned To: nijtmans
Priority: 5 Medium
Severity:
Status: Closed
Last Modified: 2011-12-08 02:53:57
Resolution: Fixed
Closed By: nijtmans
Closed on: 2011-12-07 19:53:57
Description:
\u01c5 is the title case variant: Dž

The lower case variant should be \u01c6 (dž). This works in 8.5.8, but 8.5.11 and 8.6b2 return \u01c5 (i.e. unchanged).
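
As an independent cross-check of the expected result, the same mappings can be looked up outside Tcl. The sketch below uses Python's built-in Unicode tables purely as a reference; it says nothing about how Tcl implements `string tolower`, and the variable names are illustrative.

```python
# Reference case mappings for U+01C5 (Dž), taken from Python's built-in
# Unicode tables. This is a cross-check, not Tcl's implementation.
title_dz = "\u01c5"

expected_lower = title_dz.lower()  # should be the lower case digraph
expected_upper = title_dz.upper()  # should be the upper case digraph

print(f"lower: U+{ord(expected_lower):04X}")
print(f"upper: U+{ord(expected_upper):04X}")
```

A correct `string tolower \u01c5` would be expected to agree with the lower case value shown here.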

Here is the relevant entry from http://unicode.org/Public/UNIDATA/UnicodeData.txt

01C5;LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON;Lt;0;L;<compat> 0044 017E;;;;N;LATIN LETTER CAPITAL D SMALL Z HACEK;;01C4;01C6;01C5
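
To make the entry easier to read, here is a minimal sketch that splits it into its semicolon-separated fields. The field positions used (12, 13 and 14 for the simple upper, lower and title case mappings) follow the published UnicodeData.txt layout; the dictionary keys are names chosen for this example.

```python
# Split the UnicodeData.txt record quoted above into its fields.
ENTRY = (
    "01C5;LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON;Lt;0;L;"
    "<compat> 0044 017E;;;;N;LATIN LETTER CAPITAL D SMALL Z HACEK;;"
    "01C4;01C6;01C5"
)

fields = ENTRY.split(";")

# Pick out the fields that matter for case conversion.
mapping = {
    "code": fields[0],
    "category": fields[2],   # Lt = titlecase letter
    "upper": fields[12],
    "lower": fields[13],
    "title": fields[14],
}

for key, value in mapping.items():
    print(f"{key:8} {value}")
```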
User Comments: nijtmans added on 2011-12-08 02:53:57:


Fix committed to all open branches, so it will appear in Tcl 8.5.12 and 8.6b3

nijtmans added on 2011-12-06 20:52:44:
Here is the fix (see attached patch): a single
number, 32931, should have been
32963 (line 754 of tclUniData.c).
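
The two numbers differ by exactly 32, which suggests a change in one small bit field rather than a different table entry. The sketch below decodes both values under an assumed layout: character category in bits 0-4 and a case-conversion selector in bits 5-7. That layout is an assumption about tclUniData.c, not something stated in this ticket, and the function and key names are made up for the example.

```python
OLD_GROUP = 32931  # value before the fix, as quoted above
NEW_GROUP = 32963  # corrected value, as quoted above


def decode(value):
    """Split a group value into bit fields.

    ASSUMPTION: bits 0-4 hold the character category and bits 5-7 hold a
    case-conversion selector; everything above bit 7 is kept as one number.
    """
    return {
        "category": value & 0x1F,
        "case_type": (value >> 5) & 0x7,
        "high_bits": value >> 8,
    }


old_fields = decode(OLD_GROUP)
new_fields = decode(NEW_GROUP)

print("difference:", NEW_GROUP - OLD_GROUP)
print("old:", old_fields)
print("new:", new_fields)
```

Under that assumption only the three-bit selector changes between the two values; the category and the remaining high bits are identical.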

Will check that in soon, together with the
updated uniParse.tcl which generates
this correctly.

nijtmans added on 2011-12-06 20:48:47:

File Added - 430129: tclUniData.c.diff

nijtmans added on 2011-12-05 21:32:05:
Compare the current UnicodeData.txt line with the earlier entry in Unicode 2.x:

01C5;LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON;Lt;0;L;<compat> 0044 017E;;;;N;LATIN LETTER CAPITAL D SMALL Z HACEK;;01C4;01C6;

So, the bug was introduced by a syntax change in the
UnicodeData.txt file, not by any change on the Tcl
side. uniParse.tcl handles the line differently
when the 'totitle' entry is filled.
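
The difference between the two records can be confirmed mechanically. This is a small sketch that splits both lines on semicolons and reports which field positions differ; the constant names are chosen for the example, and it does not reproduce what uniParse.tcl does with the data.

```python
# The Unicode 2.x record quoted in this comment (empty last field) ...
OLD_ENTRY = (
    "01C5;LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON;Lt;0;L;"
    "<compat> 0044 017E;;;;N;LATIN LETTER CAPITAL D SMALL Z HACEK;;"
    "01C4;01C6;"
)
# ... and the current record quoted in the ticket description.
NEW_ENTRY = OLD_ENTRY + "01C5"

old_fields = OLD_ENTRY.split(";")
new_fields = NEW_ENTRY.split(";")

# Indices of the fields whose contents differ between the two records.
changed = [
    index
    for index, (old, new) in enumerate(zip(old_fields, new_fields))
    if old != new
]

for index in changed:
    print(f"field {index}: {old_fields[index]!r} -> {new_fields[index]!r}")
```

Only the last field (index 14, the title case mapping) differs: it is empty in the old record and 01C5 in the new one.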

Other characters that changed the same way include
\u01cb and \u01f2 (as mentioned by Steve), and
there are many more.

OK, now I have all the information needed to fix this.

nijtmans added on 2011-12-04 16:09:59:
This bug was introduced earlier, on 2010-10-23,
with the upgrade to Unicode 6.0 (Bug 3085863);
it has no relation to 3393714.

nijtmans added on 2011-11-29 18:19:55:
Confirmed. Will have a look.

dkf added on 2011-11-29 16:30:41:
Probably related to the fix for 3393714.

msteveb added on 2011-11-29 10:54:42:
Ditto, \u01cb and \u01f2
