Ticket UUID: f9539dce52e2dac82090760757763521b583efde
Title: UTF-8 Source code not parsed correctly
Type: RFE | Version: 8.6.1 OSX 10.9.2
Submitter: samoc | Created on: 2014-07-10 03:53:25
Subsystem: 44. UTF-8 Strings | Assigned To: jan.nijtmans
Priority: 5 Medium | Severity: Minor
Status: Closed | Last Modified: 2021-03-18 14:52:22
Resolution: Fixed | Closed By: jan.nijtmans
Closed on: 2021-03-18 14:52:22
Description:
If I write "puts 😞", the output is "c3 b0 c2 9f c2 98 c2 9e 0a". The correct output would be "f0 9f 98 9e 0a". See http://www.charbase.com/1f61e-unicode-disappointed-face
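The reported bytes are consistent with each byte of the 4-byte UTF-8 sequence being read as a separate character and then encoded to UTF-8 again. The Python sketch below reproduces both byte strings from the description under that assumption. It illustrates the symptom only and is not Tcl's actual code path; `hex_bytes` is a hypothetical helper, and the trailing `0a` in the report is just the newline written by `puts`.

```python
# U+1F61E DISAPPOINTED FACE, the character used in the report.
face = "\U0001F61E"

# Correct UTF-8 encoding: a single 4-byte sequence.
correct = face.encode("utf-8")

# Assumed failure mode: each of the 4 bytes is taken as its own
# character (U+00F0, U+009F, U+0098, U+009E) and re-encoded as UTF-8.
# Decoding as Latin-1 maps every byte to the code point of equal value.
mangled = correct.decode("latin-1").encode("utf-8")


def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated lowercase hex (hypothetical helper)."""
    return " ".join(f"{b:02x}" for b in data)


print(hex_bytes(correct))  # f0 9f 98 9e
print(hex_bytes(mangled))  # c3 b0 c2 9f c2 98 c2 9e
```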
User Comments:
jan.nijtmans added on 2021-03-18 14:52:22:
This is fixed in Tcl 8.6.10

jan.nijtmans added on 2014-07-10 14:27:52:
At this moment, Characters > 0xffff are not supported. Adding this is ongoing work, being done in implementing TIP 389.

jan.nijtmans added on 2014-07-10 14:27:28:
At this moment, Characters > 0xffff are not supported. Adding this is ongoing work, being done in implementing TIP 389.

samoc added on 2014-07-10 04:05:42:
It seems (after following the source code from Tcl_SourceObjCmd() all the way through to Tcl_UniCharToUtf()) that "#define TCL_UTF_MAX 3" in tcl.h is the problem. At the very least there should either be very noticeable warnings in the documentation that say that "UTF-8 is not fully supported and your valid UTF-8 data may be silently corrupted" or there should be an error thrown when a valid UTF-8 character is encountered that is not supported.
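
To see why a 3-byte limit and the "> 0xffff" boundary in the comments line up, the sketch below computes how many bytes standard UTF-8 needs for a code point. The constant `MAX_BYTES` is a hypothetical stand-in for a limit like `TCL_UTF_MAX 3`, and `utf8_length` and `fits` are illustrative names, not Tcl functions. Surrogate code points are not treated specially here.

```python
MAX_BYTES = 3  # hypothetical stand-in for a per-character byte limit


def utf8_length(code_point: int) -> int:
    """Return the number of bytes standard UTF-8 uses for a code point."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    if code_point < 0x80:
        return 1
    if code_point < 0x800:
        return 2
    if code_point < 0x10000:
        return 3
    if code_point <= 0x10FFFF:
        return 4
    raise ValueError("code point is outside the Unicode range")


def fits(code_point: int, limit: int = MAX_BYTES) -> bool:
    """Check whether a code point's UTF-8 form fits within the byte limit."""
    return utf8_length(code_point) <= limit


print(utf8_length(0xFFFF), fits(0xFFFF))    # 3 True
print(utf8_length(0x1F61E), fits(0x1F61E))  # 4 False
```

With a limit of 3 bytes, the largest representable code point is U+FFFF, so U+1F61E from the description falls outside it.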
Attachments:
- utf8_test.tcl added by samoc on 2014-07-10 03:54:53.