Tcl Source Code

Ticket UUID: cef426ff2ceea3833a745335e5de798cadc354c6
Title: Encoding UTF-32 missing
Type: RFE
Version: 8.7
Submitter: oehhar
Created on: 2021-10-13 09:28:04
Subsystem: 44. UTF-8 Strings
Assigned To: jan.nijtmans
Priority: 5 Medium
Severity: Minor
Status: Closed
Last Modified: 2021-10-31 15:10:41
Resolution: Fixed
Closed By: jan.nijtmans
Closed on: 2021-10-31 15:10:41
Description:

Dear Tcl Team,

For me, the "ISO8859-11" encoding for Thai is missing.

It is used in barcoding, and it would be helpful to have it available.

Thank you, Harald

P.S.: I am very happy that TIP 547 has replaced the "unicode" encoding with the clearly defined UTF-16LE and UTF-16BE. May I ask why UTF-32LE and UTF-32BE were not defined in this TIP?

Personal note: I open a lot of issues without cleaning anything up myself, and I am sorry for that. Nevertheless, even identifying an issue is a good step.

User Comments: jan.nijtmans added on 2021-10-31 15:10:41:

Fixed [2c7852b42b0c31a5|here]. It will be in the next Tcl 8.7 alpha release.


oehhar added on 2021-10-13 18:42:55:

Dear Jan,

I have made tests with utf-32le and utf-32be. They both work like a charm.

Great work, thank you! Harald

P.S.: Again, a photo is attached for your pleasure.
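
To illustrate what distinguishes the two encodings tested above, here is a minimal C sketch of the UTF-32 byte layout: every code point occupies exactly four bytes, and only the byte order differs between utf-32le and utf-32be. The helpers `utf32_put` and `utf32_get` are hypothetical names chosen for this example; they are not Tcl functions and are not taken from the check-ins referenced in this ticket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of Tcl): store code point cp as one
 * 4-byte UTF-32 code unit, big-endian if bigEndian is non-zero,
 * little-endian otherwise.
 */
void
utf32_put(uint32_t cp, int bigEndian, unsigned char out[4])
{
    assert(out != NULL);
    if (bigEndian) {
        /* Most significant byte first. */
        out[0] = (unsigned char)((cp >> 24) & 0xFF);
        out[1] = (unsigned char)((cp >> 16) & 0xFF);
        out[2] = (unsigned char)((cp >> 8) & 0xFF);
        out[3] = (unsigned char)(cp & 0xFF);
    } else {
        /* Least significant byte first. */
        out[0] = (unsigned char)(cp & 0xFF);
        out[1] = (unsigned char)((cp >> 8) & 0xFF);
        out[2] = (unsigned char)((cp >> 16) & 0xFF);
        out[3] = (unsigned char)((cp >> 24) & 0xFF);
    }
}

/*
 * Hypothetical helper (not part of Tcl): read one 4-byte UTF-32 code
 * unit back into a code point, using the same byte-order flag.
 */
uint32_t
utf32_get(const unsigned char in[4], int bigEndian)
{
    assert(in != NULL);
    if (bigEndian) {
        return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
                | ((uint32_t)in[2] << 8) | (uint32_t)in[3];
    }
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8)
            | ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}
```

For example, the Thai character U+0E01 is stored as the bytes 01 0E 00 00 in little-endian order and as 00 00 0E 01 in big-endian order.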


oehhar added on 2021-10-13 13:59:01:

Thank you, great! Give me some time to test.

Thanks, Harald


jan.nijtmans added on 2021-10-13 13:43:52:

First shot here: [23f539bd7cbedf7c]. I think it's complete, but I haven't tested it thoroughly yet.


oehhar added on 2021-10-13 11:30:14:

Dear Jan,

Thank you for the hint about the recently added 8859-11. I tested it, and it works great! I will try to attach a photo for your pleasure.

As for UTF-32, I think it is reasonable to add it, especially because it is one flavor of the current "unicode" encoding.

Thank you for the light-speed action, Harald

P.S.: Another important issue is gb18030 of ticket [367bfdcf89].


jan.nijtmans added on 2021-10-13 10:54:41:

How about [0c4485f7357885ed]? ;-) That was added about 4 months ago and will be in 8.6.12.

UTF-32 could be added as well, but there are places in the Tcl code that assume only 1-byte and 2-byte encodings exist, e.g. here: https://core.tcl-lang.org/tcl/file?udc=1&ln=1062-1066&ci=803b50919336549c&name=generic%2FtclEncoding.c. All places where encoding->nullsize is used must be adapted too, so that they accept '4' as a possible value.

I changed the title, so we can add UTF-32 encodings later under this ticket.
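
The terminator-width point above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the code from generic/tclEncoding.c: the function `encoded_length` and its `nullSize` parameter are hypothetical names for this example. It shows why a scan that only knows 1-byte and 2-byte terminators is not enough once 4-byte code units are allowed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not Tcl's actual code): return the number of
 * bytes before the terminator of an encoded string. The terminator is
 * assumed to be nullSize consecutive zero bytes starting on a
 * code-unit boundary, with nullSize being 1, 2 or 4.
 */
size_t
encoded_length(const unsigned char *src, size_t nullSize)
{
    static const unsigned char zeros[4] = {0, 0, 0, 0};
    size_t i = 0;

    assert(src != NULL);
    /* Accept 4 as a valid width in addition to 1 and 2. */
    assert(nullSize == 1 || nullSize == 2 || nullSize == 4);

    /*
     * Advance one whole code unit at a time, so that zero bytes inside
     * a code unit (such as the padding bytes of 'A' in UTF-32) are not
     * mistaken for the end of the string.
     */
    while (memcmp(src + i, zeros, nullSize) != 0) {
        i += nullSize;
    }
    return i;
}
```

Scanning the UTF-32LE bytes of "AB" with a width of 4 yields 8 bytes, whereas scanning the same buffer with a width of 1 stops after the first byte, because the second byte of 'A' is already zero.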


Attachments: