Tcl Source Code

Ticket UUID: 219314
Title: tcl8.2.2 can not handle iso2022-jp strings
Type: Bug
Version: obsolete: 8.2.2
Submitter: nobody
Created on: 2000-10-26 05:10:33
Subsystem: 10. Objects
Assigned To: hobbs
Priority: 7 High
Severity:
Status: Closed
Last Modified: 2002-03-05 04:59:06
Resolution: Fixed
Closed By: hobbs
Closed on: 2002-03-04 21:59:06
Description:
OriginalBugID: 3670 Bug
Version: 8.2.2
SubmitDate: '1999-11-23'
LastModified: '1999-12-09'
Severity: CRIT
Status: UnAssn
Submitter: techsupp
ChangedBy: hobbs
OS: BSD
FixedDate: '2000-10-25'
ClosedDate: '2000-10-25'


Name:
Taguchi,Takeshi

ReproducibleScript:
encoding system iso2022-jp
fconfigure stdin -encoding iso2022-jp
fconfigure stdout -encoding iso2022-jp
# Ok, I think we can input iso2022-jp string  ...
set a {Some_ISO2022-JP_String}
EscapeToUtfProc: Invalid sub table
Abort trap (core dumped)

ObservedBehavior:
(gdb) where
#0  0x2812b4c4 in kill () from /usr/lib/libc.so.3
#1  0x2815f93f in abort () from /usr/lib/libc.so.3
#2  0x280aec5e in Tcl_PanicVA () from /usr/local/lib/libtcl8.2.so
#3  0x280aec84 in Tcl_Panic () from /usr/local/lib/libtcl8.2.so
#4  0x280936e5 in Tcl_FindExecutable () from /usr/local/lib/libtcl8.2.so
#5  0x2809327d in Tcl_FindExecutable () from /usr/local/lib/libtcl8.2.so
#6  0x28091d7b in Tcl_ExternalToUtf () from /usr/local/lib/libtcl8.2.so
#7  0x280a2233 in Tcl_GetsObj () from /usr/local/lib/libtcl8.2.so
#8  0x280a1da4 in Tcl_GetsObj () from /usr/local/lib/libtcl8.2.so
#9  0x280aa6c1 in Tcl_Main () from /usr/local/lib/libtcl8.2.so
#10 0x8048529 in main (argc=1, argv=0xbfbfd9f0) at ./../unix/tclAppInit.c:83
#11 0x80484a5 in _start ()

DesiredBehavior:
In interactive mode, I want to be able to input multibyte strings in the system encoding.
I think tcl8.2 should be able to do this.
User Comments:

hobbs added on 2002-03-05 04:59:06:
Logged In: YES 
user_id=72656

See http://sourceforge.net/tracker/?func=detail&aid=474358&group_id=10894&atid=310894 for resolution.

hobbs added on 2002-03-05 04:51:43:
This wasn't really fixed by the noted patches.  It was a 
couple of things.  The escapes needed to be fixed, but also 
the finalization of encodings wasn't correctly handling 
refcounts of encodings.  This is fixed for 8.4a4 and 8.3.4+.
See also bug 524674 and patch 474358.

andreas_kupries added on 2002-01-22 04:57:56:
Logged In: YES 
user_id=75003

I applied this patch to the current state of 8.3.4 and 
8.4cvs head. In both cases there are three encoding-related 
tests which will fail when the testsuite is run. See below. 

I am not well versed enough in this area to know if the 
change should make the tests fail (and thus the tests have 
to be updated) or if the failure points to a bug in the 
patch itself.

Because of this I believe that the patch as it is now is 
not applicable. If the tests have to be changed the patch 
should contain the updates to the testsuite.

==== encoding-11.5 LoadEncodingFile: escape file FAILED
==== Contents of test case:

    encoding convertto iso2022 \u4e4e

---- Result was:
ESC$@8CESC(B
---- Result should have been:
ESC(BESC$@8C
==== encoding-11.5 FAILED


==== encoding-13.1 LoadEscapeTable FAILED
==== Contents of test case:

    set x [encoding convertto iso2022 ab\u4e4e\u68d9g]

---- Result was:
abESC$@8CESC$(DD%ESC(Bg
---- Result should have been:
ESC(BabESC$@8CESC$(DD%ESC(Bg
==== encoding-13.1 FAILED

==== io-1.8 Tcl_WriteChars: WriteChars FAILED
==== Contents of test case:

    # This test written for SF bug #506297.
    #
    # Executing this test without the fix for the referenced bug
    # applied to tcl will cause tcl, more specifically WriteChars, to
    # go into an infinite loop.

    set f [open test2 w]
    fconfigure      $f -encoding iso2022-jp
    puts -nonewline $f [format %s%c [string repeat " " 4] 12399]
    close           $f
    contents test2

---- Result was:
    ESC$@$OESC(B
---- Result should have been:
ESC(B    ESC$@$O
==== io-1.8 FAILED

taguchiv6 added on 2001-10-25 16:09:33:
Logged In: YES 
user_id=357728

I've uploaded a patch as #474358.
It seems to work.
But I do not fully understand tclEncoding.c,
so I am afraid this patch may contain bugs.

Thanks.
---
Taguchi,T.

hobbs added on 2001-10-19 01:38:33:
The problem seems to be in a recursive need to access file 
encodings.

Once we are switched into the iso2022-jp encoding for the 
system, when we need to convert a string, it goes through 
tclEncoding.c:GetTableEncoding, which will load the 
encoding when necessary.  When the first escape encoding 
for jis0201 needs to be loaded, we have a problem because 
the system finds the jis0201.enc file, but wants to convert 
that name since the system thinks everything (including 
system file names) is in the iso2022-jp encoding.

We need to fix this, but it also looks like the general 
encoding system idea may not be what is necessary.
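
The re-entrancy described above can be pictured with a toy model. This is only a sketch: the procs convertWithSystemEncoding and loadSubTable and the arrays loaded and loading are hypothetical names invented for illustration, and they do not mirror the actual code in tclEncoding.c. The model assumes that loading a sub-table requires converting its file name through the system encoding, which in turn needs that same sub-table.

```tcl
# Toy model only; none of these names exist in Tcl itself.

proc convertWithSystemEncoding {text} {
    # An escape-driven encoding delegates to sub-tables that are
    # loaded lazily on first use.
    loadSubTable jis0201
    return $text
}

proc loadSubTable {name} {
    global loaded loading
    if {[info exists loaded($name)]} {
        return
    }
    if {[info exists loading($name)]} {
        # We were asked for a table that is still being loaded.
        error "re-entered while loading $name"
    }
    set loading($name) 1
    # Opening the .enc file converts its name via the system encoding,
    # which needs the very table we are loading.
    set code [catch {convertWithSystemEncoding "$name.enc"} msg]
    unset loading($name)
    if {$code} {
        error $msg
    }
    set loaded($name) 1
}

# The first conversion attempt runs straight into the cycle.
set code [catch {convertWithSystemEncoding "hello"} msg]
puts "code=$code msg=$msg"
```

In this model the guard turns the cycle into an ordinary error rather than a crash. Whether the real fix uses a guard of this kind is not stated in this ticket.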

hobbs added on 2001-10-19 01:11:06:
OK, if I translate this right, this can be massaged like so:

set a "\u4e4e\u4e5e\u4e5f"
set b [encoding convertto iso2022-jp $a]

This makes b == "\x1b(B\x1b$@8C8pLi"

Looking at iso2022-jp.enc, that means there is a signal
to use iso8859-1 immediately followed by the signal to
use jis0208.  This is confirmed when I get the same
chars in tkcon on Windows by just getting the value of:

encoding convertfrom jis0208 8C8pLi

I'm now trying to figure out why the escape driven encoding 
doesn't work right...
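
To make the escape structure of the bytes quoted above easier to see, here is a minimal sketch that splits such a string at each ESC and labels the designator that follows. The proc name describeEscapes and the designator table are assumptions made for illustration. The labels follow RFC 1468 and RFC 1554 rather than Tcl's .enc files, which the comment above describes as mapping ESC ( B to iso8859-1.

```tcl
# Hypothetical helper: break an ISO-2022-JP style byte string into
# {label payload} pairs, one per escape-introduced segment.
proc describeEscapes {bytes} {
    # Designator names per RFC 1468 / RFC 1554 (illustrative table).
    set designators [list \
        {$(D} {JIS X 0212-1990} \
        {$@}  {JIS X 0208-1978} \
        {$B}  {JIS X 0208-1983} \
        {(B}  {ASCII} \
        {(J}  {JIS X 0201 Roman}]

    set segments [split $bytes \x1b]
    # Anything before the first ESC is in the initial state.
    set result [list [list initial [lindex $segments 0]]]
    foreach seg [lrange $segments 1 end] {
        set label unknown
        set payload $seg
        foreach {prefix name} $designators {
            if {[string first $prefix $seg] == 0} {
                set label $name
                set payload [string range $seg [string length $prefix] end]
                break
            }
        }
        lappend result [list $label $payload]
    }
    return $result
}

# The byte string reported above for b.
foreach pair [describeEscapes "\x1b(B\x1b\$@8C8pLi"] {
    puts $pair
}
```

For the string above this should report an empty initial segment, an empty ASCII segment, and a JIS X 0208-1978 segment carrying 8C8pLi, which matches the reading that one designator is immediately followed by another.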

hobbs added on 2001-10-19 00:40:33:
This is now confirmed with the latest script from Taguchi:

----8<----8<----8<----8<----
#!/usr/local/bin/tclsh8.4
encoding system iso2022-jp
set a "\u4e4e\u4e5e\u4e5f"; # String with 3 Kanji chars
puts $a
----8<----8<----8<----8<----

That needs to be run as a script to trigger the bug.  
Occurs in 8.3.4cvs and 8.4a4cvs.
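
Since the script has to be run from a file, one way to try it without risking the calling interpreter is to run it in a child process. The following is only a sketch: the file name crash-219314-sketch.tcl is made up for illustration, and the child is assumed to be the same executable as the caller. No particular outcome is implied; on a fixed build the child should simply exit.

```tcl
# Write the reproduction script to a scratch file (hypothetical name).
set script [file join [pwd] crash-219314-sketch.tcl]
set f [open $script w]
puts $f {encoding system iso2022-jp}
puts $f {set a "\u4e4e\u4e5e\u4e5f"; # String with 3 Kanji chars}
puts $f {puts $a}
close $f

# Run it in a separate interpreter so a panic there cannot take
# down this one; catch reports a non-zero status on abnormal exit.
set status [catch {exec [info nameofexecutable] $script} output]
file delete $script

puts "child status: $status"
```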

hobbs added on 2001-10-15 01:52:22:

File Added - 11957: crash-219314.tcl

hobbs added on 2001-10-15 01:52:21:
This was confirmed by taguchi at tohoku.iij.ad.jp to still 
crash on his configuration, but I cannot repeat it.  The 
attached script is supposed to crash.

hobbs added on 2001-10-13 03:04:12:
We would need to know exactly what the string was that 
caused the problem to be able to reproduce it.
