Tcl Source Code

Ticket UUID: 474358
Title: a patch for iso2022-jp problems
Type: Patch
Version: None
Submitter: taguchiv6
Created on: 2001-10-24 06:04:01
Subsystem: 44. UTF-8 Strings
Assigned To: hobbs
Priority: 8
Severity:
Status: Closed
Last Modified: 2002-03-05 04:59:23
Resolution: Fixed
Closed By: hobbs
Closed on: 2002-03-04 21:59:22
Description:
Here is a patch for the iso2022-jp related
problems (bug IDs 218099, 219283, 219314).

Sorry, I can't say this patch is good
enough,
but some test scripts seem to work.

Thanks.
---
Taguchi, Takeshi.
User Comments:

hobbs added on 2002-03-05 04:59:23:

File Added - 18798: encoding-474358-219314-524674.patch

hobbs added on 2002-03-05 04:59:22:

OK, I've added Mr. Taguchi's patch in addition to one that 
correctly handles refcounts when freeing encodings.  This 
seems to fix this bug, the noted bugs, as well as bug 
524674.  Attached is a patch with tests.  I am applying 
this for 8.4a4 and 8.3.4+.

yamako added on 2002-03-03 19:08:16:

Hi, 

I rewrote the iso2022-jp patch. The new patch fixes the following problems.

1. RFC1468 infraction (fixed by Mr. Taguchi)
 RFC1468 states that the text starts in ASCII [ASCII] and switches to Japanese characters as needed. It also states that the text must end in ASCII. For more information, see the Description section of RFC1468. I also modified encoding-11.5 and encoding-13.1 in encoding.test because their expected results are incorrect.

2. Tcl_GetsObj problem with iso2022-jp channel
 This problem seems to be fixed in CVS, but I include the fix in my patch so that it also applies to the tcl8.3.4 source.
 TCL_ENCODING_START must be turned off when Tcl_ExternalToUtf() is invoked in FilterInputBytes().

3. JISX0208 escape sequence problem
 This problem seems to be fixed in CVS, but I include the fix in my patch so that it also applies to the tcl8.3.4 source.
 There are two JISX0208 escape sequences: ESC$@ for JISX0208-1978 and ESC$B for JISX0208-1983. Tcl_UtfToExternal() should encode with JISX0208-1983. Therefore, it should use ESC$B rather than ESC$@.
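
A minimal sketch of what points 1 and 3 mean at the script level, assuming a Tcl build that already contains this fix (8.4 or later). showEsc is a hypothetical helper, not part of Tcl; it prints the ESC byte as the text "ESC", the same notation used in the test output in this ticket.

```tcl
# showEsc is a hypothetical helper: it renders the ESC control byte (0x1b)
# as the literal text "ESC" so the result is readable in a terminal.
proc showEsc {bytes} {
    return [string map [list \x1b ESC] $bytes]
}

# One kanji on its own: no leading ESC(B, JIS X 0208-1983 selected with
# ESC$B, and a trailing ESC(B so that the text ends in ASCII.
puts [showEsc [encoding convertto iso2022-jp \u4e4e]]

# ASCII before and after the kanji: the string starts directly in ASCII.
puts [showEsc [encoding convertto iso2022-jp ab\u4e4eg]]
```

On a build with the fix this should print ESC$B8CESC(B and then abESC$B8CESC(Bg. An unpatched 8.3.x build would be expected to show a leading ESC(B and ESC$@ instead.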


You can download this patch from:
http://www3.ocn.ne.jp/~yamako/tcl/iso2022-jp.tcl834.2002mar03p1.patch

hobbs added on 2002-03-02 10:48:02:

File Added - 18691: encoding.patch

I've attached a slight variant of the patch from Keiichi 
(perhaps from Takeshi originally).  It also stops placing 
ESC(B at the beginning of strings.  I'm not sure if 
this is more or less correct than Takeshi's version, which 
only adds it at the end.  If we can settle this, we 
should be able to apply it.  I'm going on the assumption 
that the Japanese Tcl users know a lot more about Japanese 
encodings than I do.  :)

This was originally bug 524663, but I closed that out when 
I realized these were so close.

andreas_kupries added on 2002-01-22 04:58:48:

I applied this patch to the current state of 8.3.4 and 
8.4cvs head. In both cases there are three encoding-related 
tests which will fail when the testsuite is run. See below. 

I am not well versed enough in this area to know if the 
change should make the tests fail (and thus the tests have 
to be updated) or if the failure points to a bug in the 
patch itself.

Because of this I believe that the patch as it is now is 
not applicable. If the tests have to be changed the patch 
should contain the updates to the testsuite.

==== encoding-11.5 LoadEncodingFile: escape file FAILED
==== Contents of test case:

    encoding convertto iso2022 \u4e4e

---- Result was:
ESC$@8CESC(B
---- Result should have been:
ESC(BESC$@8C
==== encoding-11.5 FAILED


==== encoding-13.1 LoadEscapeTable FAILED
==== Contents of test case:

    set x [encoding convertto iso2022 ab\u4e4e\u68d9g]

---- Result was:
abESC$@8CESC$(DD%ESC(Bg
---- Result should have been:
ESC(BabESC$@8CESC$(DD%ESC(Bg
==== encoding-13.1 FAILED

==== io-1.8 Tcl_WriteChars: WriteChars FAILED
==== Contents of test case:

    # This test written for SF bug #506297.
    #
    # Executing this test without the fix for the referenced bug
    # applied to tcl will cause tcl, more specifically WriteChars, to
    # go into an infinite loop.

    set f [open test2 w]
    fconfigure      $f -encoding iso2022-jp
    puts -nonewline $f [format %s%c [string repeat " " 4] 12399]
    close           $f
    contents test2

---- Result was:
    ESC$@$OESC(B
---- Result should have been:
ESC(B    ESC$@$O
==== io-1.8 FAILED
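
The same behaviour can be checked at the channel level, which also exercises line-by-line reading with gets on an iso2022-jp channel. This is only a sketch under assumptions: the scratch file name is hypothetical, and a Tcl build with the encoding fixes applied is assumed.

```tcl
# Hypothetical scratch file in the current directory.
set path [file join [pwd] iso2022jp-sketch.txt]

# Write two lines that mix ASCII and a Japanese character (U+306F).
set f [open $path w]
fconfigure $f -encoding iso2022-jp
puts $f "ab\u306f"
puts $f "\u306fcd"
close $f

# Read the lines back with gets through the same encoding.
set f [open $path r]
fconfigure $f -encoding iso2022-jp
set lines {}
while {[gets $f line] >= 0} {
    lappend lines $line
}
close $f
file delete $path

# Both lines should survive the round trip unchanged.
puts [expr {$lines eq [list "ab\u306f" "\u306fcd"] ? "round trip ok" : "round trip FAILED"}]
```

Reading the file with -translation binary instead would show the raw escape sequences, which is what io-1.8 compares against.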

taguchiv6 added on 2001-10-26 11:39:43:

File Added - 12495: tcl-patch-iso2022.v2

I found a problem in the old patch:
it still adds unneeded escape sequences to the tail of the
string.

Here is a new patch.
I think this one will resolve many problems with
escape-driven encodings.

taguchiv6 added on 2001-10-24 13:10:02:

File Added - 12418: tcl-iso2022.patch

Sorry, I forgot to check the box.
---
Taguchi,T.
