Tcl Source Code

Ticket UUID: 1201171
Title: TclpSetInitialEncodings works only once on Unix
Type: Bug
Version: obsolete: 8.4.9
Submitter: yaroslav1
Created on: 2005-05-13 06:51:31
Subsystem: 38. Init - Library - Autoload
Assigned To: dgp
Priority: 7 High
Severity:
Status: Closed
Last Modified: 2005-11-03 23:17:10
Resolution: Fixed
Closed By: dgp
Closed on: 2005-11-03 16:17:10
Description:
I think I found a bug in Tcl that prevents a tclkit from
finding and using the proper system encoding in a
non-English locale.

Of course, it is not a bug for Tcl itself, because this
function is really called just once on startup (and does its job).

But tclkit has to call it twice --- on startup and after
mounting VFS which contains encoding files.

A patch to unix/tclUnixInit.c:
-----------------PATCH---------------------
--- tclUnixInit.c       Thu May 12 20:18:51 2005
+++ tclUnixInit.c.new   Thu May 12 20:14:03 2005
@@ -485,7 +485,8 @@
 void
 TclpSetInitialEncodings()
 {
-    if (libraryPathEncodingFixed == 0) {
+
+/*    if (libraryPathEncodingFixed == 0) { */
        CONST char *encoding = NULL;
        int i, setSysEncCode = TCL_ERROR;
        Tcl_Obj *pathPtr;
@@ -647,6 +648,7 @@
         * dependent behavior.
         */

+        if (libraryPathEncodingFixed == 0) {
        setlocale(LC_NUMERIC, "C");

        /*
-----------------PATCH---------------------

This problem really needs to be fixed --- it makes tclkit
practically unusable for non-English users. And I don't
see how Tcl would suffer from the change.
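
To illustrate the report, here is a minimal, self-contained sketch of why a fully guarded body turns the second call into a no-op, and how moving the guard changes that. It is a toy model, not the Tcl source. Apart from the flag name libraryPathEncodingFixed, which comes from the patch, every name below (encodingFilesAvailable, TrySetSystemEncoding, RunTwoPhaseStartup and so on) is hypothetical, and setting the flag at the end of the one-time work is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Toy state. Only the flag name is taken from the patch. */
static int libraryPathEncodingFixed;
static int encodingFilesAvailable;   /* hypothetical: 1 once the VFS is mounted */
static const char *systemEncoding;   /* hypothetical stand-in for the real state */

/* Hypothetical stand-in for setting the system encoding: it can only
 * succeed when the encoding files can be found. */
static int TrySetSystemEncoding(const char *name)
{
    assert(name != NULL);
    if (!encodingFilesAvailable) {
        return -1;
    }
    systemEncoding = name;
    return 0;
}

/* Shape of the unpatched code: the whole body sits inside the guard. */
static void SetInitialEncodingsGuarded(const char *localeEncoding)
{
    if (libraryPathEncodingFixed == 0) {
        (void) TrySetSystemEncoding(localeEncoding);
        /* one-time work would go here */
        libraryPathEncodingFixed = 1;   /* assumption: set after that work */
    }
}

/* Shape after the patch: detection runs on every call and only the
 * one-time work stays behind the guard. */
static void SetInitialEncodingsPatched(const char *localeEncoding)
{
    (void) TrySetSystemEncoding(localeEncoding);
    if (libraryPathEncodingFixed == 0) {
        /* one-time work would go here */
        libraryPathEncodingFixed = 1;
    }
}

/* Model of the tclkit startup: one call before the encoding files are
 * reachable, one call after. Returns the resulting system encoding. */
static const char *RunTwoPhaseStartup(int patched)
{
    void (*init)(const char *) =
        patched ? SetInitialEncodingsPatched : SetInitialEncodingsGuarded;

    libraryPathEncodingFixed = 0;
    encodingFilesAvailable = 0;
    systemEncoding = "identity";

    init("koi8-r");               /* startup: no encoding files yet */
    encodingFilesAvailable = 1;   /* VFS with encoding files mounted */
    init("koi8-r");               /* second call */
    return systemEncoding;
}
```

In this model RunTwoPhaseStartup(0) leaves the encoding at "identity", while RunTwoPhaseStartup(1) ends with "koi8-r".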
User Comments:

dgp added on 2005-11-03 23:17:08:
Logged In: YES 
user_id=80530


Thanks for that reference.

Following the instructions at
http://www.equi4.com/218
and the continuing instructions at
http://www.equi4.com/tkunicode.html
I was able to both confirm the
reported bug, and confirm that the
attached patch does fix the bug.

Committing patch for Tcl 8.4.12.

yaroslav1 added on 2005-11-03 13:32:16:
Logged In: YES 
user_id=1272045

I rebuild tclkit with all encoding files (as described at
http://www.equi4.com/tkunicode.html).

I know that it's fixed in Tclkit8.5a4 (I reported that to
Andreas Kupries earlier), but this bug report is about tclkit8.4.x.

And, Tcl/Tk 8.4 is still the current stable version.

dgp added on 2005-11-03 04:42:05:
Logged In: YES 
user_id=80530


Very up-to-date Tclkits based on
Tcl/Tk 8.5 development sources
are available at

http://www.kroc.tk/tclkit/current/inter.htm

I've just verified they do not suffer
from this problem.  Simplest thing
might be to just use them.

dgp added on 2005-11-03 03:54:07:
Logged In: YES 
user_id=80530


Experts on the Tcl'ers chat say
the TclApp product from ActiveState
will solve this issue.

Since it's a solved problem, the
Activators can determine what
further changes need to go
into Tcl.

dgp added on 2005-11-03 03:36:32:
Logged In: YES 
user_id=80530


ok, I went through the exercise of
building a Tclkit and it seems
the root of the problem is that the
koi8-r.enc file is not included
in a standard Tclkit.  How are you
dealing with that?

yaroslav1 added on 2005-11-02 13:50:22:
Logged In: YES 
user_id=1272045

I've tested on:
$ uname -a
Linux NOSOR 2.6.3-7mdk #1 Wed Mar 17 15:56:42 CET 2004 i686 unknown unknown GNU/Linux

And:
$ uname -a
Linux debian 2.4.27-2-686 #1 Mon May 16 17:03:22 JST 2005 i686 GNU/Linux

dgp added on 2005-11-01 22:36:26:
Logged In: YES 
user_id=80530


Yaroslav, let's figure out a time
we can work together on this.
With testing feedback from you,
I'm sure we can resolve this.

What platform are you testing on?

andreas_kupries added on 2005-10-25 23:54:15:
Logged In: YES 
user_id=75003

Oh, I have received it, it being the mail. However the
accusatory undertone I found regarding my choice of patch
has me thoroughly demotivated to continue work on this,
derailing my initial plan of going through all the patches
here, from last to first, to see which of them fix the
problem and which don't. I also decided to not answer the
mail until I have cooled down and am able to be polite despite
this. However now that this private communication is made
public I feel compelled to answer, even if with anger in my
heart.

yaroslav1 added on 2005-10-25 12:41:59:
Logged In: YES 
user_id=1272045

This is my answer to the letter from Andreas
Kupries (it looks like he didn't receive it).

> Attached to the bug is a patch claiming to fix the
> problem. I have now created two sets of tclkits, one
> with and one without this patch. The kits are built
> for Linux/Intel.

Hello, Andreas.

I've tried your kits. They don't fix the problem.

Transcript:
% encoding system
iso8859-1

So, I want to repeat my patch (it is in the original report) to
unix/tclUnixInit.c:
-----------------PATCH---------------------
--- tclUnixInit.c       Thu May 12 20:18:51 2005
+++ tclUnixInit.c.new   Thu May 12 20:14:03 2005
@@ -485,7 +485,8 @@
 void
 TclpSetInitialEncodings()
 {
-    if (libraryPathEncodingFixed == 0) {
+
+/*    if (libraryPathEncodingFixed == 0) { */
        CONST char *encoding = NULL;
        int i, setSysEncCode = TCL_ERROR;
        Tcl_Obj *pathPtr;
@@ -647,6 +648,7 @@
         * dependent behavior.
         */

+        if (libraryPathEncodingFixed == 0) {
        setlocale(LC_NUMERIC, "C");

        /*
-----------------PATCH---------------------

The Tclkit I've built with it really does fix the problem.

Transcript:
% encoding system
koi8-r

Why don't you accept it? If you look at the Windows
implementation of TclpSetInitialEncodings, you'll see
almost the same code I'm offering. What's wrong with it?

dgp added on 2005-09-27 08:32:23:
Logged In: YES 
user_id=80530

dgp: patch is there; just needs testing.
[21:31] dgp: that's an 8.4 branch matter though.
[21:31] stevel: that would be a good one to assign to Andreas Kupries

dgp added on 2005-06-20 23:39:37:
Logged In: YES 
user_id=80530

working on the 8.4.11 release now...

Can't anyone comment on whether
the attached patch does any good
for solving the reported problem?

I'm not willing to change the
"stable" branch of Tcl on this point
without a minimum endorsement
of effectiveness.

dgp added on 2005-06-02 02:26:36:
Logged In: YES 
user_id=80530


I received no reports about testing
the patch, so it will not be part
of Tcl 8.4.10 release.

dgp added on 2005-05-25 03:39:30:
Logged In: YES 
user_id=80530


any luck testing this?  8.5a3 release
is coming quickly.

jcw added on 2005-05-20 04:07:50:
Logged In: YES 
user_id=1983

Ok, thx.  A trial build for Linux is at http://www.equi4.com/tclkit85try.gz - it has 
the following dynlib dependencies:

$ ldd tclkit-dellie 
        linux-gate.so.1 =>  (0xffffe000)
        libdl.so.2 => /lib/libdl.so.2 (0xb7fde000)
        libstdc++.so.5 => /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.5/libstdc++.so.5 (0xb7f24000)
        libm.so.6 => /lib/libm.so.6 (0xb7f02000)
        libgcc_s.so.1 => /usr/lib/gcc-lib/i686-pc-linux-gnu/3.3.5/libgcc_s.so.1 (0xb7ef9000)
        libc.so.6 => /lib/libc.so.6 (0xb7de4000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0xb7fea000)
$

The CVS checkout was anonymous, not an SF dev checkout so it may lag a 
couple of hours.  Oh, and the embedded Tcl runtime scripts are not updated 
from 8.5a2 (genkit just slaps a fixed starkit onto the end of the exe).

I can't do any testing right now, but please let me know if I need to tweak 
things and rebuild.

dgp added on 2005-05-20 03:44:16:

File Added - 135203: 1201171.patch

dgp added on 2005-05-20 03:44:15:
Logged In: YES 
user_id=80530


Here's a patch against
the core-8-4-branch of
development.  Ought to apply
to Tcl 8.4.9 as well.

Please test to see whether
it addresses the reported
problem.

dgp added on 2005-05-20 03:41:55:
Logged In: YES 
user_id=80530


Yes, the "tcl" cvs module has a
new "libtommath" submodule and
CVS limitations make the transition
a bit non-trivial.

Simplest thing to do is just get a
completely fresh tcl checkout:

  cvs -d .... checkout tcl

and notice the new subdirectory
tcl/libtommath .

If that's unattractive (maybe your
existing checkout has mods you
don't want to lose?), then we can
do something else a bit more
involved.

jcw added on 2005-05-20 03:32:33:
Logged In: YES 
user_id=1983

When I follow the tclkit build instructions at http://www.equi4.com/218 
(replacing genkit by genkit85 everywhere, and tars by tars85), I get the 
following error while in the first compile (tclsh genkit85 B tcl):

gcc -pipe -c -O2 -Wall -Wno-implicit-int -fPIC -I. -I../../../src/tcl/unix -I../../../src/tcl/unix/../generic -DTCL_TOMMATH -I../../../src/tcl/unix/../libtommath -DPACKAGE_NAME=\"tcl\" -DPACKAGE_TARNAME=\"tcl\" -DPACKAGE_VERSION=\"8.5\" -DPACKAGE_STRING=\"tcl\ 8.5\" -DPACKAGE_BUGREPORT=\"\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_LIMITS_H=1 -DHAVE_SYS_PARAM_H=1 -DTCL_CFGVAL_ENCODING=\"iso8859-1\" -DSTATIC_BUILD=1 -DPEEK_XCLOSEIM=1 -DTCL_SHLIB_EXT=\".so\" -DTCL_CFG_OPTIMIZED=1 -DTCL_CFG_DEBUG=1 -D_LARGEFILE64_SOURCE=1 -DTCL_WIDE_INT_TYPE=long\ long -DHAVE_STRUCT_STAT64=1 -DHAVE_OPEN64=1 -DHAVE_LSEEK64=1 -DHAVE_TYPE_OFF64_T=1 -DHAVE_GETCWD=1 -DHAVE_OPENDIR=1 -DHAVE_STRTOL=1 -DHAVE_STRTOLL=1 -DHAVE_STRTOULL=1 -DHAVE_TMPNAM=1 -DHAVE_WAITPID=1 -DUSE_TERMIOS=1 -DHAVE_SYS_TIME_H=1 -DTIME_WITH_SYS_TIME=1 -DHAVE_STRUCT_TM_TM_ZONE=1 -DHAVE_TM_ZONE=1 -DHAVE_GMTIME_R=1 -DHAVE_LOCALTIME_R=1 -DHAVE_MKTIME=1 -DHAVE_TM_GMTOFF=1 -DHAVE_TIMEZONE_VAR=1 -DHAVE_STRUCT_STAT_ST_BLKSIZE=1 -DHAVE_ST_BLKSIZE=1 -DHAVE_SIGNED_CHAR=1 -DHAVE_LANGINFO=1 -DHAVE_SYS_IOCTL_H=1 -DTCL_UNLOAD_DLLS=1 ../../../src/tcl/unix/../generic/tclObj.c
In file included from ../../../src/tcl/generic/tclObj.c:19:
../../../src/tcl/generic/tommath.h:31:27: tommath_class.h: No such file or directory
make: *** [tclObj.o] Error 1

I did a "cvs update" in all src/* directories after the "tclsh genkit A" step, which 
fetched everything.

Has something changed?  Do I need to get something else?  I can attach the 
build transcript if needed.

dgp added on 2005-05-20 03:24:23:
Logged In: YES 
user_id=80530


In Tcl 8.5a3, the TclpSetInitialEncodings
routine no longer suffers from the limitation
reported.  If Tclkit sources get updated to
use the new command
[::tcl::unsupported::EncodingDirs]
or the corresponding private C routine
TclSetEncodingSearchPath
before the second call to TclpSetInitialEncodings...

...AND if the encoding files are actually
in the directory to be found...

...then this issue should be solved
with 8.5a3 as a base.  A successful
test with the next Tcl release would be
a very good thing, and would be good
motivation to push these "unsupported"
and "private" interfaces public.

I still need to look into what can be
done for Tcl 8.4.10, if anything.

dgp added on 2005-05-20 03:09:49:
Logged In: YES 
user_id=80530


I don't follow this part of the report:

"I tried the newer tclkit version (8.5.a2) on Linux
... it does not start at all ...
If I execute "encoding system koi8-r" ...
then it seems to work."

If the program will not start, then what does
it mean to "execute 'encoding system koi8-r'" ?

dgp added on 2005-05-20 03:04:06:
Logged In: YES 
user_id=80530


thanks for the details.

part of that message reports
that a Tclkit built on
Tcl8.5a2 doesn't work at all.
Can we construct a Tclkit
from the current CVS sources
of both Tcl and Tclkit, and try
that again?  An official Tcl 8.5a3
release should be out in a few weeks,
and it would be best to find out now
if that combination will be broken.

yaroslav1 added on 2005-05-18 14:55:52:
Logged In: YES 
user_id=1272045

Sorry, I misled you (and forgot to describe the problem).

This is part of the original letter (it was not in the tclkit forum,
sorry again):
-------------------------
Sergey Vlasov vsu at altlinux.ru
Sat Apr 16 18:32:59 CEST 2005
Does anyone know how to make tclkit find and use the proper 
system encoding in a non-English locale?

I have tried tclkit-linux-x86-8.4.9 with LANG=ru_RU.KOI8-R,
and also the 8.4.9 Win32 version under Windows 98, and 
both have the same problem: even if I rebuild tclkit with all
 encoding files (as described at 
http://www.equi4.com/tkunicode.html), I still get:

$ tclkit
% encoding system
iso8859-1

(the Win32 version gives cp1252, which is also bad - 
the real system encoding in that case is cp1251).

When a Tk application is launched in such environment (I tried
Notebook - http://notebook.wjduquette.com/), it has major 
problems:
all keyboard input is assumed to be in the broken system 
encoding, therefore I get iso8859-1 accented letters instead
of Cyrillic characters.  Obviously, this makes starkits
unusable.

I tried the newer tclkit version (8.5.a2) on Linux
(http://www.equi4.com/pub/tk/8.5a2/tclkit-linux-x86.gz), and with
LANG=ru_RU.KOI8-R it does not start at all:

$ tclkit-linux-x86-8.5a2 
system encoding "
zsh: abort      tclkit-linux-x86-8.5a2

If I execute "encoding system koi8-r" (after either adding the
appropriate encoding files to tclkit, or copying them as described in
http://wiki.tcl.tk/10382), then it seems to work (I tried to insert
this statement into notebook2.1.1.vfs/lib/app-notebook/notebook.tcl
after copying of the encoding files, and such hacked Notebook can
handle Cyrillic characters properly).  But obviously hardcoding the
encoding name is not acceptable - the encoding should be
determined automatically, like the "real" Tcl does it.

Looking at tclUnixInit.c:TclpSetInitialEncodings(), I see that Tcl
tries several methods to detect the system encoding and uses the
first encoding for which Tcl_SetSystemEncoding() succeeds.
However, at this point the encoding files stored inside tclkit are
not yet available (because vfs is not initialized), therefore all
calls to Tcl_SetSystemEncoding() fail, and system encoding is
left set to "identity".
-------------------------------------------------
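
The "first encoding that succeeds" logic described in the letter can be sketched roughly as follows. This is a hedged illustration and not the code in tclUnixInit.c. setlocale() and nl_langinfo(CODESET) are standard POSIX calls, but TrySetProc, LocaleCodeset, PickFirstUsable and the two fake loaders are hypothetical names invented for this sketch, and the callback merely stands in for a call such as Tcl_SetSystemEncoding().

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns 0 when the named encoding could be loaded. */
typedef int (*TrySetProc)(const char *name);

/* Ask the C library for the codeset name of the user's locale. */
static const char *LocaleCodeset(void)
{
    setlocale(LC_CTYPE, "");
    return nl_langinfo(CODESET);
}

/* Return the first candidate the callback accepts, or the fallback
 * when every attempt fails. */
static const char *PickFirstUsable(const char *const *candidates, size_t count,
                                   TrySetProc trySet, const char *fallback)
{
    size_t i;

    assert(trySet != NULL);
    for (i = 0; i < count; i++) {
        if (candidates[i] != NULL && trySet(candidates[i]) == 0) {
            return candidates[i];
        }
    }
    return fallback;
}

/* Fake loader: no encoding files reachable yet, so every attempt fails. */
static int NoEncodingFiles(const char *name)
{
    (void) name;
    return -1;
}

/* Fake loader: only the koi8-r encoding file is reachable. */
static int OnlyKoi8r(const char *name)
{
    return strcmp(name, "koi8-r") == 0 ? 0 : -1;
}
```

With NoEncodingFiles every candidate is rejected and the fallback ("identity" in the letter's description) is what remains. With OnlyKoi8r the same candidate list yields koi8-r. In a real program the value from LocaleCodeset() would still have to be mapped to a Tcl encoding name before being tried.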
My additions to it:

On Windows, tclkit fails because of a problem in how it works
with VFS. This is from my bug report to them:
----------------------------
On Windows (and in general), there is a bug in boot.tcl in tclkit:

vfs::filesystem unmount $noe

Example in a Russian locale on Win98 (cp1251):
noe: D:/TCLKIT/русский/TCLKIT-WIN32-NEWEST.EXE
::vfs::filesystem info:
d:/tclkit/русский/tclkit-win32-newest.exe

The case differs, so it says "no such mount" and crashes
(on Win98 only).
After replacing that line with:
vfs::filesystem unmount [::vfs::filesystem info]
tclkit-win32-newest.exe, with encodings added as described in
tkunicode.html on the site, works OK.
----------------------------
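
The mismatch above is a plain string-comparison issue, which the following small sketch illustrates. It is not how tclkit or the vfs package compares mount points, and the fix reported above was to pass the result of [::vfs::filesystem info] to unmount. The function names are hypothetical, and folding only the ASCII letters is an assumption that happens to cover the example paths, where the non-ASCII part is identical in both.

```c
#include <assert.h>
#include <stddef.h>

/* Fold ASCII upper-case letters to lower case; leave other bytes alone. */
static int FoldAscii(int c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Return 1 when the two paths differ at most in the case of ASCII letters. */
static int PathsEqualIgnoringAsciiCase(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    while (*a != '\0' && *b != '\0') {
        if (FoldAscii((unsigned char) *a) != FoldAscii((unsigned char) *b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}
```

An exact comparison of the executable path against the recorded mount point fails as soon as one of them has been lower-cased, whereas a comparison like this one would treat them as the same mount.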

So, there is no bug in Tcl itself on Windows, but on Unix,
when tclkit tries to set the system encoding a second time (with
encodings loaded), this function just doesn't do it, as I
described before. So the system encoding just gets "fixed" by the tclkit
startup code from "identity" to iso8859-1, which is not correct.

dgp added on 2005-05-17 07:48:37:
Logged In: YES 
user_id=80530


Sorry, jcw, I didn't ask the Q clearly enough.

I'm not looking for an approving opinion
on the proposed patch; I'm looking for the
bug report, and some background and
history.

Yaroslav says the description of the problem
is in the archives of a "tclkit developer's forum" (?)

Can anyone give me a more direct pointer than that?

Rather than just accept a contributed patch,
I'd like to understand the problem being solved.
Also with a clearer understanding of the 
problem faced, I'll have a better idea whether
it's already fixed in Tcl 8.5 development.

jcw added on 2005-05-17 02:10:27:
Logged In: YES 
user_id=1983

Don, unfortunately I'm not really the person to ask for comments on this.  
Perhaps Vince or Jeff can comment.  I have never grasped the intricacies of 
Tcl's init sequence.

With that out of the way... if avoiding double init is merely an optimization, 
then it seems to me that the above change would be ok.  It essentially lets 
some of the code run on every call to TclpSetInitialEncodings. Might also be a 
hint about splitting this in two separate functions.  This code only affects the 
Unix builds, I don't know how Windows differs.

What Tclkit needs has always been the same: set up a Tcl interp without 
encodings at hand to let it run a bit of plain ASCII Tcl, then tell the system 
about encoding files which now are available.

Sorry I can't be of much help.

-jcw

yaroslav1 added on 2005-05-16 12:36:16:
Logged In: YES 
user_id=1272045

Yes, it's really against 8.4.9.
This bug was mentioned before (look in the tclkit developer's forum
archives), but no solution was found then.
I haven't looked at the 8.5a2 sources yet.

dgp added on 2005-05-13 22:11:23:
Logged In: YES 
user_id=80530


Just want to confirm that this
report is against release 8.4.9 
of Tcl?

And it does not pertain to
the heavily modified initialization
routines in Tcl 8.5a2 ?

Is this report in any way a new bug?
Why is it arising only now?
