Tcl Source Code

Ticket UUID: 1306162
Title: some args turned into garbages on cp932
Type: Bug
Version: obsolete: 8.4.11
Submitter: nobody
Created on: 2005-09-27 19:25:40
Subsystem: 50. Embedding Support
Assigned To: dgp
Priority: 8
Severity:
Status: Closed
Last Modified: 2005-10-18 21:37:41
Resolution: Fixed
Closed By: dgp
Closed on: 2005-10-18 14:37:41
Description:
ActiveTcl8.4.11/Windows2000 SP4
Japanese(system encoding is cp932)

test.tcl:
puts $argv

and on the command line:
>tclsh test.tcl 江戸

Then I got 構]戸 ([lindex $argv 0] was 構]戸).

江戸 is 0x8D 0x5D 0x8C 0xCB.
構]戸 is 0x8D 0x5C 0x5D 0x8C 0xCB.

The cp932/Shift-JIS encoding allows 0x5B, 0x5D, 0x7B and 0x7D
("[", "]", "{" and "}") as the second byte of a 2-byte
character. So it looks to me that we should convert the args
to UTF-8 one by one first, and then build the args list in Tcl_Main.
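
The collision described above can be checked mechanically. The following is a minimal C sketch, assuming the usual cp932 lead-byte ranges 0x81-0x9F and 0xE0-0xFC. The names is_cp932_lead_byte, is_quoting_special and count_special_trail_bytes are hypothetical helpers for illustration and are not part of Tcl. The function counts 2-byte characters whose second byte coincides with an ASCII character that list quoting may treat specially.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: cp932 lead bytes are 0x81-0x9F and 0xE0-0xFC. */
static int is_cp932_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* ASCII bytes that list quoting may escape: [ ] { } and backslash. */
static int is_quoting_special(unsigned char c)
{
    switch (c) {
    case '[': case ']': case '{': case '}': case '\\':
        return 1;
    default:
        return 0;
    }
}

/*
 * Hypothetical helper (not a Tcl API): walk a cp932 byte string and
 * count the 2-byte characters whose trail byte is a quoting-special
 * ASCII value. Single-byte characters are never counted.
 */
static int count_special_trail_bytes(const unsigned char *s, size_t len)
{
    int count = 0;
    size_t i = 0;

    assert(s != NULL);
    while (i < len) {
        if (is_cp932_lead_byte(s[i]) && i + 1 < len) {
            if (is_quoting_special(s[i + 1])) {
                count++;
            }
            i += 2;     /* skip lead byte and trail byte together */
        } else {
            i++;        /* plain single-byte character */
        }
    }
    return count;
}
```

For the bytes 0x8D 0x5D 0x8C 0xCB from this report, the first character has the trail byte 0x5D, so a byte-oriented quoting pass would mistake it for "]".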

thanks
User Comments:

dgp added on 2005-10-18 21:37:41:
Logged In: YES 
user_id=80530


closing.  remaining issues to be
dealt with in a new report.

nobody added on 2005-10-18 21:16:25:
Logged In: NO 

thank you dgp.

I see RFE 491789.
If we handled args as Unicode from the beginning,
the argv0 and argv encoding problems would not occur.
In this sense, these problems are related to RFE 491789.
But it looks like hard work...

About the argv0 problem, I just want to say that
we shouldn't replace \ with / in a raw multi-byte string.

The argv0 problem occurs only when the exe binary
is placed on a path containing specific multi-byte characters.
It's a really rare case.
I'll post a new report for the argv0 problem
after sorting it out once again.
So please close this one. Sorry, and thank you.

dgp added on 2005-10-18 08:37:50:
Logged In: YES 
user_id=80530


The Tk issue is now in the
tktoolkit Tracker with ID
1328926.

Passing the argv0 question
to another maintainer for
comments.

(Related to RFE 491789?)

nobody added on 2005-10-07 16:06:03:
Logged In: NO 

Thank you.
I have two ideas for the argv0 problem.

The first idea is to move the separator-replacing code from
main (win/tclAppInit.c) to Tcl_Main (generic/tclMain.c). This
method needs an "#ifdef __WIN32__" block in Tcl_Main.
The second idea is to use the Win32 API in main (win/tclAppInit.c).
Which is better?
I made patches for both and tested them on Windows 2000.
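
Both ideas rest on the same observation: a byte-wise search for 0x5C is only safe in an encoding where 0x5C can never be part of a multi-byte character. In UTF-8 every byte of a multi-byte sequence is 0x80 or higher, so the replacement is safe after conversion. The following is a minimal sketch of that byte-wise loop; the name replace_backslash_bytes is hypothetical and stands in for whatever routine performs the replacement.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a Tcl API): replace every 0x5C byte with
 * '/', in place, without any knowledge of the string's encoding.
 * Safe for UTF-8 input; unsafe for raw cp932 input, where 0x5C may be
 * the trail byte of a 2-byte character.
 */
static char *replace_backslash_bytes(char *path)
{
    char *p;

    assert(path != NULL);
    for (p = path; *p != '\0'; p++) {
        if (*p == '\\') {
            *p = '/';
        }
    }
    return path;
}
```

Applied to a path whose directory name is 構 in UTF-8 (0xE6 0xA7 0x8B), only the separators change. Applied to the same path in cp932, where 構 is 0x8D 0x5C, the loop rewrites the trail byte to 0x2F and corrupts the character.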

patches for the first idea (for tcl8.4.11)
==========================================================
*** win/tclAppInit.c.original       Wed Oct 15 07:41:42 2003
--- win/tclAppInit.c        Fri Oct 07 11:01:54 2005
***************
*** 104,114 ****

      GetModuleFileName(NULL, buffer, sizeof(buffer));
      argv[0] = buffer;
-     for (p = buffer; *p != '\0'; p++) {
-       if (*p == '\\') {
-           *p = '/';
-       }
-     }

  #ifdef TCL_LOCAL_MAIN_HOOK
      TCL_LOCAL_MAIN_HOOK(&argc, &argv);
--- 104,109 ----


*** generic/tclMain.c.p1        Fri Oct 07 11:09:45 2005
--- generic/tclMain.c   Fri Oct 07 13:29:03 2005
***************
*** 236,241 ****
--- 236,244 ----
        TclSetStartupScriptFileName(Tcl_ExternalToUtfDString(NULL,
                TclGetStartupScriptFileName(), -1, &appName));
      }
+ #ifdef __WIN32__
+       TclWinNoBackslash(Tcl_DStringValue(&appName));
+ #endif
      Tcl_SetVar(interp, "argv0", Tcl_DStringValue(&appName), TCL_GLOBAL_ONLY);
      Tcl_DStringFree(&appName);
      argc--;


patch for the second idea (for tcl8.4.11)
==========================================================
--- win/tclAppInit.c.original  Wed Oct 15 07:41:42 2003
+++ win/tclAppInit.c  Fri Oct 07 17:27:15 2005
@@ -102,13 +102,21 @@
      * slashes substituted for backslashes.
      */
 
-    GetModuleFileName(NULL, buffer, sizeof(buffer));
-    argv[0] = buffer;
-    for (p = buffer; *p != '\0'; p++) {
-        if (*p == '\\') {
-            *p = '/';
+    GetModuleFileNameA(NULL, buffer, sizeof(buffer));
+    {
+        WCHAR *wp;
+        WCHAR wBuf[MAX_PATH];
+        MultiByteToWideChar(CP_ACP, 0, buffer, -1,
+                wBuf, MAX_PATH);
+        for (wp = wBuf; *wp != '\0'; wp++) {
+            if (*wp == '\\') {
+                *wp = '/';
+            }
+        }
+        WideCharToMultiByte(CP_ACP, 0, wBuf, -1,
+                buffer, sizeof(buffer), NULL, NULL);
 }
-    }
+    argv[0] = buffer;
 
 #ifdef TCL_LOCAL_MAIN_HOOK
     TCL_LOCAL_MAIN_HOOK(&argc, &argv);




thanks

dgp added on 2005-10-06 22:37:38:
Logged In: YES 
user_id=80530


re-opened for another look.
at least the port to Tk needs
to be done.

nobody added on 2005-10-06 22:17:26:
Logged In: NO 

Is anyone still looking at this?
I noticed an argv0 problem similar to the patched argv
problem...

This argv0 problem is caused by replacing \ with / before
calling Tcl_Main.
We need to convert from the external encoding to UTF-8 before
replacing \ with /.
Or, if we can't convert to UTF-8 before replacing \ with /,
we have to add code that checks for the specific characters.
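
The second option, checking for the specific characters, could look roughly like the sketch below. It assumes the system encoding is cp932 with lead bytes 0x81-0x9F and 0xE0-0xFC; the names is_cp932_lead and cp932_no_backslash are hypothetical. On Windows the Win32 function IsDBCSLeadByte could play the role of the hard-coded range check, which would also cover other double-byte code pages.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: cp932 lead bytes are 0x81-0x9F and 0xE0-0xFC. */
static int is_cp932_lead(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/*
 * Hypothetical helper (not a Tcl API): replace backslash separators
 * with '/' in a cp932 path, in place, leaving the trail byte of every
 * 2-byte character untouched.
 */
static char *cp932_no_backslash(char *path)
{
    unsigned char *p;

    assert(path != NULL);
    p = (unsigned char *)path;
    while (*p != '\0') {
        if (is_cp932_lead(*p) && p[1] != '\0') {
            p += 2;         /* skip the whole 2-byte character */
            continue;
        }
        if (*p == '\\') {
            *p = '/';       /* a real single-byte separator */
        }
        p++;
    }
    return path;
}
```

With a directory named 構 (0x8D 0x5C in cp932), the two separators around it become "/" while the 0x5C trail byte stays as it was.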

And I noticed that these argv and argv0 patches are needed
by wish.exe too.

thanks

dgp added on 2005-10-01 02:31:31:
Logged In: YES 
user_id=80530


Patches accepted.

See also related report 491789.

nobody added on 2005-09-30 11:12:28:
Logged In: NO 

I tried the two patches on Windows 2000 and Linux.
8.4.11 was OK, and 8.5a4 (from CVS) was basically OK too.
I think my reported problem is completely fixed.
Thank you very much.

Tcl 8.5a4 on Windows 2000 had a few minor test problems
unrelated to my reported problem. I will search the
bug reports for these.

thanks

dgp added on 2005-09-29 22:10:03:

File Added - 150792: main-85.patch

dgp added on 2005-09-29 22:10:00:
Logged In: YES 
user_id=80530


....and here's a corresponding patch
for Tcl 8.5a4.

dgp added on 2005-09-29 22:09:08:

File Added - 150791: main.patch

Logged In: YES 
user_id=80530


Thanks for testing and for the patch.

I've attached to this report a different
patch.  Can you give it a test please?

nobody added on 2005-09-29 20:06:34:
Logged In: NO 

I made a patch for Tcl 8.4.11. It was tested on Windows 2000 and
Linux. It produced no test failures and solved the problem.
Please check it.
thanks

--- generic\tclMain.c.original  Thu May 30 07:59:33 2002
+++ generic\tclMain.c  Thu Sep 29 20:32:47 2005
@@ -206,9 +206,10 @@
 {
     Tcl_Obj *resultPtr;
     Tcl_Obj *commandPtr = NULL;
-    char buffer[TCL_INTEGER_SPACE + 5], *args;
+    Tcl_Obj *argvPtr = NULL;
+    char buffer[TCL_INTEGER_SPACE + 5];
     PromptType prompt = PROMPT_START;
-    int code, length, tty;
+    int code, length, tty, i;
     int exitCode = 0;
     Tcl_Channel inChannel, outChannel, errChannel;
     Tcl_Interp *interp;
@@ -238,12 +239,15 @@
      * all callers of Tcl_Main to do it.  (Those callers are likely
      * in a main() that can't easily change its signature.)
      */
-    
-    args = Tcl_Merge(argc-1, (CONST char **)argv+1);
-    Tcl_ExternalToUtfDString(NULL, args, -1, &argString);
-    Tcl_SetVar(interp, "argv", Tcl_DStringValue(&argString), TCL_GLOBAL_ONLY);
-    Tcl_DStringFree(&argString);
-    ckfree(args);
+
+    argvPtr = Tcl_NewListObj(0, NULL);
+    for (i=1; i<argc; i++) {
+        Tcl_Obj *argPtr = NULL;
+        Tcl_ExternalToUtfDString(NULL, (CONST char*)argv[i], -1, &argString);
+        argPtr = Tcl_NewStringObj(Tcl_DStringValue(&argString), -1);
+        Tcl_ListObjAppendElement(interp, argvPtr, argPtr);
+    }
+    Tcl_SetVar2Ex(interp, "argv", NULL, argvPtr, TCL_GLOBAL_ONLY);
 
     if (TclGetStartupScriptPath() == NULL) {
         Tcl_ExternalToUtfDString(NULL, argv[0], -1, &argString);

dgp added on 2005-09-29 08:20:27:
Logged In: YES 
user_id=80530


Has the proposed patch been tested
and does it solve the reported problem?

I definitely like the looks of it.

nobody added on 2005-09-28 14:18:45:
Logged In: NO 

Well, I think the problem is here.

line 242- Tcl_Main function in generic/tclMain.c 
----------------------
args = Tcl_Merge(argc-1, (CONST char **)argv+1);
Tcl_ExternalToUtfDString(NULL, args, -1, &argString);
Tcl_SetVar(interp, "argv", Tcl_DStringValue(&argString),
TCL_GLOBAL_ONLY);


The Tcl_Merge function escapes characters such as { and } in argv,
so if a 2-byte character contains one of those bytes, that byte
gets escaped and the character turns into garbage.
I think the args should first be converted to UTF-8, and only
then should we build the argv list.
I don't know a lot about the Tcl API, but I will try to write
correct code.
It will be clearer than my poor English...

----------------------
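Tcl_Obj *listobj = Tcl_NewListObj(0, NULL);

for (i=1; i<argc; i++) {
    Tcl_Obj *argobj = NULL;
    /* Convert each argument from the system encoding to UTF-8 first. */
    Tcl_ExternalToUtfDString(NULL, argv[i], -1, &argString);
    argobj = Tcl_NewStringObj(Tcl_DStringValue(&argString), -1);
    /* Free on every pass; the next conversion re-initializes the DString. */
    Tcl_DStringFree(&argString);
    /* Append the converted string as one list element, with no quoting. */
    Tcl_ListObjAppendElement(interp, listobj, argobj);
}
Tcl_SetVar2Ex(interp, "argv", NULL, listobj, TCL_GLOBAL_ONLY);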
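
The garbling described in this comment can be reproduced at the byte level. The sketch below is a deliberately simplified stand-in for list quoting and is NOT the real Tcl_Merge algorithm; the name naive_quote_bytes is hypothetical. It inserts a backslash before any byte that looks like a bracket or brace, without knowing whether that byte is the second half of a 2-byte cp932 character.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for byte-oriented list quoting (not Tcl_Merge):
 * copy n bytes from in to out, inserting 0x5C before every byte equal
 * to [ ] { or }. The caller must provide room for 2 * n bytes in out.
 * Returns the number of bytes written.
 */
static size_t naive_quote_bytes(const unsigned char *in, size_t n,
                                unsigned char *out)
{
    size_t i;
    size_t written = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        if (in[i] == '[' || in[i] == ']' || in[i] == '{' || in[i] == '}') {
            out[written++] = '\\';  /* escape inserted mid-character */
        }
        out[written++] = in[i];
    }
    return written;
}
```

Feeding it the cp932 bytes 0x8D 0x5D 0x8C 0xCB yields 0x8D 0x5C 0x5D 0x8C 0xCB, the same byte sequence quoted in the ticket description. Converting each argument to UTF-8 before any quoting avoids this, because no byte of a multi-byte UTF-8 sequence falls in the ASCII range.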


thanks

hobbs added on 2005-09-28 02:37:33:
Logged In: YES 
user_id=72656

unicode command line issues?

nobody added on 2005-09-28 02:32:25:
Logged In: NO 

I wrote Japanese characters, and they turned into
&#27743;&#25144....very sorry...
