Tcl Source Code

View Ticket
Login
Ticket UUID: 458361
Title: single word scripts get recompiled at each call
Type: Bug Version: obsolete: 8.4a3
Submitter: msofer Created on: 2001-09-04 12:02:38
Subsystem: 47. Bytecode Compiler Assigned To: msofer
Priority: 5 Medium Severity:
Status: Closed Last Modified: 2004-07-09 01:50:58
Resolution: Fixed Closed By: msofer
    Closed on: 2004-07-08 18:50:58
Description:
(see also
http://groups.google.com/groups?hl=en&safe=off&threadm=3B9458E6.93CEF15F%40ActiveState.com&prev=/groups%3Foi%3Ddjq%26as_ugroup%3Dcomp.lang.tcl)

The output of
    proc testproc {args} {set x 1}
    proc test0 {} {puts "proc test0:  [time {testproc}
100000]"}
    proc test1 {} {puts "proc test1:  [time {testproc }
100000]"}
    proc test2 {} {eval {puts "proc test2:  [time
{testproc} 100000]"}}

    puts "top level: [time {testproc } 100000]"
    test0
    test1
    test2
is
    [mig@mini unix]$ tclsh test2
    top level: 4 microseconds per iteration
    proc test0:  20 microseconds per iteration
    proc test1:  4 microseconds per iteration
    proc test2:  4 microseconds per iteration

I think that the problem in test0 is caused by
shimmering of the literal
"testproc":
  - it is converted to tclByteCodeType by Tcl_EvalObjEx
  - it is converted to tclCommandNameType by
Tcl_ExecuteByteCode

So, the difference would be:
   - in test0, "testproc" is parsed, compiled and
looked up as a 
     command repeatedly
   - at top-level and in test2, "testproc" never
becomes a 
     tclByteCodeType 
   - in test1, there are two different literals:
"testproc " is always 
     tclByteCodeType, "testproc" is always
tclCommandNameType

This would explain why the difference only occurs at
calls with no
arguments.
User Comments: msofer added on 2004-07-09 01:50:58:
Logged In: YES 
user_id=148712

Decided not to backport: this is just a relatively minor
performance bug, it doesn't seem reasonable to risk the
currently stable branch without more testing.

msofer added on 2004-07-08 05:17:05:
Logged In: YES 
user_id=148712

Fixed in HEAD by unsharing the cmdName literal for
single-word scripts.Not backported to core-8-4-branch due to
[Bug 983660] (leaking unshared literals); fix of 983660 not
backported until performance implications are clear.

Leaving open until backporting is done or decided against.

Note that the fix is not perfect, shimmering will still
occur in cases where the single-word script is called as a
command somewhere else (as opposed to: in its own body). The
correct solution may be to have a more discriminating policy
for literals - one table for cmdNames, one for varNames, one
for nsNames, one for 'everything else' (scripts, strings, ...)?

msofer added on 2002-01-10 05:16:09:
Logged In: YES 
user_id=148712

The approach in these patches is suspect at best; to
illustrate, what *should* be the return of this script?
  
   proc {set a} {} {return 0}
   set a 1
   {set a}
   catch {set a} res
   set res

The point is: should the {set a} argument to [catch] be
taken as one or two words, being that both interpretations
make sense? We (dgp, kennykb, myself) think it should be two
words - the approach in these patches fails to satisfy this.

Putting the patch on hold - the issue of "command names with
spaces" requires further thought.

The correct approach may be quite different anyway: avoid
sharing the literals so promiscuosly.

msofer added on 2001-10-05 20:37:11:
Logged In: YES 
user_id=148712

Retaking ownership of this ticket; it is somewhat related to
[Bug 467523]; solving that one may result in a better
solution to this bug too ...

msofer added on 2001-09-12 02:38:41:

File Added - 10670: singleWord.patch3

msofer added on 2001-09-12 02:38:40:
Logged In: YES 
user_id=148712

Remark that there are (with this one) four different
execution paths in Tcl_EvalObjEx:
(1) pure list optimisation
(2) direct parse+eval 
(3) the new one for single-word commands
(4) execute bytecodes

Actually, the decr is only in the _no_error_ case; when
there is an error result, execution falls through to the
long Nr.4 path, which decr's objPtr at the end (after lable
_done_).

Side remark: wouldn't it be better if all 4 paths ended at
the same spot, with the (faster)
   TclDecrRefCount(objPtr);
   return result;

I illustrate this in patch Nr.3

hobbs added on 2001-09-12 02:20:16:
Logged In: YES 
user_id=72656

Should the decr really only occur in the error case?

msofer added on 2001-09-11 19:06:18:

File Added - 10655: singleWord.patch2

Logged In: YES 
user_id=148712

Enclosing a corrected patch: if there is an error, the long
execution path is restored.

msofer added on 2001-09-11 18:41:54:
Logged In: YES 
user_id=148712

The feared scenario is:
  - a literal (say {set a}) is validly interpreted as a
command name in one context, hence retains tclCmdNameType
  - later on, the same literal is evaled in another context
where it is *not* a valid command name but it *is* a valid
script

Now, without the patch there would be no problem: the
literal is parsed and compiled. With the patch, there is no
parsing or compiling: the literal is still assumed to be a
command name; the lookup fails, you get an "invalid command
name"

OK, I did find how to how to trigger this:

  set a 1
  namespace eval tst {
       proc {set a} {args} {return YES}
       proc t {} {{set a}}
       t
   }
   namespace eval tst2 {set a}

Without the patch, the last return value is "1". With the
patch, you get
    invalid command name "set a"

hobbs added on 2001-09-11 07:03:07:
Logged In: YES 
user_id=72656

Given the two-legged crane of Tcl objects, I believe your 
fear is unfounded.  It is a valid tclCmdNameType or it is 
not.  I'm not seeing how that can be invalidated.

msofer added on 2001-09-11 06:42:58:
Logged In: YES 
user_id=148712

Consider

  set a 1
  namespace eval tst {
      proc {set a} {args} {return YES}
      proc t {} {{set a}}
      t
  }

After this the literal {set a} is a tclCmdNameType. If you
now send the script {set a} to Tcl_EvalObjEx from the global
namespace you may get in trouble.

Now: command names with spaces are risky/difficult in any
case. Plus, this is an abstract fear: I did not manage to
trigger such an error from a script.

Also: I have the same fear for a script accidentally
shimmering into a pure list before being [eval]'ed, causing
the pure list optimisation to produce an error. As far as I
know, there are no reports of such a thing ever happening -
so the fear is probably unfounded.

hobbs added on 2001-09-11 06:14:33:
Logged In: YES 
user_id=72656

Can you elaborate on how you think something could shimmer 
into tclCmdNameType without being a command name?

msofer added on 2001-09-11 04:49:21:

File Added - 10617: singleWord.patch

Logged In: YES 
user_id=148712

I enclose a patch that avoids this shimmering. I do think
the patch is safe; it would produce if ever something that
is *not* a command namecould have shimmered into a
tclCmdNameType Tcl_Obj ...

This is most unlikely, but it *could* conceivably happen I
guess ...

msofer added on 2001-09-04 19:06:58:
Logged In: YES 
user_id=148712

[set tcl_traceCompile 2] confirms the repeated compilation
of "testproc" only in test0.

Attachments: