Tcl Source Code

Ticket UUID: 1661637
Title: direct compile of pure list scripts
Type: RFE
Version: None
Submitter: dgp
Created on: 2007-02-16 16:09:00
Subsystem: 47. Bytecode Compiler
Assigned To: msofer
Priority: 8
Severity:
Status: Closed
Last Modified: 2008-06-08 10:23:24
Resolution: Accepted
Closed By: msofer
Closed on: 2008-06-08 03:23:24
Description:
The attached patch modifies
TclSetBytecodeFromAny() so
that when the value we are
compiling is already a pure
list, we produce bytecode
directly, and don't go through
the waste of reparsing the
string rep, and recreating
word values out of the tokens
produced by parsing.

This means we don't shimmer
away any possibly valuable
internal reps in the elements
of the list.

As a simple example:

set ns ::foo
namespace eval $ns {}
namespace exists $ns
namespace eval $ns [list namespace exists $ns]

The first [namespace exists] converts
the value "::foo" to the "nsName"
Tcl_ObjType with a cached reference
to the Namespace that name represents.

Without this patch, that internal rep
data is lost when [namespace eval]
bytecode compiles the second
[namespace exists] command, and the
namespace has to be looked up from
string value again.

With this patch, the value $ns compiles
directly into the bytecode, and when
INST_INVOKE_STK1 executes the compiled
[namespace exists] command, it gets
the cached intrep data.

This is just a simple example to
illustrate the general principle that
we ought to keep intreps if it's
reasonably easy and there's not a reason
not to.

This principle has already proved
its worth for TCL_EVAL_DIRECT compiles,
and inspired lots of [list $cmd $arg]
coding style out there.  It's strange
that the same style has not until now
been supported in the bytecompiling
branches.
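
The [namespace exists] example from the description can be tried roughly as follows. This is a minimal sketch, not part of the patch: passing $ns as the target of [namespace eval] is an assumption, and the ::tcl::unsupported::representation command used to peek at the internal rep is only present in Tcl 8.6 and later, is unsupported, and may change its output format between releases.

```tcl
set ns ::foo
namespace eval $ns {}

# First lookup: may cache a namespace reference in the value held by $ns.
set first [namespace exists $ns]

# Build the script as a list so the element values are handed over as-is
# rather than being rebuilt from a reparsed string.
set script [list namespace exists $ns]
set second [namespace eval $ns $script]

# Optional peek at the internal rep (assumes Tcl 8.6+; unsupported command).
if {[llength [info commands ::tcl::unsupported::representation]]} {
    puts [::tcl::unsupported::representation $ns]
}
```

Whether the printed representation still names the cached type after the second call depends on the Tcl version in use, so no expected output is given here.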
User Comments: msofer added on 2008-06-08 10:23:24:
Logged In: YES 
user_id=148712
Originator: NO

The third approach has been committed as part of Patch #1973096.

dkf added on 2008-06-03 05:36:59:
Logged In: YES 
user_id=79902
Originator: NO

To be exact, the current logic distinguishes between lists in "proven canonical form" and everything else. A list is "proven canonical" when it either doesn't have a string representation at all, or when the string representation is derived from the list representation using UpdateStringOfList (IIRC). This avoids all the traps with things like semicolons, newlines, square brackets, etc. while remaining relatively robust in the face of things like debugging (unlike the previously used "pure list" concept). Non-canonical lists (well, they might be canonical, but we can't prove it cheaply) go through the other handling routes, which should be semantically the same even if less efficient in this case.
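
The practical difference between a list in canonical form and an arbitrary string shows up with characters such as ';'. A small sketch, with variable names chosen here purely for illustration:

```tcl
# Built with [list]: the ';' is data inside one word of a single command.
set cmd [list set v "a;b"]
eval $cmd                      ;# one command: sets v to the string a;b

# Plain string: the ';' separates two commands when evaluated as a script.
set script {set w a;set z b}
eval $script                   ;# two commands: sets w, then z
```

In the first case the words of the command are known without any parsing, which is what makes direct evaluation of such a list safe.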

dgp added on 2008-06-03 03:17:11:
Logged In: YES 
user_id=80530
Originator: YES


Not to worry, Jan; we've been successfully
distinguishing the 'pure' or canonical lists
from other lists since the days of 8.3.0.

nijtmans added on 2008-06-03 03:13:31:
Logged In: YES 
user_id=61031
Originator: NO

The only thing we should worry about is when a list element contains
a ';', and it is not constructed using 'list'. E.g.:
>set x {puts stdout a;puts stdout b};# string containing two commands
puts stdout a;puts stdout b
>lindex $x 2;# now $x will become a list, but the string rep is not adapted
a;puts
> eval $x
a
b

Will this still function the same after applying this patch?

A way out of this problem is simply to disallow list
elements that contain ';'. Then 'lindex $x 2' would simply
give an error that the string cannot be converted to a valid
list. That is still a slight incompatibility, but at
least the user is pointed to the place where the real problem
is: trying to convert that kind of string to a list, which
changes the semantics.

This can be shown as well by:
>lappend x c
puts stdout {a;puts} stdout b c

Now, magically, curly braces appear around the "a;puts".

For the record: I agree with this patch, but the
consequences for list behavior should be well considered.

Regards,
      Jan Nijtmans
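
The scenario in this comment can be replayed without writing to stdout. In the sketch below, emit is a hypothetical stand-in for puts that records each call in the global variable ::calls, so the number of commands each eval runs can be counted.

```tcl
# emit is a hypothetical recorder used in place of puts.
set ::calls {}
proc emit {args} { lappend ::calls $args }

set x {emit stdout a;emit stdout b}   ;# a string holding two commands
set third [lindex $x 2]               ;# list view of x; third is "a;emit"

eval $x                               ;# the original string rep is still used
set before [llength $::calls]         ;# two recorded calls

lappend x c                           ;# string rep discarded, rebuilt from the list
set ::calls {}
eval $x                               ;# now a single command with five arguments
set after [llength $::calls]          ;# one recorded call
```

As long as the value keeps its original string rep, eval treats it as a two-command script. Once lappend forces the string to be regenerated from the list, the ';' is just data inside one word.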

msofer added on 2008-06-02 22:45:39:

File Added - 279884: 1661637-2.patch

Logged In: YES 
user_id=148712
Originator: NO

New patch that intercepts even earlier, in TclEvalObjEx: enable the canonical list optimisation also when TCL_EVAL_DIRECT has not been set. In this way, the compiler isn't ever called on canonical lists.
File Added: 1661637-2.patch

dgp added on 2007-02-17 05:11:58:
Logged In: YES 
user_id=80530
Originator: YES


miguel observes that most (all?)
of these issues can be addressed
by intercepting earlier in
TclCompEvalObj.  The second patch
attached takes that approach. 
Either patch, or even both patches
could be committed.
 
File Added: 1661637.patch

dgp added on 2007-02-17 05:11:57:

File Added - 216364: 1661637.patch

dgp added on 2007-02-16 23:09:00:

File Added - 216320: compList.patch
