Tcl Source Code

View Ticket
Login
Ticket UUID: 644819
Title: Bytecompiled [switch] Command
Type: Patch Version: None
Submitter: dkf Created on: 2002-11-27 16:36:23
Subsystem: 47. Bytecode Compiler Assigned To: dkf
Priority: 5 Medium Severity:
Status: Closed Last Modified: 2003-03-06 05:42:33
Resolution: Fixed Closed By: dkf
    Closed on: 2003-03-05 22:42:33
Description:
Here's a first attempt at byte-compiling the [switch]
command; it needs a lot more work doing, but this
should be enough for people to see if I'm going about
it the right way.
User Comments: dkf added on 2003-03-06 05:27:53:
Logged In: YES 
user_id=79902

Extracted output from a tclbench run on Linux. First results
column is with the compiled switch, the second is vanilla
8.4.2.  All with static builds of tclsh.
025 SWITCH 1st true                6      19
026 SWITCH 2nd true                6      19
027 SWITCH 9th true               12      22
028 SWITCH default true           11      23

dkf added on 2002-12-12 23:05:46:
Logged In: YES 
user_id=79902

Fine with me. But you knew that already. :^)

dgp added on 2002-12-12 23:02:11:
Logged In: YES 
user_id=80530

Given the inexperience of the author
and the reviewers, I'd be more comfortable
seeing this patch applied to an 8.5a HEAD,
instead of the current 8.4.1.1 HEAD.

Time to branch?

dkf added on 2002-12-12 22:16:13:
Logged In: YES 
user_id=79902

Writing a piece of code to bytecode compile a command, and
especially one that has sub-scripts like [if], [for],
[while] and [switch] is really much harder than it looks. 
There's a lot of undocumented assumptions out there.  (I
think that it should be easier to generate bytecode for
commands that don't have such things.  Almost straight
forward once you understand what's going on with jump
fixups, stack depths and the bytecode models themselves.)

About the specific points you raise:

I think we should have split the 8.4.1 branch off
immediately after the 8.4.0 release.  :^)

The comment refers to the fact that we don't care if the
switch was called switch, or "switch" or {switch} or
whatever; if we get to this code, we assume we know that
we're at the right spot.  And that comment only refers to
the next line.  The ones after that check that we've got
options that we understand; not all of them are like that,
and there's not a lot of point in trying to compile switches
with -regexp as there's no neat way to generate bytecode for
them right now (I'd need to add bytecodes too, an
INST_RE_MATCH if nothing else.)  I just handle the cases
where we might get a real benefit.

The other comment that you pick out relates to the fact that
we are in the middle of handling a {key body key body ...}
construct, and that is really difficult to do because you
can't operate on a copy of the source (a fact which cost me
about a day's solid work to discover and fix!)

vincentdarley added on 2002-12-12 21:44:50:
Logged In: YES 
user_id=32170

Shouldn't we branch out a 8.4.1 branch of tcl/tk before
applying any somewhat experimental patches like this to the
HEAD?

My brief comments (since I was interested in how this whole
byte-compiling thing works!)...

+    /*
+     * We don't care how the command's word was generated;
we're
+     * compiling it anyway!
+     */
+    tokenPtr += tokenPtr->numComponents + 1;

this comment doesn't help me too much.  What's going on in
the next 20 lines of code?

+/*
+ * Test to see if we have guessed the end of the word
+ * correctly; if not, we can't feed the real string to the
+ * sub-compilation engine, and we're then stuck and so have to
+ * punt out to doing everything at runtime.
+ */

this suggests that for some values of 'word' we're not going
to compile the switch, but that seems to contradict the
earlier comment which only says we'll compile '--', '-glob
--', '-exact --'.

My conclusion: you are right, it is very complicated!!

dkf added on 2002-12-12 16:44:55:
Logged In: YES 
user_id=79902

Miguel's told me that he's too busy to review this right
now; should I apply it as is or do you (or any other Tcl
maintainer) want to review it first?  ;^)

dkf added on 2002-12-11 22:39:52:

File Deleted - 37311: 



File Added - 37374: compiled_switch.patch

Logged In: YES 
user_id=79902

Writing a piece of code to compile a command is really much
harder than it looks.  >:^(

Here's a version that seems to get *everything* right,
insofar as it has no strange crashes in the test suite any
more.  ;^)  Only thing I can't tell offhand is whether it
has any memory leaks.

dkf added on 2002-12-11 07:15:35:

File Added - 37311: compiled_switch.patch

Logged In: YES 
user_id=79902

OK, the code was wrong.  It fixed up some jumps far too
early.  Here's a version that puts off all fixing of jumps
until after it has laid out all the code.  And is actually
simpler too.

dkf added on 2002-11-29 16:07:01:
Logged In: YES 
user_id=79902

Hmm. Something is still not right with the patch, given how
many tests fail when it is applied.  I think this needs
going over by someone with more experience with the compiler
side of things...

dkf added on 2002-11-28 22:05:17:

File Deleted - 36392: 



File Added - 36393: compiled_switch.patch

Logged In: YES 
user_id=79902

Ooops!  Comment in code was very misleading...

dkf added on 2002-11-28 21:57:27:

File Deleted - 36315: 



File Added - 36392: compiled_switch.patch

Logged In: YES 
user_id=79902

Apart from the points Jeff raises (and which should be
addressed by the new patch below), there were a great number
of problems (relocation fixups and max stack depths were
both broken!)

(BTW, INST_STR_MATCH wasn't a straight slot-in replacement
for INST_STR_EQ because it matters which argument is which. 
INST_OVER is our savour!)

Performance wise on a 700MHz P3 Linux box:
% proc foo s {
    set x 0; set y 0
    foreach c [split $s {}] {
        switch -exact -- $c {
            a {incr x}
            b {incr y}
        }
    }
    return $x,$y
}
% proc bar s {
    set x 0; set y 0
    foreach c [split $s {}] {
        switch -exact -- $c a {incr x} b {incr y}
    }
    return $x,$y
}
% time {foo
shdgsdbcbcsjdsbvbsdjfvbsdbvsldfgsdfsvbsdhfbvlsdvsdbvldvxnvascv}
1000
275 microseconds per iteration
% time {bar
shdgsdbcbcsjdsbvbsdjfvbsdbvsldfgsdfsvbsdhfbvlsdvsdbvldvxnvascv}
1000
328 microseconds per iteration

hobbs added on 2002-11-28 06:26:06:
Logged In: YES 
user_id=72656

BTW, handling -glob would be easy - check for the -glob (and 
optionally --), and call TclEmitInst1(INST_STR_MATCH, 0 /* 
nocase */, envPtr); optionally instead of TclEmitOpcode
(INST_STR_EQ, envPtr);

You can look at TclCompileRegexpCmd to reuse some of 
that code.

hobbs added on 2002-11-28 06:17:40:
Logged In: YES 
user_id=72656

also the default case in the compiled command isn't checking 
to make sure that it is the last case, whereas the runtime one 
does.

hobbs added on 2002-11-28 06:14:30:
Logged In: YES 
user_id=72656

one thing that would need changing is that compiled 
commands should never throw an error - they should always 
throw TCL_OUT_LINE_COMPILE to let the error be raised at 
runtime (in the odd but possible situation that switch is 
overridden).

dkf added on 2002-11-27 23:36:23:

File Added - 36315: compiled_switch.patch

Attachments: