Tcl Source Code

Ticket UUID: 552ed5eac53ff5e4685badea56935871627aff34
Title: Strange discrepancy regarding INST_START_CMD in body of cycles if compiled with or without invocation in condition/iterator...
Type: Bug
Version: 8.6
Submitter: sebres
Created on: 2018-04-20 20:33:30
Subsystem: 47. Bytecode Compiler
Assigned To: nobody
Priority: 5 Medium
Severity: Important
Status: Open
Last Modified: 2018-04-23 10:19:17
Resolution: None
Closed By: nobody
Closed on:
Description:

While working on my performance branch (BTW, interim state: 8.6 is now ca. 6-7 times faster, and I'm confident it will soon pass the 10x mark), I found a strange discrepancy between two almost identical pieces of code.

If one looks at the bytecode of the following scripts, one notes that each set command in the first body additionally receives a 9-byte "startCommand" instruction (INST_START_CMD). That is fine (it is also a very fast instruction), but the second body misses it entirely.

Just compare the disassembly output of both:

proc x {} {while {[t]} {set i0 0; set i1 1; set i2 2; set i3 3}}; tcl::unsupported::disassemble proc x
proc x {} {while {$va} {set i0 0; set i1 1; set i2 2; set i3 3}}; tcl::unsupported::disassemble proc x

This results in:

Source "while {[t]} {set i0 0; set i1 1; set i2 2; set i3 3}"
Cmds 6, src 52, inst 68, litObjs 6, aux 0, stkDepth 1, code/src 0.00
...
Source "while {$va} {set i0 0; set i1 1; set i2 2; set i3 3}"
Cmds 5, src 52, inst 30, litObjs 5, aux 0, stkDepth 1, code/src 0.00
...
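
To make the comparison less dependent on reading the full listings, the number of "startCommand" instructions can be counted directly from the disassembly text. This is only a minimal sketch: the helper name `countStartCmd` and the proc names `withInvoke` and `withVar` are made up for illustration, and the counts depend on the Tcl version and on whether an interp limit is active, so no expected output is given.

```tcl
# Hypothetical helper (not part of Tcl): count the "startCommand"
# instructions that appear in the disassembled bytecode of a proc.
proc countStartCmd {procName} {
    set asm [::tcl::unsupported::disassemble proc $procName]
    # Match the instruction name as a whole word only.
    return [regexp -all {\mstartCommand\M} $asm]
}

# Same bodies as in the ticket; the procs are only compiled, never run,
# so neither [t] nor $va has to exist.
proc withInvoke {} {while {[t]} {set i0 0; set i1 1; set i2 2; set i3 3}}
proc withVar    {} {while {$va} {set i0 0; set i1 1; set i2 2; set i3 3}}

puts "condition with invocation: [countStartCmd withInvoke] startCommand"
puts "condition with variable:   [countStartCmd withVar] startCommand"
```

If the discrepancy described here is present, the second count should come out lower than the first.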

The same holds for all loops (i.e. the body is compiled differently depending on whether the condition/iterator contains an invocation), e.g. for the following "foreach":

proc x {} {set i1 0; unset i1; foreach a [test_itr] {set i0 0; set i1 1; set i2 2; set i3 3}}; tcl::unsupported::disassemble proc x
proc x {} {set i1 0; unset i1; foreach a {a1 a2 a3} {set i0 0; set i1 1; set i2 2; set i3 3}}; tcl::unsupported::disassemble proc x

results in:

Source "set i1 0; unset i1; foreach a [test_itr] {set i0 0; set..."
Cmds 8, src 81, inst 102, litObjs 6, aux 1, stkDepth 4, code/src 0.00
...
Source "set i1 0; unset i1; foreach a {a1 a2 a3} {set i0 0; set..."
Cmds 7, src 81, inst 46, litObjs 6, aux 1, stkDepth 4, code/src 0.00
...

That is 68 vs. 30 and 74 vs. 36 bytes of code, respectively. This is wrong (and altogether weird).

BTW, it is also strange that I can't see the same behavior in 8.5 or in my own TclSE edition (but some time ago I rewrote the handling around "startCommand" there, so for example I don't have `envPtr->atCmdStart` there at all; it was implemented another way).

I'm relatively sure that these 4 extra "startCommand" instructions in the body are missing in both second variants (and I'm sure this causes the count of interpreted commands to be calculated wrongly). So for example "info cmdcount" as well as the interpreter limit will work totally wrong.

The following example illustrates the issue:

proc retv {v} {return $v}
set i 0
foreach c {
    { set j 10; while {[retv [incr j -1]]} {set i0 0; set i1 1; set i2 2; set i3 3} }
    { set j 10; while {      [incr j -1] } {set i0 0; set i1 1; set i2 2; set i3 3} }
    { foreach j [retv {9 8 7 6 4 3 2 1}]   {set i0 0; set i1 1; set i2 2; set i3 3} }
    { foreach j       {9 8 7 6 4 3 2 1}    {set i0 0; set i1 1; set i2 2; set i3 3} }
} {
    proc test_[incr i] {} $c
}
proc test {} {
    foreach i {1 2 3 4} {
        puts "[set cc [info cmdcount]]"
        test_$i
        puts "++ [expr {[info cmdcount] - $cc}]"
    }
}
test

Result should be:

++ 62
++ 52
++ 38
++ 37

And in current 8.6 it results in:

++ 62
++ 5     *WRONG*
++ 38
++ 5     *WRONG*

User Comments: sebres added on 2018-04-23 10:19:17:

> Note that optimisation is skipped if an interp limit exists.
This means the body of the proc has to be recompiled if a limit is set afterwards (epoch increment?).

If so, this is even worse in my opinion, and a very questionable way to do it.

In that case, will every setting of a limit cause a recompile of the project's whole code base?...
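
One way to probe this is to compile a loop in a child interpreter before any limit exists, then set a command limit and run the loop again. The sketch below is only an experiment under stated assumptions: the proc name `spin` and the margin of 50 commands are arbitrary choices, and whether the limit triggers depends on how the particular Tcl build handles already-compiled bodies, so no result is claimed here.

```tcl
# Child interpreter, so the limit does not affect the main one.
set child [interp create]

# Loop whose condition is compiled inline; "spin" is a made-up name.
$child eval {
    proc spin {} {
        set j 1000
        while {[incr j -1]} {set i0 0; set i1 1}
    }
}

# First run: the body gets compiled while no limit is set.
$child eval spin

# Now allow only ~50 more commands (arbitrary margin), far fewer than
# the 2000 "set" commands in the loop body if they are all counted.
set before [$child eval {info cmdcount}]
interp limit $child command -value [expr {$before + 50}]

# rc is 1 if the command limit was hit, 0 if the loop ran to completion.
set rc [catch {$child eval spin} msg]
puts "rc=$rc msg=$msg"

interp delete $child
```

A result of rc=1 would indicate that the body commands are counted once the limit exists (e.g. after a recompile); rc=0 would indicate that the previously compiled body kept running without per-command accounting.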


sebres added on 2018-04-23 10:00:19:

Related ticket [931e4956b9]


aspect added on 2018-04-22 02:24:14:
A simpler example would make this easier to follow ..

    % proc x {t} {while {[t]} {incr x}}
      Command 1: "while {[t]} {incr x}"
        (0) jump1 +15       # pc 15
      Command 2: "incr x..."
        (2) startCommand +12 1      # next cmd at pc 14, 1 cmds start here
        (11) incrScalar1Imm %v1 +1  # var "x"
        (14) pop
      Command 3: "t..."
        (15) push1 0        # "t"
        (17) invokeStk1 1
        (19) nop
        (20) jumpTrue1 -18  # pc 2
        (22) push1 1        # ""
        (24) done

    % proc y {y} {while {$t} {incr x}}
      Command 1: "while {$t} {incr x}"
        (0) jump1 +6        # pc 6
      Command 2: "incr x..."
        (2) incrScalar1Imm %v1 +1   # var "x"
        (5) pop
        (6) loadScalar1 %v2         # var "t"
        (8) nop
        (9) jumpTrue1 -7    # pc 2
        (11) push1 0        # ""
        (13) done

Is this due to the optimisation at https://core.tcl.tk/tcl/artifact/49a6bfc698a0d1c0?ln=825 ?

It looks legitimate to me, since START_CMD only (?) exists for book-keeping
for the error path and interp limits.  Note that optimisation is skipped if
an interp limit exists.

The [info cmdcount] discrepancy is incorrect, but I understand [info cmdcount]
to be (necessarily) fragile in the face of BCC.  Probably the manual
should be updated to reflect what is said in https://wiki.tcl.tk/9813

sebres added on 2018-04-20 20:47:01:

Check-in [1283a17cbd1b1242] contains a test case that illustrates this bug.