Tcl Source Code: View Ticket

Ticket UUID:	458361
Title:	single word scripts get recompiled at each call
Type:	Bug	Version:	obsolete: 8.4a3
Submitter:	msofer	Created on:	2001-09-04 12:02:38
Subsystem:	47. Bytecode Compiler	Assigned To:	msofer
Priority:	5 Medium	Severity:
Status:	Closed	Last Modified:	2004-07-09 01:50:58
Resolution:	Fixed	Closed By:	msofer
		Closed on:	2004-07-08 18:50:58
Description:	(see also http://groups.google.com/groups?hl=en&safe=off&threadm=3B9458E6.93CEF15F%40ActiveState.com&prev=/groups%3Foi%3Ddjq%26as_ugroup%3Dcomp.lang.tcl) The output of proc testproc {args} {set x 1} proc test0 {} {puts "proc test0: [time {testproc} 100000]"} proc test1 {} {puts "proc test1: [time {testproc } 100000]"} proc test2 {} {eval {puts "proc test2: [time {testproc} 100000]"}} puts "top level: [time {testproc } 100000]" test0 test1 test2 is [mig@mini unix]$ tclsh test2 top level: 4 microseconds per iteration proc test0: 20 microseconds per iteration proc test1: 4 microseconds per iteration proc test2: 4 microseconds per iteration I think that the problem in test0 is caused by shimmering of the literal "testproc": - it is converted to tclByteCodeType by Tcl_EvalObjEx - it is converted to tclCommandNameType by Tcl_ExecuteByteCode So, the difference would be: - in test0, "testproc" is parsed, compiled and looked up as a command repeatedly - at top-level and in test2, "testproc" never becomes a tclByteCodeType - in test1, there are two different literals: "testproc " is always tclByteCodeType, "testproc" is always tclCommandNameType This would explain why the difference only occurs at calls with no arguments.
User Comments:	msofer added on 2004-07-09 01:50:58: Logged In: YES user_id=148712 Decided not to backport: this is just a relatively minor performance bug, it doesn't seem reasonable to risk the currently stable branch without more testing. msofer added on 2004-07-08 05:17:05: Logged In: YES user_id=148712 Fixed in HEAD by unsharing the cmdName literal for single-word scripts.Not backported to core-8-4-branch due to [Bug 983660] (leaking unshared literals); fix of 983660 not backported until performance implications are clear. Leaving open until backporting is done or decided against. Note that the fix is not perfect, shimmering will still occur in cases where the single-word script is called as a command somewhere else (as opposed to: in its own body). The correct solution may be to have a more discriminating policy for literals - one table for cmdNames, one for varNames, one for nsNames, one for 'everything else' (scripts, strings, ...)? msofer added on 2002-01-10 05:16:09: Logged In: YES user_id=148712 The approach in these patches is suspect at best; to illustrate, what should be the return of this script? proc {set a} {} {return 0} set a 1 {set a} catch {set a} res set res The point is: should the {set a} argument to [catch] be taken as one or two words, being that both interpretations make sense? We (dgp, kennykb, myself) think it should be two words - the approach in these patches fails to satisfy this. Putting the patch on hold - the issue of "command names with spaces" requires further thought. The correct approach may be quite different anyway: avoid sharing the literals so promiscuosly. msofer added on 2001-10-05 20:37:11: Logged In: YES user_id=148712 Retaking ownership of this ticket; it is somewhat related to [Bug 467523]; solving that one may result in a better solution to this bug too ... msofer added on 2001-09-12 02:38:41: File Added - 10670: singleWord.patch3 msofer added on 2001-09-12 02:38:40: Logged In: YES user_id=148712 Remark that there are (with this one) four different execution paths in Tcl_EvalObjEx: (1) pure list optimisation (2) direct parse+eval (3) the new one for single-word commands (4) execute bytecodes Actually, the decr is only in the _no_error_ case; when there is an error result, execution falls through to the long Nr.4 path, which decr's objPtr at the end (after lable _done_). Side remark: wouldn't it be better if all 4 paths ended at the same spot, with the (faster) TclDecrRefCount(objPtr); return result; I illustrate this in patch Nr.3 hobbs added on 2001-09-12 02:20:16: Logged In: YES user_id=72656 Should the decr really only occur in the error case? msofer added on 2001-09-11 19:06:18: File Added - 10655: singleWord.patch2 Logged In: YES user_id=148712 Enclosing a corrected patch: if there is an error, the long execution path is restored. msofer added on 2001-09-11 18:41:54: Logged In: YES user_id=148712 The feared scenario is: - a literal (say {set a}) is validly interpreted as a command name in one context, hence retains tclCmdNameType - later on, the same literal is evaled in another context where it is not a valid command name but it is a valid script Now, without the patch there would be no problem: the literal is parsed and compiled. With the patch, there is no parsing or compiling: the literal is still assumed to be a command name; the lookup fails, you get an "invalid command name" OK, I did find how to how to trigger this: set a 1 namespace eval tst { proc {set a} {args} {return YES} proc t {} {{set a}} t } namespace eval tst2 {set a} Without the patch, the last return value is "1". With the patch, you get invalid command name "set a" hobbs added on 2001-09-11 07:03:07: Logged In: YES user_id=72656 Given the two-legged crane of Tcl objects, I believe your fear is unfounded. It is a valid tclCmdNameType or it is not. I'm not seeing how that can be invalidated. msofer added on 2001-09-11 06:42:58: Logged In: YES user_id=148712 Consider set a 1 namespace eval tst { proc {set a} {args} {return YES} proc t {} {{set a}} t } After this the literal {set a} is a tclCmdNameType. If you now send the script {set a} to Tcl_EvalObjEx from the global namespace you may get in trouble. Now: command names with spaces are risky/difficult in any case. Plus, this is an abstract fear: I did not manage to trigger such an error from a script. Also: I have the same fear for a script accidentally shimmering into a pure list before being [eval]'ed, causing the pure list optimisation to produce an error. As far as I know, there are no reports of such a thing ever happening - so the fear is probably unfounded. hobbs added on 2001-09-11 06:14:33: Logged In: YES user_id=72656 Can you elaborate on how you think something could shimmer into tclCmdNameType without being a command name? msofer added on 2001-09-11 04:49:21: File Added - 10617: singleWord.patch Logged In: YES user_id=148712 I enclose a patch that avoids this shimmering. I do think the patch is safe; it would produce if ever something that is not a command namecould have shimmered into a tclCmdNameType Tcl_Obj ... This is most unlikely, but it could conceivably happen I guess ... msofer added on 2001-09-04 19:06:58: Logged In: YES user_id=148712 [set tcl_traceCompile 2] confirms the repeated compilation of "testproc" only in test0.

Attachments:

singleWord.patch3 [download] added by msofer on 2001-09-12 02:38:40. [details]
singleWord.patch2 [download] added by msofer on 2001-09-11 19:06:18. [details]
singleWord.patch [download] added by msofer on 2001-09-11 04:49:21. [details]