Tcl Source Code

View Ticket
Login
Ticket UUID: 1737021
Title: unacceptable performance gap for switch
Type: Bug Version: obsolete: 8.5a6
Submitter: nobody Created on: 2007-06-14 09:51:55
Subsystem: 43. Regexp Assigned To: dkf
Priority: 5 Medium Severity:
Status: Closed Last Modified: 2007-06-16 21:02:20
Resolution: Wont Fix Closed By: dkf
    Closed on: 2007-06-16 14:02:20
Description:
A small program (attached) takes 2 seconds to operate. It contains a single switch statement executed 10000 times.

When you slighly modify one of the patterns (put a ^ in front of it), the execution time is multiplied by 9i (it then takes about 20 seconds to operate). This seems unacceptable to me.

This has been observed under at least Cygwin, GNU/Linux and Solaris 8. My e-mail address is Denis.Excoffier AT free.fr.
User Comments: [email protected] added on 2007-06-14 22:57:28:
Logged In: NO 

Thank you for all: the explanations, and the hint that storing the RE (in its entirety) in a global variable allows the interpreter to compile it only once.

Those interested in enlarging the thread-cache can probably try to modify "#define NUM_REGEXPS 30" in ./generic/tclRegexp.c.

dkf added on 2007-06-14 21:10:52:
Logged In: YES 
user_id=79902
Originator: NO

I *think* the thread-cache can hold 30 elements, and so if one of the REs in your example is spilled to a variable (or a literal), it turns out that all the RE compilations become cached (either in Tcl_Obj value caches or in the thread-cache) and that makes things much faster. I'm not keen on tweaking the size of the thread-cache, since it is a cost paid for by everyone; you can get better performance by doing something like this:
  # _Once_ only...
  foreach {k v} [array get global] {
     set re($k) ^$v
  }
  # In the loop
  global re
  switch -regexp -- $x $re(v00) - $re(v01) - $re(v02) - $re(v03) - ... {
     ....
  }

If you use *that*, you'll find that things go faster (since the re array will manage the caches for you).

[email protected] added on 2007-06-14 17:48:50:
Logged In: NO 

Sorry for that, the program included already has the ^ at the beginning of the line containing "v30".
If you remove the ^ in this line, the program goes 9 times faster.

The problem is not on the performance itself, but on the difference of execution times with or without this
^ in front of the "v30" line.

dkf added on 2007-06-14 17:08:36:
Logged In: YES 
user_id=79902
Originator: NO

I'm not quite sure what I'm supposed to do to trigger the change (all the patterns already have ^ characters!) but I would note that you seem to be going out of your way to make your code inefficient. In particular, you use lots of runtime-generated patterns, which means that you defeat both the in-value and per-thread RE caches, forcing recompilation of each pattern every time through the loop. That sucks, but you've managed to create a horrible degenerate case.

If you change your code so that you keep each *whole* RE in the "global" array instead of just fragments, you'll find the code goes much faster. (Alternatively, use fewer REs so that the per-thread cache isn't exceeded, but that might not be easy with real-life REs.)

Otherwise, I've no idea what I'm supposed to do about this to magically make your code efficient. (Caching RE fragments would be horribly tricky to implement - it requires lots of changes in lots of places, including wholly new APIs - but that's the only thing I can think of which would guarantee to make your code zippy...)

[email protected] added on 2007-06-14 16:51:55:

File Added - 233011: tclunefficient

Attachments: