Tcl Source Code

Ticket UUID: 680169
Title: Tcl interpreter pkgIndex 'cache'
Type: RFE Version: None
Submitter: lvirden Created on: 2003-02-04 13:29:30
Subsystem: 39. Package Manager Assigned To: dgp
Priority: 5 Medium Severity:
Status: Open Last Modified: 2012-05-03 00:58:32
Resolution: None Closed By:
    Closed on:
Description:
 A feature I believe would be useful is a master
package index cache that would be updated each time a
package was installed into a subtree.

If the cache were present, it would be, by default, the
only place searched for a package's presence (though
this could be overridden by an application if so desired).

This would reduce the startup time of a tcl
interpreter, which in a general case can be quite
lengthy if a site has 50 or more extensions installed.
User Comments: basilik99 added on 2012-05-03 00:58:32:
Please find attached script in new feature request ID 3523089 "Tcl interpreter pkgIndex 'cache' - new version".

dkf added on 2012-05-02 20:37:07:
SF's pretty restrictive about who can manipulate attachments (probably because of abuse). Open a patch ticket and just comment here with its artifact ID.

basilik99 added on 2012-05-02 02:13:13:
Oops. Cannot attach file as I am not ticket owner... Could I send you an email dgp so that you attach it?

basilik99 added on 2012-05-02 02:02:05:
I attached an enhanced version of dgp's script. I run Tcl 8.5 on Windows. I put the code right after the Windows implementation of auto_execok.

1- Replaced "~" with "$env(APPDATA)" as I am on Windows.

2- Replaced ugly foreaches (sorry dgp!) with the new "lassign" command.

Now for real changes:
3- Fixed a problem which was probably not present in Tcl 8.4 (for which you initially developed the patch, right?). The "version" parameter sometimes gets set to "0-", which [package ifneeded] does not accept as a version. So, when reading the cache, I skip the call to ifneeded when no version was found. In that case ifneeded could only return "" anyway.

4- The proposed solution worked for almost all packages, but not for Thread. Indeed, its pkgIndex.tcl contains:
package ifneeded Thread 2.6.5 [list thread_load $dir]
where "thread_load" is a proc declared in the pkgIndex.tcl. So caching only [package ifneeded] scripts was not enough. I had to cache the pkgIndex.tcl itself. I didn't copy the pkgIndex.tcl's content, but only its path. Then I sourced it while having set $dir. I needed a special trick at the end to effectively execute the script inside the [package ifneeded], not only the pkgIndex.tcl's content. For TM, I used the previous solution.
I am seeking your comments about this change.
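
A minimal sketch of the idea in item 4, not the attached script: the cache keeps only the path of each pkgIndex.tcl, and a placeholder [package ifneeded] entry re-sources that file with $dir set before running the real script. The proc names cacheIndexEntry and cachedIndexLoad are hypothetical.

```tcl
# Hypothetical helpers, not part of any patch posted on this ticket.

# Register a placeholder entry that remembers only the pkgIndex.tcl path.
proc cacheIndexEntry {indexFile name version} {
    package ifneeded $name $version \
        [list cachedIndexLoad $indexFile $name $version]
}

# Runs when [package require] selects a placeholder entry.
proc cachedIndexLoad {indexFile name version} {
    set placeholder [package ifneeded $name $version]
    # pkgIndex.tcl scripts expect $dir to hold their own directory.
    set dir [file dirname $indexFile]
    # Sourcing redefines helper procs such as thread_load and lets the
    # index replace the placeholder with its real script.
    source $indexFile
    set script [package ifneeded $name $version]
    if {$script eq $placeholder} {
        error "$indexFile did not register $name $version"
    }
    # Run the real script at global level, as [package require] would.
    uplevel #0 $script
}
```

A cache file built this way would hold one cacheIndexEntry call per package version. The guard against an unchanged placeholder is an addition of this sketch; it avoids endless recursion when an index decides not to register the package.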

With your initial patch, I got the following:
% time {package require md5}
115286 microseconds per iteration
% time {package require tdom}
125298 microseconds per iteration
% time {package require Tclx}
468442 microseconds per iteration

With my new patch, it is similar, but a bit slower. But that's acceptable for me as the solution is generic:
% time {package require md5}
178442 microseconds per iteration
% time {package require tdom}
138962 microseconds per iteration
% time {package require Tclx}
497901 microseconds per iteration

Previously, loading md5 took 5 seconds, tdom around 3 seconds, same for Tclx!

Now for questions:
Q1- I do not understand the benefit of putting the cache in the Tcl installation directory (which is a network drive for me). The cache is per-machine anyway, so a cache in the user's local directory would be enough, I think. Moreover, for a single machine, we reached a cache file of around 5000 lines. When 10 or 20 users use that cache system, the file will get huge, and probably slow to [source].

jenglish added on 2003-12-16 06:49:57:

File Added - 70683: mkcache.tcl

Logged In: YES 
user_id=68433

Attached mkcache.tcl is the script I've been using to
rebuild the package cache.

The script is very simple-minded: first it [package forget]s
everything (to clear the cache); then it calls
[tclPkgUnknown] to source all the pkgIndex.tcl files;  and
finally it outputs the [package ifneeded] script for
everything discovered in part 2.
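
The three steps described above might look roughly like the following. This is a sketch under stated assumptions, not the attached mkcache.tcl: the proc name buildPackageCache is hypothetical, and unlike the description it leaves already-loaded packages alone, since many index scripts test [package provide Tcl].

```tcl
# Hypothetical sketch of the three steps; not the attached mkcache.tcl.
proc buildPackageCache {} {
    # 1. Forget what is known so the result reflects only a fresh scan.
    #    Assumption: packages that are already loaded are kept.
    foreach pkg [package names] {
        if {[package provide $pkg] eq ""} {
            package forget $pkg
        }
    }
    # 2. Ask the default handler for a package that cannot exist, which
    #    makes it source every pkgIndex.tcl reachable from $::auto_path.
    catch {tclPkgUnknown no-such-package-for-cache-rebuild 0}
    # 3. Collect one [package ifneeded] command per discovered version.
    set cmds {}
    foreach pkg [lsort [package names]] {
        foreach ver [package versions $pkg] {
            set script [package ifneeded $pkg $ver]
            if {$script ne ""} {
                lappend cmds [list package ifneeded $pkg $ver $script]
            }
        }
    }
    return $cmds
}

# Print the cache script; redirect it to a file such as pkgCache.tcl.
puts [join [buildPackageCache] \n]
```

Because tclPkgUnknown only walks $::auto_path, Tcl modules (.tm files) found by the module handler in 8.5 and later would not appear in the output.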

jenglish added on 2003-10-22 04:06:03:

File Added - 65010: tcl-pkgcache.patch

Logged In: YES 
user_id=68433

Attached tcl-pkgcache.patch is an alternate approach (which
I've been using for several months now locally).  I believe
this is the minimum necessary and sufficient change to
init.tcl:

+    if {[info library] != ""} {
+       variable pkgCache [file join [info library] pkgCache.tcl]
+       if {[file exists $pkgCache]} { 
+           source $pkgCache
+       }
+    }

This just sources the file $tcl_library/pkgCache.tcl on
startup, if it exists.  Generating and maintaining the
package cache is left up to local administrative policies.

nobody added on 2003-07-11 12:20:38:
Logged In: NO 

Works *much* better now; my 5+ second delay before getting a
prompt is now gone!

Script started on Thu Jul 10 23:55:48 2003
sh$ time tclsh
exit
% exit

real    0m5.556s
user    0m1.587s
sys     0m1.318s

sh$
sh$ # After patch to init.tcl
sh$

sh$ time tclsh
exit
% exit

real    0m6.073s
user    0m0.757s
sys     0m1.368s

sh$ # The .tclPackageIndexCache was created
sh$
sh$ time tclsh
exit
% exit

real    0m0.750s
user    0m0.287s
sys     0m1.124s
sh$ exit
Script done on Fri Jul 11 00:03:10 2003

Dave Bodenstab

dgp added on 2003-03-05 03:38:15:
Logged In: YES 
user_id=80530

FWIW, the caches should probably
also be per-[package unknown] so
that caches of one unknown handler
don't interrupt a different one.

dgp added on 2003-03-05 03:35:22:
Logged In: YES 
user_id=80530


If the central cache is truly dangerous, then
it's very simple to remove [info library] and
make use of per-user caches only.

Sadly, yes, all the per-* restrictions are needed.
The only way to get around that is to cache
the index scripts themselves rather than the
results of running them.  Too much depends
on interpreter state to do otherwise.

dgp added on 2003-03-05 03:27:38:
Logged In: YES 
user_id=80530

just to clarify one point: I do not
propose that this patch be
integrated into the official Tcl
sources.  Instead, it should
remain here as an aftermarket
add-on that sysadmins can
add (and tweak) *if* they find
that startup times on their systems
are a serious problem.

Alternative approaches are welcome
and could also be offered on the
same terms.

jenglish added on 2003-03-05 03:20:33:
Logged In: YES 
user_id=68433

This approach looks overly complicated and error-prone to
me.

It's especially dangerous to automatically generate the
central package index in [info library] in the [package
unknown] handler -- this is something that should be left up
to the system administrator to do by hand.

Is it necessary to parameterize the cache by [info
hostname], [info nameofexecutable], and the value of
$::auto_path? Why would the same program need a different
cache on different hosts?  (Actually, I think it would be
best if any Tcl program with the same [info library] could
use the same package cache, but there are pkgIndex.tcl
scripts which use (IMO overly-)clever heuristics to decide
whether or not to register 'ifneeded' scripts, so checking
[info nameofexecutable] might be appropriate).

Instead of parameterizing the cache by the $::auto_path
value, it might make more sense to keep per-directory
caches; [package unknown] could check for a cache before
scanning for pkgIndex.tcl files while processing
$::auto_path.
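
A rough sketch of the per-directory idea, assuming each cache is a file named pkgCache.tcl holding [package ifneeded] commands for everything below its directory. The proc name scanWithDirCaches and the cache layout are assumptions, not something taken from a posted patch.

```tcl
# Hypothetical sketch of a per-directory scan; pkgCache.tcl is an assumed name.
proc scanWithDirCaches {{cacheName pkgCache.tcl}} {
    foreach root $::auto_path {
        set cache [file join $root $cacheName]
        if {[file readable $cache]} {
            # A cache stands in for every index below this directory.
            uplevel #0 [list source $cache]
            continue
        }
        # No cache: fall back to sourcing the index scripts themselves.
        set indexes [concat \
            [glob -nocomplain -directory $root pkgIndex.tcl] \
            [glob -nocomplain -directory $root -join * pkgIndex.tcl]]
        foreach index $indexes {
            # pkgIndex.tcl scripts expect $dir to hold their own directory.
            set dir [file dirname $index]
            if {[catch {source $index} msg]} {
                puts stderr "error reading $index: $msg"
            }
        }
    }
}
```

With this layout, installing a package into one directory would only require regenerating that directory's cache, and directories without a cache behave as they do today.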

dgp added on 2003-03-05 00:30:42:

File Deleted - 44022:

dgp added on 2003-03-05 00:30:41:

File Added - 44025: indexCache.patch

Logged In: YES 
user_id=80530

corrected patch

dgp added on 2003-03-05 00:27:07:

File Added - 44022: indexCache.patch

Logged In: YES 
user_id=80530

here is a first pass at a patch that
implements a cache for package indexes.

Its main method of maintaining
agreement with what is installed
is via a cache miss.  A miss triggers
a rebuild of the cache.  So, at any
time you can rebuild the caches by

    package require non-existent-package

Until the scheme becomes more sophisticated,
users of it should do that each time a package
is installed/uninstalled.  Revisions to improve on
this are welcome.
