TIP 430: Add basic ZIP archive support to Tcl

Author:         Sean Woods <yoda@etoyoc.com>
Author:         Donal Fellows <donal.k.fellows@manchester.ac.uk>
Author:         Poor Yorick <tk.tcl.tip@pooryorick.com>
Author:         Harald Oehlmann <oehhar@users.sourceforge.net>
State:          Draft
Type:           Project
Vote:           Pending
Created:        03-Sep-2014
Post-History:
Keywords:       virtual filesystem,zip,tclkit,boot,bootstrap
Tcl-Version:    8.7

Abstract

This proposal will add basic support for mounting zip archive files as virtual filesystems to the Tcl core.

Target Tcl-Version

This TIP targets Tcl version 8.7.

Rationale

Tcl/Tk relies on the presence of a file system containing Tcl scripts for bootstrapping the interpreter. When dealing with code packed in a self-contained executable, a chicken-and-egg problem arises when developers try to provide this bootstrap from their attached VFS with extensions like TclVfs. TclVfs runs in the Tcl interpreter. The interpreter needs init.tcl, which would mean that the filesystem containing init.tcl is not present until after TclVfs mounts it, yet that mount cannot happen until after init.tcl has been loaded. Bootstrap filesystem mounts require built-in support for the filesystem that they use.

With the inclusion of Zlib in the core (starting with 8.6, [234]), all that is required to implement a zip file system based VFS is to add a C-level VFS implementation to decode the zip archive format. Thus: this project.

Note that we are prioritizing the zip archive format also because it is practical to generate the files without a Tcl installation being present; it is a format with widespread OS support. This makes it much easier to bootstrap a build of Tcl that uses it without requiring a native build of tclsh to be present.

Specification

New commands shall be added to safe interpreters within Tcl, all of them in the ::zipfs namespace. These commands shall include:

  • zipfs mount ?archive? ?mountpoint?

    Mounts the ZIP file archive at the location given by mountpoint, which will default to zipfs:/archive if absent. With no arguments this command describes all current mounts, returning a list of pairs.

  • zipfs root

    Returns the root mount point for Zipfs file systems. On Windows this returns zipfs:/. On all other platforms this returns //zipfs:/.

  • zipfs unmount archive

    Unmounts the ZIP file archive, which must have been previously mounted.

VFS Mount Point

On Windows, ZipFs will mount all archives under zipfs:/. On all other platforms, ZipFs will mount all archives under //zipfs:/. The root in use on the current platform can be obtained by calling zipfs root. For the remainder of this document, the mount point for zipfs will be referred to as ZIPFS_ROOT.

Volumes may be mounted at any point under ZIPFS_ROOT, and if a mount point does not start with ZIPFS_ROOT the path will be considered relative to ZIPFS_ROOT. This convention avoids some confusing interactions between file normalize and glob that differ between Windows and Unix and that make building global paths either hop volumes or interact with the native file system.
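
The relative-path rule above can be sketched in a few lines of C. This is only an illustration, not the code in the core_zip_vfs branch: the names zipfs_root() and resolve_mount_point() are hypothetical, and stripping leading slashes from a relative mount point is an assumption this TIP does not specify.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Root for the current platform, as described in the text above. */
const char *zipfs_root(void)
{
#ifdef _WIN32
    return "zipfs:/";
#else
    return "//zipfs:/";
#endif
}

/*
 * Hypothetical helper: write the absolute mount point for mntpt into out.
 * Returns 0 on success, -1 if the result does not fit in outlen bytes.
 */
int resolve_mount_point(const char *root, const char *mntpt,
                        char *out, size_t outlen)
{
    size_t rootlen;
    int n;

    assert(root != NULL && mntpt != NULL && out != NULL);
    rootlen = strlen(root);

    if (strncmp(mntpt, root, rootlen) == 0) {
        /* Already under the root: use the path as given. */
        n = snprintf(out, outlen, "%s", mntpt);
    } else {
        /* Assumption: drop leading slashes, then prefix the root,
         * which already ends in '/'. */
        while (*mntpt == '/') {
            mntpt++;
        }
        n = snprintf(out, outlen, "%s%s", root, mntpt);
    }
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Under these assumptions, resolving "app" against zipfs_root() on Unix would give //zipfs:/app, while a path that already begins with the root is left unchanged.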

Having a fixed mount point breaks from the tradition of mounting volumes under / or info nameofexecutable that other zipfs implementations use. However, if a kit builder wishes to retain that capability, all that is required is to load their own zipfs implementation using the conventional shims provided for kit building. The function names for the core implementation have been modified to not conflict with zipfs implementations that are out in the wild.

Implementation

I have adapted Richard Hipp's work on Tcl As One Big Executable (TOBE) to operate inside of a modern Tcl. That implementation consists of one C file (tclZipvfs.c). I have also prepared a set of kit-like behaviors for the core to express when tclAppInit.c is not compiled with a TCL_LOCAL_MAIN_HOOK defined. Those behaviors reside in the TclZipfs_AppHook() function.

This work is checked in as the "core_zip_vfs" branch on both Tcl and Tk.

Modifications to tclBasic.c

tclBasic.c will contain a call to TclZipfs_Init(), which initializes the C-level state needed to implement zipfs and injects the zipfs command into the interpreter.

New C File tclZipFS.c

This file is a self-contained implementation in C of a zip based VFS. It includes all functions needed for implementing zipfs.

Modifications to tclAppInit.c

tclAppInit.c will now call TclZipfs_AppHook() if no TCL_LOCAL_MAIN_HOOK was defined.

Modifications to the Tcl build system

Tcl will now build a copy of the minizip program, whose source is currently distributed in /compat/zlib/contrib/minizip. The tcl.m4 macro now detects whether the compiler in use can produce native executables and, where it cannot, searches for a C compiler that can and substitutes that value into the Makefile as HOST_CC. That compiler generates a native minizip executable, built in the same directory as Tcl, which is used for all archive creation.

A new build target libtcl_MAJOR_MINOR_PATCHLEVEL.zip is created from the /library directory in the Tcl sources. For static library installs, this archive is copied to the Tcl standard install location. For shared library builds, this archive is appended to the dynamic library.

Modifications to the /library file system

To reduce the complexity of building archives, init.tcl has been modified to look for the presence of an adjacent file pkgIndex.tcl. That file contains all of the package ifneeded calls that direct the core to find the core-distributed packages relative to the location of tcl_library. Unlike other pkgIndex.tcl files, this file must be manually maintained and kept up to date as package names and versions are changed, added, or removed.

C API

  • int TclZipfs_AppHook(int *argc, char ***argv);

    • If the current executable has an attached zip file system, mount that to ZIPFS_ROOT/app.

    • If the file ZIPFS_ROOT/app/main.tcl exists, register that file as the process startup script.

    • If the file ZIPFS_ROOT/app/tcl_library/init.tcl exists, register ZIPFS_ROOT/app/tcl_library/init.tcl as tcl_library.

    • If the file ZIPFS_ROOT/app/tk_library/init.tcl exists, register ZIPFS_ROOT/app/tk_library/init.tcl as tk_library.

    • If tcl_library was not set, the function will then scan the local environment for a zipfs file system attached to either the Tcl dynamic library or an archive named libtcl_MAJOR_MINOR_PATCHLEVEL.zip. That file can be either in the present working directory or in the standard system install location for Tcl.

  • int TclZipfs_Mount(Tcl_Interp *interp, const char *zipname, const char *mntpt, const char *passwd);

    Mounts a zip file zipname to the mount point mntpt. If passwd is non-null, that string is used as the password to decrypt the contents. mntpt will always be relative to zipfs:.

  • int TclZipfs_Unmount(Tcl_Interp *interp, const char *zipname);

    Unmounts the file system created by a prior call to TclZipfs_Mount().
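
The sequence of checks listed for TclZipfs_AppHook() above can be modelled as a small decision function. The sketch below is not the implementation: AppHookPlan, ExistsFn, plan_app_hook() and only_main_tcl() are hypothetical names introduced for illustration, and the file-existence test is passed in as a callback so that the logic can be followed without a mounted archive.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record of what the hook would do; field names are illustrative. */
typedef struct {
    int run_main_tcl;       /* ZIPFS_ROOT/app/main.tcl becomes the startup script */
    int set_tcl_library;    /* tcl_library is taken from the attached archive */
    int set_tk_library;     /* tk_library is taken from the attached archive */
    int search_for_tcl_zip; /* fall back to the dynamic library or libtcl_*.zip */
} AppHookPlan;

/* Hypothetical callback: returns non-zero if path exists. */
typedef int (*ExistsFn)(const char *path);

/* Build ZIPFS_ROOT/app/<rel> and ask the callback whether it exists. */
static int probe(const char *root, const char *rel, ExistsFn exists)
{
    char path[256];
    int n = snprintf(path, sizeof path, "%sapp/%s", root, rel);

    return n > 0 && (size_t)n < sizeof path && exists(path);
}

AppHookPlan plan_app_hook(const char *root, ExistsFn exists)
{
    AppHookPlan plan;

    assert(root != NULL && exists != NULL);
    plan.run_main_tcl = probe(root, "main.tcl", exists);
    plan.set_tcl_library = probe(root, "tcl_library/init.tcl", exists);
    plan.set_tk_library = probe(root, "tk_library/init.tcl", exists);
    /* Only when the attached archive supplies no tcl_library does the
     * search of the local environment take place. */
    plan.search_for_tcl_zip = !plan.set_tcl_library;
    return plan;
}

/* Stand-in predicate for demonstration: an archive holding only main.tcl. */
int only_main_tcl(const char *path)
{
    return strcmp(path, "//zipfs:/app/main.tcl") == 0;
}
```

Calling plan_app_hook("//zipfs:/", only_main_tcl) models the wrapped executable built in the next section: main.tcl is run, and because the archive carries no tcl_library, the fallback search is requested.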

Creating a wrapped executable

With this TIP, producing a wrapped executable is now a matter of:

mkdir myvfs.vfs
cd myvfs.vfs
echo "puts {hello world}" > main.tcl
zip -r ../hello.zip .
cd ..
cp tclsh8.7 hello
cat hello.zip >> hello
./hello
> hello world
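
Appending works because a ZIP archive is located from its end: the end-of-central-directory record starts with the bytes "PK\x05\x06", has a fixed part of 22 bytes, and is followed only by an optional comment whose length is stored in the last two bytes of that fixed part. This TIP does not describe how the attached archive is detected, so the following is only a sketch of one plausible check; find_eocd() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EOCD_MIN_SIZE 22 /* fixed part of the end-of-central-directory record */

/*
 * Return the offset of the end-of-central-directory record in buf,
 * or -1 if the buffer does not end with a well-formed record.
 */
long find_eocd(const unsigned char *buf, size_t len)
{
    static const unsigned char sig[4] = { 'P', 'K', 0x05, 0x06 };
    size_t pos;

    assert(buf != NULL);
    if (len < EOCD_MIN_SIZE) {
        return -1;
    }

    /* Scan backward from the last position where the record could start. */
    for (pos = len - EOCD_MIN_SIZE + 1; pos-- > 0; ) {
        if (memcmp(buf + pos, sig, sizeof sig) == 0) {
            /* Little-endian comment length at offset 20 of the record. */
            size_t comment_len = buf[pos + 20] | ((size_t)buf[pos + 21] << 8);

            /* Accept only a record whose comment runs exactly to the end. */
            if (pos + EOCD_MIN_SIZE + comment_len == len) {
                return (long)pos;
            }
        }
    }
    return -1;
}
```

A production scanner would normally limit the backward search to the last 22 + 65535 bytes, since the comment cannot be longer than that; the loop here scans the whole buffer for simplicity.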

Copyright

This document has been placed in the public domain.
