Author:         Arjen Markus <[email protected]>
State:          Draft
Type:           Informative
Vote:           Pending
Created:        02-Oct-2001
Post-History:   
Keywords:       installation,initialisation,embedded,resources

Abstract

This TIP describes the development and deployment of Tcl/Tk applications, with particular attention on how to embed the interpreter into executables written in C or C++.

Introduction and Background

Usually, an application that uses Tcl/Tk in some way uses an independent installation and the application itself is started via a standard shell, like tclsh or wish. There are numerous occasions when such a set-up is not convenient:

Installation of external software is not allowed unless the IT department at the client's site consents - a very reasonable approach to the uncountable problems that occur due to conflicting software in modern computing environments.
Distribution of a stand-alone program is much easier - less cluttering of disk space, all the files are in one directory tree, programs can easily be demonstrated or uninstalled.
If the scripting part solves a small problem in a larger environment that does not require Tcl/Tk, the extra megabytes and the separate installation seem an overkill.

Another reason to document the resources used by Tcl/Tk is that this provides better insight in how to tune Tcl/Tk for a particular application.

Two examples may illustrate the need for such stand-alone applications and what is involved:

When we were building an installation script for an MS Windows application using one of the commercial tools that are available for this arcane job, we ran into a bizarre limitation: text replacement was possible for the so-called Windows INI-files only, but not for other types of files. The text to be replaced was the name of the installation directory. After several trials with the programming constructs the tool allowed, we chose a much better solution: a small Tcl script wrapped into a stand-alone program using Freewrap. (The application itself now actually uses another stand-alone Tcl script to take care of the file management that was too complicated for ordinary DOS batch files.)
The second example involves a small program that proves the usefulness of Tcl/Tk in on-line visualisation. The idea there is that large computational programs can send their data at regular steps during the computation to a separate program that plots these results in some meaningful way. To achieve this the program exports the results to the Tcl interpreter which uses the socket command to send them to a (primitive) viewer. For demonstration purposes you must be able to copy the program along with some files it needs on an arbitrary computer and, later, remove it with just a little effort.

Applications that use Tcl/Tk as an embedded library to achieve their goals, rather than exist as extensions or applications written in Tcl, can be quite useful. Examples include on-line visualisation in large computational programs, network applications that can be deployed as a single file etc. There is, however, little documentation on how to build such applications and what is required for their installation.

The aim of this TIP is to provide guidelines that make this development easier:

How to create an interpreter and test it within the larger environment?
What you can and can not do with a bare interpreter?
How to enhance its capabilities, such that it works as in an ordinary Tcl shell?
What (binary and script) libraries are required?
How to deal with other programming languages than C/C++?
How to create applications that can be installed without an independent Tcl installation?

Related TIPs and Discussions

The are several TIPs at the moment of this writing that are in some way related to the subject:

[4] proposes to outline the release and distribution philosophy, so that it becomes easy to include generally useful extensions - the so-called "batteries included".
[12] focuses completely on the "batteries included" aspect of the source distribution.
[34] is intended to solve some of the more awkward issues of TEA, as the current build system actually requires separate versions for UNIX and Windows.
[55] defines the set-up of packages that can be automatically installed into an existing installation.
Postings on the news:comp.lang.tcl newsgroup frequently involve how to embed Tcl into a C application, with an emphasis on loading packages and the use of the Tcl_Init() function.
Recently discussions have been held about supporting programming languages other than C. Notably: Pascal, FORTRAN, Visual Basic.

Contents of the Planned Document

The document that should help programmers with the issues discussed here will have the following (tentative) table of contents:

Introduction, outlining its purpose.
Tcl's bootstrap procedure, describing how the usual shells work.
Creating interpreters, what a bare interpreter can and can not do, how to enrich it via start-up scripts like init.tcl.
Compiling and linking, the usual issues surrounding the making of a binary executable.
Interfacing to other programming languages, though possibly a huge subject, it will present some guidelines, both practical implementation and design issues.
Installation and deployment, should inform about the external resources (environment variables, libraries, etc) for the application.
Overview, provide a checklist of the various possibilities and how to achieve them, with pointers for further information.
Literature, all the good books and other references.

Discussion

Issues that arise are:

what is the simplest way to embed Tcl,
what resources are needed (in terms of script and binary libraries) by such an application,
how can the application find everything it needs?

This TIP is meant to be a document that enables programmers who do not have intimate knowledge of the Tcl core to build such application and deploy them in the way they want.

Should it turn out that some automated tool would be nice to help the programmers, then this TIP will also cover such a tool.

Using the Tcl library

There are numerous ways an application written mainly in a language like C can use the Tcl and Tk libraries (in short: Tcl):

The application can simply use Tcl as a convenient library of C routines. In that case, Tcl would provide such facilities as regular expressions or channels.
The application can use Tcl as a scripting tool, that is, it will call Tcl to evaluate scripts and import the results.
The application can use Tcl in a more complicated mixture: Tcl scripts get evaluated that require binary extensions (both defined outside the application and as an integral part of the application).
An application that uses Tcl need not be written in C, but could be written in any programming language that allows calls to and from C routines directly or indirectly.

Note: due to the fact that the author is mostly familiar with the UNIX/LINUX and Windows platforms, no comments will be made about the Macintosh. This is completely due to ignorance, not to arrogance.

In principle, using the Tcl/Tk libraries is very simple: just create a Tcl interpreter, fill it with variables, commands and so on and feed it scripts, either as a file or as a string. It gets more complicated in the following situations:

The interpreter must be able to handle packages and interact with the environment in much the same way as tclsh or wish.
The application needs to intermix its own processing with Tcl event loops (such as continuing a calculation while a Tk window shows the progress).
It must be possible to use the application independently from a full Tcl installation.

The key to a successful implementation is: understanding how to properly initialise Tcl.

The application with which we will illustrate the various options is a simple program without any virtues of its own. It will read some data from a file, perform an insanely complicated but further unspecified computation on these data and output them into some convenient format to file.

The application will have two versions, a simple one consisting of the three separate steps and a more complicated one consisting of a preliminary step and then a loop involving both the computation and the output.

Let us assume that the application is written in some convenient programming language like C. The reasons for using Tcl are:

Flexible input routines

By using the scripting capabilities of Tcl one can easily adapt the program to the input file or files that it should read.
Flexible output routines

Again, the scripting capabilities allow adapting the output to the customer's wishes, without having to recompile and link it. This can be done for simple files on disk, but also graphical output or storage in a database is possible, without changing the program itself.

The simplest way: create a bare interpreter

With the Tcl routine Tcl_CreateInterp() you can create an interpreter that is capable of all the basic commands:

  Tcl_Interp * interp         ;
  char       * input_filename ;
  char       * buffer         ;
  double       x, y, z        ;

  /* Create the interp, use it to read the given input file,
     Note:
     Using the string API for simplicity, no error checking
  */
  interp = Tcl_CreateInterp() ;
  Tcl_SetVar( interp, "input_file", input_filename, TCL_GLOBAL_ONLY ) ;
  Tcl_EvalFile( interp, startup_script ) ;

  /* Extract the input data
  */
  buffer = Tcl_GetVar( interp, "x", TCL_GLOBAL_ONLY ) ;
  Tcl_GetDouble( interp, buffer, &x ) ;
  buffer = Tcl_GetVar( interp, "y", TCL_GLOBAL_ONLY ) ;
  Tcl_GetDouble( interp, buffer, &y ) ;
  buffer = Tcl_GetVar( interp, "z", TCL_GLOBAL_ONLY ) ;
  Tcl_GetDouble( interp, buffer, &z ) ;

  /* Destroy the interp - if you do not need it any longer
  */
  Tcl_DestroyInterp( interp ) ;

The output routine contains a similar fragment (note, we assume the Tcl interpreter was stored somewhere):

  Tcl_Interp * interp                   ;
  char       * output_filename          ;
  char         buffer[TCL_DOUBLE_SPACE] ;
  double       a, b                     ;

  /* Export the results to the interpreter
  */
  Tcl_PrintDouble( interp, a, buffer ) ;
  Tcl_SetVar( interp, "a", buffer, TCL_GLOBAL_ONLY ) ;
  Tcl_PrintDouble( interp, b, buffer ) ;
  Tcl_SetVar( interp, "b", buffer, TCL_GLOBAL_ONLY ) ;

  Tcl_SetVar( interp, "output_file", input_filename, TCL_GLOBAL_ONLY ) ;
  Tcl_EvalFile( interp, report_script ) ;

To add error checking (always do!), use code like this:

  Tcl_Channel errChannel ;

  if ( Tcl_EvalFile( ... ) != TCL_OK ) {
     errChannel = Tcl_GetStdChannel( TCL_STDERR ) ;
     if ( errChannel != NULL ) {
        TclWriteObj( errChannel, Tcl_GetObjResult(interp) ) ;
        TclWriteChars( errChannel, "\n", -1 ) ;
        ... /* Quit the program or other error handling? */
     }
  }

With this approach you need only to worry about the Tcl binary libraries: if the dynamic versions are linked to your application, then distribution of your application should include these binaries. If, on the other hand the static versions are used, your application already contains all of Tcl it needs all by itself.

The limitations of this approach are:

The utilities ordinarily defined via the Tcl initialisation script init.tcl are not available. (Note that these include such procedures as tclPkgSetup and unknown)
The Tcl variables argc, argv and argv0 are not set. This may be problematic if you want to use these variables to communicate with the user, e.g. provide an initial script file on the command-line.
Character encodings are not available. This will limit your application to ASCII characters.

Complete initialisation: the role of init.tcl

The next section outlines the full initialisation procedure that is used in the standard tclsh shell. This section concentrates instead on some practical observations:

The routine Tcl_FindExecutable() does a lot more than its name suggests: it is responsible for initialising the various subsystems in a controlled way, it will find all the character encodings.

It has to be called very early, before creating an interpreter. The results are stored in private variables that are used for all threads.

If it can not find the executable, no harm is done: it will have initialised the subsystems anyway.
The routine Tcl_Init() is responsible for setting up the script library by evaluating the script init.tcl. It should be called after the creation of an interpreter, to add the various commands to it.

If it can not find this script, it will return with an error.

(The routine actually has two additional hooks to allow customisation, but these will probably be used in unusual circumstances only.)

The script init.tcl and any it sources (directly or indirectly via auto_load) must be found via the tcl_library variable. On UNIX this variable is initialised via the TCL_LIBRARY environment variable is used, whereas on MS Windows the pathname of the Tcl DLLs is used as well.

As long as these scripts can be found, they can actually reside in a large number of directories with names related to the Tcl library path.

This leads to the following code to create a full-fledged interpreter:

  Tcl_Interp * interp         ;

  /* Initialise the Tcl library thouroughly
  */
  Tcl_FindExecutable( argv[0] ) ;

  /* Create the interp, evaluate "init.tcl" for the script
     level initialisation.
  */
  interp = Tcl_CreateInterp() ;

  if ( Tcl_Init( interp ) != TCL_OK ) {
     ... Report the error
  }

With init.tcl loaded, we have a number of additional commands and global variables:

tclLog, unknown, auto_load, auto_execok are the most important ones.
auto_path, errorInfo, errorCode

To create an interpreter that can handle Tk as well, you should be aware of the following:

Tk-able interpreters always need to be initialised via Tk_Init() and therefore require the start-up scripts: these scripts contain the default bindings and resource definitions and are therefore indispensable for Tk.
An application written using Tk needs to process events in a well-defined event loop.

TODO: how to write the event loop, what choices are available?

Initialisation via the standard shell

The details of the initialisation done in the standard tclsh shell are quite intricate. They involve, in addition to the initialisation via Tcl_FindExecutable() and Tcl_Init() also:

processing the command-line arguments
customisation via various hooks
preparing the Tcl parser by setting the locale to "C", as only this guarantees everything works as expected.

A summary of the steps found in the initialisation code is given below:

main() is a system-dependent routine which:

* sets the locale (Windows version)

* parses the command-line according to the UNIX rules (Windows version)

* calls Tcl_Main(), which is not supposed to return
Tcl_Main() takes as arguments the well-known argc/argv command-line arguments and a pointer to the initialisation routine, which in the case of tclsh is Tcl_AppInit():

* After calling Tcl_FindExecutable(), processing the command-line arguments and calling the initialisation routine, it can do either of two things:

* Evaluate the script file, if the first argument does not start with a minus sign

* Or go into an interactive loop to read the commands from the prompt. The preparation in that case is to evaluate the start-up script (such as ~/.tclshrc or ~/tclshrc.tcl)

* It exits by evaluating the Tcl "exit" command, not by calling the C routine exit() directly
The standard initialisation routine Tcl_AppInit() is meant to initialise the various application-specific commands and static packages via routines like Tcl_CreateCommand(). It also sets the Tcl variable "tcl_rcFile" to the user's start-up script.

(Curiously, the standard routine is found in a platform-dependent source file, tclXXXInit.c)
Tcl_Init() by the way provides two hooks for customisation:

* A pre-initialisation script that gets evaluated when the static variable "tclPreInitScript" has been set.

* The initScript variable that defines a Tcl procedure that looks up the init.tcl script.

Thus, before the shell is ready for processing, a lot of initialisation is done. Much of this process can be customised without the need to change the standard source files.

Overview

This section provides an overview of the resources that an application requires, given the type of usage:

Bare Tcl only interpreter:

Just the Tcl dynamic libraries

Complete initialisation for Tcl only:

The Tcl dynamic libraries
The environment variable TCL_LIBRARY
The initialisation script file init.tcl
The character encoding tables (optional)

Customised Tcl shell (adapted Tcl_AppInit()):

The Tcl dynamic libraries
The environment variable TCL_LIBRARY
The initialisation script file init.tcl
The character encoding tables (optional)
Possibly a so-called RC file to define the initialisation for interactive use

Customised Tk shell (wish; adapted Tk_AppInit()):

The Tcl and Tk dynamic libraries
The environment variables TCL_LIBRARY and TK_LIBRARY
The initialisation script file init.tcl, and the Tk specific bindings (in tk.tcl and others)
The character encoding tables (optional)
Possibly a so-called RC file to define the initialisation for interactive use

Equally important are the limitations:

Bare Tcl only interpreter:

No customisable initialisation (not automatically)
No access to the command-line arguments or the directory that contains the executable
No alternative character encodings
Possibly problems loading packages, as the auxiliary procedures for this are defined in init.tcl and others.
Possibly problems with the locale (best to explicitly set it to "C")
No interactive use

Complete initialisation for Tcl only:

Possibly problems with the locale (best to explicitly set it to "C")
No interactive use

Customised Tcl shell (adapted Tcl_AppInit()):

None

Customised Tk shell (wish; adapted Tk_AppInit()):

None

Compiling and linking

Nowadays, it seems the default to use dynamic or shared libraries. So, with many installations, there will exist dynamic versions of the libraries and sometimes there will be no static versions. This has a number of advantages:

The executables are much smaller, the memory usage can be smaller as well, as the code will be shared.
The libraries can be replaced without the need to rebuild the application. This is especially true if you enable the use of stubs for your binary packages (see below).

However, as the Tcl libraries now reside outside your application, they will have to be shipped with the application and the dynamic loader must somehow be able to find the libraries. The latter certainly has consequences: each system tends to have its own method.

When you have the Tcl/Tk sources, you can decide to create your own libraries. Of special interest are the following two situations:

You want to be able to use the stubs facility, as this makes it possible to run with different versions of Tcl/Tk with the same binary.
You want to get rid of as much extra stuff outside your application as possible, so you want to use the static version of the Tcl libraries.

Stubs were introduced to make binary extensions and applications independent of the specific Tcl version. They are enabled by defining the macro TCL_USE_STUBS during the compilation and linking of the Tcl/Tk library and especially your own extension.

In the initialisation procedure for your pacakge or application you need to initialise the stubs jump table via Tcl_InitStubs():

 #ifdef USE_TCL_STUBS
    if (Tcl_InitStubs(interp, "8.1", 0) == NULL) {
       return TCL_ERROR;
    }
 #endif

(details: http://mini.net/tcl/1687.html )

The technique, as Brent Welch explains, is simple in principle:

By enabling stubs, all calls to Tcl routines are turned into function pointers. These pointers are kept in a large table that is filled with the correct pointer values via the Tcl_InitStubs() routine.

Linking your application or extension should then be done against the "stub version" of the Tcl/Tk libraries.

If you do not want dynamic libraries, then perhaps a build with the option STATIC_BUILD is a solution. With this option, static libraries are built. The libraries are then incorporated into the executable itself.

Note: On some platforms, notably Windows, the specific calling convention is then turned to standard C (with dynamic libraries, the calling convention exports the various routines explicitly).

When you do not care about the dynamic libraries having to be present, at least be aware of the way the various systems want to define their position.

The information above is summarised as follows:

Using dynamic libraries:

Most UNIX versions and LINUX use the environment variable LD_LIBRARY_PATH, colon-separated just like PATH to indicate the position of dynamic libraries.
Some use the variable SHLIB_PATH instead (notably: HPUX).
Under Windows (all flavours) the PATH variable is used and a predefined sequence of directories to find the DLL's. One important case is that the libraries are found in the same directory as the executable.

Building for general Tcl versions:

Compile your sources with the macro TCL_USE_STUBS
Use the proper call to Tcl_InitStubs() to initialise the jump table.
Link against the stub versions of the Tcl/Tk libraries.

Building statically:

Use the flag STATIC_BUILD to build the static Tcl/Tk libraries.
Use this flag for your own sources as well
Link against the static versions.

Copyright

This document is placed in the public domain.

TIP 66: Stand-alone and Embedded Tcl/Tk Applications