TIP 195: A Unique Prefix Handling Command

Author:         Peter Spjuth <peter.spjuth@space.se>
Author:         Peter Spjuth <peter.spjuth@gmail.com>
State:          Final
Type:           Project
Vote:           Done
Created:        02-May-2004
Keywords:       Tcl
Obsoletes:      105
Tcl-Version:    8.6

Abstract
This TIP adds a new command to support matching of strings to unique prefixes of patterns, similar to Tcl's existing subcommand-name matching or Tk's option-name matching.

Rationale
When code (particularly in script libraries) wants to support shortest-unique-prefix matching in the manner of the Tcl core (as provided by Tcl_GetIndexFromObj), currently either the prefixes have to be precomputed (by hand or by script) or the matching has to be done backwards. The first approach is either error-prone or requires an extra piece of code that the programmer has to develop. In the second, the string being matched has to be converted into a pattern which is matched against the list of supported options in some way; this is either inefficient or hazardous if the string contains characters that are meaningful to the matching engine being used. It would be far nicer if the core supported this directly, so that script authors could just say what they mean.

See also [105], which the text above comes from.
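The hazard of matching "backwards" can be illustrated with a small sketch. The helper naiveMatch below is hypothetical (it is not part of this proposal or of Tcl); it stands in for the kind of hand-rolled matcher a script library might use, and the comparison assumes a Tcl version (8.6) that provides ::tcl::prefix as specified in this TIP.

```tcl
# Hypothetical hand-rolled matcher: turn the input into a glob pattern
# and search the table with it.
proc naiveMatch {table str} {
    return [lsearch -all -inline -glob $table $str*]
}

set table {apa bepa cepa}

# A well-behaved input works as expected.
puts [naiveMatch $table be]          ;# bepa

# Glob metacharacters in the input change the meaning of the match:
# "*" is not a prefix of any element, yet every element matches.
puts [naiveMatch $table *]           ;# apa bepa cepa

# The proposed command treats the input as a literal string.
puts [::tcl::prefix all $table be]   ;# bepa
puts [::tcl::prefix all $table *]    ;# (empty list)
```

Escaping the input before building the pattern would avoid the problem, but that is exactly the extra, easy-to-forget code the rationale argues script authors should not have to write.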

Another benefit of this command is that its error messages follow the same informative format as those of built-in commands. The proposed command includes an option to control the error behaviour.

Another use of prefix matching is for completion, such as a combobox showing matching alternatives, or tab completion at a prompt.

The former needs a list of matching elements, and the latter needs a longest common prefix.

These functionalities are similar enough to group together but different enough that selecting between them with flags would be a bad idea, so a prefix ensemble seems appropriate for them.

Proposed Change

To support this, I propose adding a ::tcl::prefix ensemble with the following subcommands.

All commands are given a list of possibilities and a string to match.

prefix match returns the matching element in the list or an error. Basically it does what Tcl_GetIndexFromObj does, except that it returns a string instead of an index. The options -exact and -message correspond to the flags and msg arguments to Tcl_GetIndexFromObj. The default value for -message is "option".

The -error options are used when no match is found. If -error is empty, no error is generated and an empty string is returned. Otherwise the options are used as return options when generating the error message. The default corresponds to setting "-level 0". Example: If -error "-errorcode MyError -level 1" is used, an error would be generated as [return -errorcode MyError -level 1 -code error "ErrMsg"].
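The three behaviours of -error described above can be sketched as follows. This is an illustration under the assumption of a Tcl 8.6 interpreter with ::tcl::prefix available; the procedure name myCmd and the error code MyError are made-up names used only for the example.

```tcl
set table {apa bepa cepa}

# Default (equivalent to -level 0): a failed match raises an ordinary
# Tcl error at the point of the call.
if {[catch {::tcl::prefix match $table x} msg]} {
    puts "error: $msg"    ;# error: bad option "x": must be apa, bepa, or cepa
}

# Empty -error: no error is raised; the result is an empty string.
set result [::tcl::prefix match -error {} $table x]
puts "result: '$result'"  ;# result: ''

# Custom return options: with -level 1 the error is reported as coming
# from the procedure that called prefix match, and -errorcode lets the
# caller dispatch on a (hypothetical) error code.
proc myCmd {arg} {
    set table {apa bepa cepa}
    return [::tcl::prefix match -error {-errorcode MyError -level 1} $table $arg]
}

try {
    myCmd x
} trap MyError {msg} {
    puts "trapped: $msg"  ;# trapped: bad option "x": must be apa, bepa, or cepa
}
```

Using -level 1 inside a procedure keeps the internals of the procedure out of the error trace, which is how errors from built-in commands appear to their callers.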

prefix longest returns the longest common prefix within the group that matches the given string. If there is no match, an empty string is returned.

prefix all returns a list of all matching elements.
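As a rough sketch of how prefix longest and prefix all might be combined for tab completion, consider the helper below. The procedure complete is hypothetical and not part of the proposal; it assumes a Tcl 8.6 interpreter with ::tcl::prefix available.

```tcl
# Hypothetical helper: complete a partial word against a list of
# candidates. Returns a two-element list: the text extended as far as
# it is unambiguous, and the candidates that still match.
proc complete {candidates partial} {
    set matches [::tcl::prefix all $candidates $partial]
    if {[llength $matches] == 0} {
        # Nothing matches; leave the input unchanged.
        return [list $partial {}]
    }
    set common [::tcl::prefix longest $candidates $partial]
    return [list $common $matches]
}

set cmds {fblocked fconfigure fcopy file fileevent flush}
puts [complete $cmds fc]    ;# fco {fconfigure fcopy}
puts [complete $cmds fl]    ;# flush flush
puts [complete $cmds x]     ;# x {}
```

A prompt would replace the typed text with the first element and, when more than one candidate remains, display the second element as the list of alternatives.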

Examples
Basic use:

% namespace import ::tcl::prefix
% prefix match {apa bepa cepa} apa
apa
% prefix match {apa bepa cepa} a
apa
% prefix match -exact {apa bepa cepa} a
bad option "a": must be apa, bepa, or cepa
% prefix match -message "switch" {apa ada bepa cepa} a
ambiguous switch "a": must be apa, ada, bepa, or cepa
% prefix longest {fblocked fconfigure fcopy file fileevent flush} fc
fco
% prefix all {fblocked fconfigure fcopy file fileevent flush} fc
fconfigure fcopy

Simplifying option matching:

array set opts {-apa 1 -bepa "" -cepa 0}
foreach {arg val} $args {
    set opts([prefix match {-apa -bepa -cepa} $arg]) $val
}

Switch similar to [105]:

switch -- [prefix match {apa bepa cepa} $arg] {
    apa  { }
    bepa { }
    cepa { }
}

Alternative names

Alternative names for the command that have been suggested are prefix, string prefix, string disambiguate, string prefixmatch, string matchprefix and lsearch -prefix.

Any name based on Tcl_GetIndexFromObj feels wrong since this command does not get any index.

Discussion
Pascal Scheffers wrote:

I am not very fond of new toplevel commands, as they have a habit of breaking existing code, or, more likely, being redefined by existing code. How about string prefix ... or lsearch -prefix? It feels like something that could logically reside inside string handling or list handling.

Donal Fellows wrote:

Hmm, a new matching style for lsearch could work (yet another option for an overly scary command!) especially in conjunction with some of the other options like -inline, but I suspect it'll have to be string prefix or something like that since then we can easily state that the behaviour is not to necessarily return the first matching element. That was what scuppered #105 after all (much to my annoyance.)

Peter Spjuth: TIP was at this point altered to suggest string disambiguate after a suggestion from Donal.

Donal Fellows wrote:

I'd suggest that the -exact option be dropped, as it is trivially implementable using lsearch -exact, dict exists, info exists, switch, etc. It's also corresponding to a very rarely used flag to Tcl_GetIndexFromObj.

Peter Spjuth: Other ways to do -exact don't give the standardized error message, so the option has some value for those who want that functionality. The Core uses TCL_EXACT in a few places where it makes sense, so the flag can be useful.

Voices were raised against the proposed string disambiguate, and Jeff Hobbs pointed out other prefix-related functions that are useful. This led to the prefix ensemble suggestion.

Reference Implementation

Partial implementation at: https://sourceforge.net/support/tracker.php?aid=1040206

Copyright
This document has been placed in the public domain.