Tcl Source Code

Artifact da08e5a9a343a67a6ab1129695559c9e27b0e95a:

Attachment "dict-filter.tip" to ticket [2370575fff] added by lars_h 2008-12-02 20:34:43.
TIP:            
Title:          Multiple dict filter patterns
Version:        
Author:         Lars Hellström <Lars dot Hellstrom at residenset dot net>
State:          Draft
Type:           Project
Vote:           
Tcl-Version:    8.6
Created:        27-Nov-2008
Keywords:       dict filter, set intersection
Post-History:   


~ Abstract

The '''key''' and '''value''' forms of '''dict filter''' are generalised 
to allow an arbitrary number of patterns.


~ Specification

The two '''dict filter''' command forms

 > '''dict filter''' ''dictionary'' '''key''' ''pattern''

 > '''dict filter''' ''dictionary'' '''value''' ''pattern''

are generalised to

 > '''dict filter''' ''dictionary'' '''key''' ?''pattern'' ...?

 > '''dict filter''' ''dictionary'' '''value''' ?''pattern'' ...?

and the result is the sub-dictionary of those entries whose key (or 
value, respectively) matches at least one of the patterns. If no 
patterns are given, no entry matches and the result is the empty 
dictionary.
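
The semantics specified above for the '''key''' form can be sketched at 
script level as follows. The procedure name ''dictFilterKeys'' is 
hypothetical and used for illustration only; it is not part of the proposal 
and says nothing about how the C implementation is organised.

```tcl
# Hypothetical script-level emulation of the proposed
#   dict filter dictionary key ?pattern ...?
# An entry is kept if its key matches at least one pattern.
proc dictFilterKeys {dictionary args} {
    set result [dict create]
    dict for {key value} $dictionary {
        foreach pattern $args {
            if {[string match $pattern $key]} {
                dict set result $key $value
                break   ;# one matching pattern is enough
            }
        }
    }
    return $result
}

set opts {-foo 1 -bar off -baz 42 -qux yes}
puts [dictFilterKeys $opts -foo -ba*]   ;# -foo 1 -bar off -baz 42
puts [dictFilterKeys $opts]             ;# empty dictionary
```

In this sketch the loop over dictionary entries is the outer loop, so the 
keys in the result appear in the same order as in the input dictionary.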


~ Rationale

Although there are '''dict''' subcommands which allow deleting some keys 
from a dictionary ('''dict remove''') and inserting some keys into a 
dictionary ('''dict replace'''), there is no direct way of requesting 
the sub-dictionary which only has keys from a given list; if we think 
of only the set of keys in the dictionary, then we have subcommands for 
set minus and set union, but none for set intersection. 
A situation where this would be useful is that the option dictionary for 
a high-level procedure can contain options meant to be passed on to 
lower level commands, and it is necessary to extract the subdictionary 
of options that the lower level command would accept (since passing one 
which is not supported would cause it to throw an error).

There is of course already the '''dict filter''' command, which indeed 
returns a subdictionary of an existing dictionary, but its '''key''' form 
only accepts one '''string match''' pattern and therefore cannot be used 
to e.g. select all three of -foo, -bar, and -baz (it could select both 
-bar and -baz through the pattern -ba[rz], but that's neither common nor 
particularly readable). However, in many instances where this kind of 
pattern is used (notably '''glob''', '''namespace export''', and 
'''switch'''), it is possible to give several such patterns and have it 
interpreted as the union of the patterns. Were that the case with 
'''dict filter''', the "-foo, -bar, and -baz" problem could be solved as 
easily as

|  dict filter $opts key -foo -bar -baz

which is comparable to

|  dict remove $opts -foo -bar -baz
|  dict replace $opts -foo 1 -bar off -baz 42

and much nicer than the '''script''' counterpart

|  dict filter $opts script {key val} {
|     ::tcl::mathop::in $key {-foo -bar -baz}
|  }

If the '''key''' form is generalised like this, then it seems appropriate 
to also generalise the '''value''' form in the same way to keep the 
symmetry, even though I have no immediate use-case for that feature.

Since it is generally good to Do Nothing Gracefully, the command syntax is 
also generalised to allow the case of no patterns at all.


~ Rejected alternatives

A more direct way of meeting the motivating need would be a command 
'''dict select''' with the same syntax as '''dict remove''' (no pattern 
matching) but logic reversed. This would however be so close to 
'''dict filter''' ... '''key''' that extending the syntax of the latter 
seemed more appropriate.

An alternative to allowing multiple patterns with '''dict filter''' could 
be to allow a regular expression pattern, since the union of two regular 
languages is again a regular language. Any syntax that could be picked for 
that would however on one hand already be rather close to

|  dict filter $opts script {key val} {regexp $RE $key}

and on the other it would be rather difficult to read, as the regular 
expression corresponding to "-foo or -bar or -baz" is

|  ^(-foo|-bar|-baz)$

which it is tempting but incorrect to simplify to "-foo|-bar|-baz".
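
The difference between the two expressions can be seen with the existing 
'''script''' form. The dictionary contents and variable names below are 
made up for illustration. Note also that '''regexp''' needs '''--''' 
here, since an expression beginning with a hyphen would otherwise be taken 
for an option.

```tcl
set anchored   {^(-foo|-bar|-baz)$}
set unanchored {-foo|-bar|-baz}

set opts {-foo 1 -foobar 2 -bar 3}

# Anchored: only the exact keys are selected.
puts [dict filter $opts script {key val} {regexp -- $anchored $key}]
# -foo 1 -bar 3

# Unanchored: -foobar is selected too, since it contains -foo.
puts [dict filter $opts script {key val} {regexp -- $unanchored $key}]
# -foo 1 -foobar 2 -bar 3
```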


~ Implementation Notes

An implementation exists (it is trivial to modify '''dict filter''' 
... '''value''' to work this way: just add an inner loop over the list of 
patterns); see SF patch #2370575.

What might be tricky is the case of '''dict filter''' ... '''key''', since 
this currently has an optimisation for the case of a pattern without glob 
metacharacters that would be very desirable to keep for the motivating 
use-case of selecting specific keys from a dictionary. The natural way to 
do that would be to make the loop over patterns the outer loop and the 
loop over dictionary entries the inner loop, which is only entered if the 
current pattern contains metacharacters. Such an optimisation would 
however have the script-level-visible consequence of having the keys show 
up in the order of the patterns rather than the order of the original 
dictionary, so it may be a good idea to also explicitly specify that 
'''dict filter''' does not guarantee keys in the result to be in the same 
order as in the input dictionary.
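
The loop structure described above can be sketched at script level as 
follows. This is only an illustration of the idea, not the code of the 
patch: the procedure names ''hasGlobChars'' and ''dictFilterKeysByPattern'' 
are hypothetical, and the set of characters treated as glob metacharacters 
(*, ?, [ and backslash) is an assumption based on what '''string match''' 
treats specially.

```tcl
# Hypothetical helper: does the pattern contain any character that
# string match treats specially?
proc hasGlobChars {pattern} {
    foreach ch {* ? [ \\} {
        if {[string first $ch $pattern] >= 0} {
            return 1
        }
    }
    return 0
}

# Hypothetical emulation with the loop over patterns as the outer loop.
proc dictFilterKeysByPattern {dictionary args} {
    set result [dict create]
    foreach pattern $args {
        if {![hasGlobChars $pattern]} {
            # Literal pattern: a single lookup, no scan of the dictionary.
            if {[dict exists $dictionary $pattern]} {
                dict set result $pattern [dict get $dictionary $pattern]
            }
        } else {
            # Glob pattern: scan all entries.
            dict for {key value} $dictionary {
                if {[string match $pattern $key]} {
                    dict set result $key $value
                }
            }
        }
    }
    return $result
}

set D {-baz 0 -bar 1 -foo 2}
puts [dictFilterKeysByPattern $D -foo -bar *]   ;# -foo 2 -bar 1 -baz 0
```

The result lists -foo and -bar first because they are the first patterns, 
although they come last in the input dictionary; this is the 
script-level-visible reordering discussed above.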

Indeed, a '''dict filter''' ... '''key''' that reorders keys according to 
its pattern arguments could sometimes be useful in interactive situations, 
as a way of getting selected keys up front in a dictionary:

|  set D {-baz 0 -bar 1 -foo 2}
|  dict filter $D key -foo -bar *

On the other hand, this effect can mostly be obtained through use of 
'''dict merge''' already:

|  dict merge {-foo x -bar x} $D
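
A small comparison shows where the '''dict merge''' approach agrees with the 
reordering and where it differs. The dictionary ''E'' is made up for 
illustration. When a listed key is missing from the input, '''dict merge''' 
still produces it with the placeholder value, which a filter would not do.

```tcl
set D {-baz 0 -bar 1 -foo 2}
puts [dict merge {-foo x -bar x} $D]   ;# -foo 2 -bar 1 -baz 0

# -foo is absent from E, so the placeholder value x survives.
set E {-baz 0 -bar 1}
puts [dict merge {-foo x -bar x} $E]   ;# -foo x -bar 1 -baz 0
```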


~ Copyright

This document has been placed in the public domain.