Attachment "dict-filter.tip" to
ticket [2370575fff]
added by
lars_h
2008-12-02 20:34:43.
TIP:
Title: Multiple dict filter patterns
Version:
Author: Lars HellstrĀm <Lars dot Hellstrom at residenset dot net>
State: Draft
Type: Project
Vote:
Tcl-Version: 8.6
Created: 27-Nov-2008
Keywords: dict filter, set intersection
Post-History:
~ Abstract
The '''key''' and '''value''' forms of '''dict filter''' are generalised
to allow an arbitrary number of patterns.
~ Specification
The two '''dict filter''' command forms
> '''dict filter''' ''dictionary'' '''key''' ''pattern''
> '''dict filter''' ''dictionary'' '''value''' ''pattern''
are generalised to
> '''dict filter''' ''dictionary'' '''key''' ?''pattern'' ...?
> '''dict filter''' ''dictionary'' '''value''' ?''pattern'' ...?
and the results are the sub-dictionaries of those keys and values
respectively which match at least one of the patterns.
~ Rationale
Although there are '''dict''' subcommands which allow deleting some keys
from a dictionary ('''dict remove''') and inserting some keys into a
dictionary ('''dict replace'''), there is no direct way of requesting
the sub-dictionary which only has keys from a given list; if we think
of only the set of keys in the dictionary, then we have subcommands for
set minus and set union, but none for set intersection.
A situation where this would be useful is that the option dictionary for
a high-level procedure can contain options meant to be passed on to
lower level commands, and it is necessary to extract the subdictionary
of options that the lower level command would accept (since passing one
which is not supported would cause it to throw an error).
There is of course already the '''dict filter''' command, which indeed
returns a subdictionary of an existing dictionary, but its '''key''' form
only accepts one '''string match''' pattern and therefore cannot be used
to e.g. select all three of -foo, -bar, and -baz (it could select both
-bar and -baz through the pattern -ba[rz], but that's neither common nor
particularly readable). However, in many instances where this kind of
pattern is used (notably '''glob''', '''namespace export''', and
'''switch'''), it is possible to give several such patterns and have it
interpreted as the union of the patterns. Were that the case with
'''dict filter''', the "-foo, -bar, and -baz" problem could be solved as
easily as
| dict filter $opts key -foo -bar -baz
which is comparable to
| dict remove $opts -foo -bar -baz
| dict replace $opts -foo 1 -bar off -baz 42
and much nicer than the '''script''' counterpart
| dict filter $opts script {key val} {
| ::tcl::mathop::in $key {-foo -bar -baz}
| }
If the '''key''' form is generalised like this, then it seems appropriate
to also generalise the '''value''' form in the same way to keep the
symmetry, even though I have no immediate use-case for that feature.
Since it is generally good to Do Nothing Gracefully, the command syntax is
also generalised to allow the case of no patterns at all.
~ Rejected alternatives
A more direct way of meeting the motivating need would be a command
'''dict select''' with the same syntax as '''dict remove''' (no pattern
matching) but logic reversed. This would however be so close to
'''dict filter''' ... '''key''' that extending the syntax of the latter
seemed more appropriate.
An alternative to allowing multiple patterns with '''dict filter''' could
be to allow a regular expression pattern, since the union of two regular
languages is again a regular language. Any syntax that could be picked for
that would however on one hand already be rather close to
| dict filter $opts script {key val} {regexp $RE $key}
and on the other it would be rather difficult to read, as the regular
expression corresponding to "-foo or -bar or -baz" is
| ^(-foo|-bar|-baz)$
which it is tempting but incorrect to simplify to "-foo|-bar|-baz".
~ Implementation Notes
An implementation exists (it's a very trivial to modify '''dict filter'''
... '''value''' to work this way: just add an inner loop over the list of
patterns); see SF path #2370575.
What might be tricky is the case of '''dict filter''' ... '''key''', since
this currently has an optimisation for the case of a pattern without glob
metacharacters that would be very desirable to keep for the motivating
use-case of selecting specific keys from a dictionary. The natural way to
do that would be to make the loop over patterns the outer loop and the
loop over dictionary entries the inner loop, which is only entered if the
current pattern contains metacharacters. Such an optimisation would
however have the script-level-visible consequence of having the keys show
up in the order of the patterns rather than the order of the original
dictionary, so it may be a good idea to also explicitly specify that
'''dict filter''' does not guarantee keys in the result to be in the same
order as in the input dictionary.
Indeed, a '''dict filter''' ... '''key''' that reorders keys according to
its pattern arguments could sometimes be useful in interactive situations,
as a way of getting selected keys up from in a dictionary:
| set D {-baz 0 -bar 1 -foo 2}
| dict filter $D key -foo -bar *
On the other hand, this effect can mostly be obtained through use of
'''dict merge''' already:
| dict merge {-foo x -bar x} $D
~ Copyright
This document has been placed in the public domain.