Tcl Source Code

Ticket UUID: d2323d6c284a5081e818dc02f0bbe695a7aa1c61
Title: add parameter ?lastIndex? to [string first]
Type: Patch Version: 8.6.4
Submitter: anonymous Created on: 2016-02-25 14:33:02
Subsystem: 18. Commands M-Z Assigned To: nobody
Priority: 5 Medium Severity: Minor
Status: Open Last Modified: 2016-03-08 07:47:49
Resolution: None Closed By: nobody
    Closed on:
Description:
The dubious implementation of tcllib's ::textutil::trim::trimPrefix caught my eye. Why scan the full string when only the start of it matters?

% set haystack [string repeat abc 1000]d
% set needle abcd
% string first $needle $haystack
2997
% time {string first $needle $haystack} 1000
10.149 microseconds per iteration
% time {string first $needle $haystack 0 0} 1000
0.712 microseconds per iteration
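
For comparison, a prefix test can also be written in stock Tcl without the proposed ?lastIndex? argument. The sketch below is not taken from the patch or from tcllib: the proc name hasPrefix is hypothetical, and it relies only on the documented -length option of [string equal]. No timing is claimed for it.

```tcl
# Hypothetical helper (not part of the patch or of tcllib).
# Returns 1 if haystack starts with needle, else 0.
# Only the first [string length $needle] characters are compared,
# so the rest of haystack is never scanned.
proc hasPrefix {needle haystack} {
    return [string equal -length [string length $needle] $needle $haystack]
}

set haystack [string repeat abc 1000]d
puts [hasPrefix abca $haystack]
puts [hasPrefix abcd $haystack]
```

With the haystack from the transcript above, the first call reports a match and the second does not, because "abcd" only occurs at the very end.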

I cannot attach my patch file in this form; I will attach or post it after creating this ticket. The patch covers the implementation and the tests, not the documentation.
User Comments: anonymous (claiming to be heinrichmartin) added on 2016-03-08 07:47:49:
> to compose [string first] with [string range]

This was my first assumption when looking at the implementation of trimPrefix. Then I thought that someone might have chosen _not_ to create a copy of the prefix the size of the needle. And [string first $needle [string range $haystack 0 [string length $needle]-1]] is somewhat lengthy.

% time {string first $needle $haystack} 1000
13.128 microseconds per iteration
% time {string first $needle [string range $haystack 0 [string length $needle]-1]} 1000
1.435 microseconds per iteration
% # measured on a different machine than the figures above; assume this is roughly 50% to 100% slower than [string first $needle $haystack 0 0];
% # also note that trimPrefix still has to do a string comparison afterwards (not just the integer comparison that suffices after [string first]) ...
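
The composition of [string first] with [string range] can be wrapped in a small proc that approximates the proposed extra argument. This is only a sketch under assumptions: the name firstWithin is hypothetical, and the reading of the last argument as "the last index at which a match may start" is inferred from the `0 0` example in the description, not from the patch itself. It accepts plain integer indices only and, unlike the patch, still copies the search window.

```tcl
# Hypothetical emulation of a bounded [string first] in stock Tcl.
# Returns the index of the first occurrence of needle in haystack that
# starts at an index in [start, lastStart], or -1 if there is none.
# ASSUMPTION: lastStart is the last allowed start position of a match.
proc firstWithin {needle haystack {start 0} {lastStart end}} {
    if {$start < 0} { set start 0 }
    if {$lastStart eq "end"} {
        # No upper bound: same as the existing three-argument form.
        return [string first $needle $haystack $start]
    }
    # A match that starts at lastStart ends at lastStart + n - 1,
    # so that is the last character the window has to contain.
    set n [string length $needle]
    set window [string range $haystack $start [expr {$lastStart + $n - 1}]]
    set idx [string first $needle $window]
    # Translate the window-relative index back to a haystack index.
    return [expr {$idx < 0 ? -1 : $idx + $start}]
}

set haystack [string repeat abc 1000]d
puts [firstWithin abcd $haystack]
puts [firstWithin abcd $haystack 0 0]
```

The first call behaves like the unbounded search from the transcript; the second only looks at the first four characters and therefore finds no match.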

In the end, I looked at the C code and found that it was an easy task to add this feature. It is an extension that might or might not be interesting for the community. I package our custom version from pristine sources and can therefore easily provide the patch as-is. I added tests, but no doc.

OT: I also considered another patch to allow multiple + and - in the index, e.g. consider the postfix scenario [string range $haystack end-[string length $needle]+1 end].
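
Without such an index extension, the same suffix check can be sketched in stock Tcl by doing the arithmetic in [expr] first. The proc name hasSuffix is hypothetical and is not part of any patch mentioned here; the empty needle is special-cased so that the index expression never has to deal with a negative offset.

```tcl
# Hypothetical helper: returns 1 if haystack ends with needle, else 0.
proc hasSuffix {needle haystack} {
    set n [string length $needle]
    if {$n == 0} { return 1 }
    # end-(n-1) is the index of the first character of the last n characters;
    # an index before the start of the string is treated as 0 by [string range].
    set tail [string range $haystack end-[expr {$n - 1}] end]
    return [string equal $needle $tail]
}

puts [hasSuffix bcd abcabcd]
puts [hasSuffix abc abcabcd]
```

Like the [string range] composition for the prefix case, this copies the last n characters of the haystack before comparing.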

dgp added on 2016-03-07 18:22:48:
I did not (yet) look at the patch, but if I understand the
motivating example of limiting a [string first] search to
a subrange of a string, why wouldn't the right answer
be to compose [string first] with [string range] ?

Attachments: