Tcl Source Code

Ticket UUID: 1280817
Title: regexp hangs on some valid REs
Type: Bug
Version: None
Submitter: nobody
Created on: 2005-09-02 16:50:09
Subsystem: 43. Regexp
Assigned To: dgp
Priority: 5 Medium
Severity: Important
Status: Closed
Last Modified: 2014-07-23 14:50:13
Resolution: Out of Date
Closed By: dgp
Closed on: 2014-07-23 14:50:13
Description:
Certain valid regular expressions cause [regexp]
to hang, apparently in the expression-compilation phase.
The following one-line command hangs Tcl, with memory
requirements increasing apparently without limit:

  regexp -about {(\m.*\M)*}

If the final * quantifier is removed, the bug does 
not occur.

If the grouping parentheses are removed, the bug does
not occur.

If the grouping is made noncapturing with ?:, the
bug still occurs.

Observed with Tcl 8.4.9.1 and also 8.3.4, on Win2000.

Submitted by Jonathan Bromley.
User Comments:

dgp added on 2014-07-23 14:50:13:
Turns out this is a dup of bugs 1810038 and 1810264
(or rather, they duped this one!).  Both were fixed long
ago, in time for the 8.5.0 release.

dgp added on 2013-10-17 19:05:01:
Possibly connected to 8f245009b0

nobody added on 2005-12-08 21:03:12:
Note:

% regexp -about {(\m\M)}
1 {REG_UNONPOSIX REG_ULOCALE REG_UIMPOSSIBLE}

If the REG_UIMPOSSIBLE flag is genuine (and usable), then perhaps
nfatree() in regcomp.c should 'accumulate' its return value
from nfanode() so that it retains any REG_UIMPOSSIBLE flag
reported for a subtree. Something like:

int ret = 0;
if (t->left != NULL) {
  ret |= nfatree(v, t->left, f);
}
if (t->right != NULL) {
  ret |= nfatree(v, t->right, f);
}
ret |= nfanode(v, t, f);
return ret;


Then compile() can check the return value:

/* build compacted NFAs for tree and lacons */
re->re_info |= nfatree(v, v->tree, debug);
if (re->re_info & REG_UIMPOSSIBLE) {
  /* throw an error */
}
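
The idea in the two fragments above can be tried in isolation, away
from regcomp.c. What follows is only a sketch on a toy tree: the
struct, the toy_* function names and the TOY_* flag values are all
made up for illustration and are not the real Tcl definitions. It
shows a flag set on one leaf surviving up to the root when the
results are OR-ed together, and a caller-side check on the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; NOT the real REG_U* values from Tcl. */
#define TOY_UNONPOSIX   0x1L
#define TOY_UIMPOSSIBLE 0x2L

/* Toy stand-in for a subexpression-tree node (hypothetical). */
struct toy_node {
    struct toy_node *left;
    struct toy_node *right;
    long flags;                 /* what the per-node pass reports */
};

/* Stand-in for the per-node pass: reports the node's own flags. */
static long toy_node_pass(const struct toy_node *t)
{
    return t->flags;
}

/* Accumulating tree pass: OR the results of both subtrees and of
 * the node itself, so a flag raised anywhere reaches the caller. */
static long toy_tree_pass(const struct toy_node *t)
{
    long ret = 0;

    assert(t != NULL);
    if (t->left != NULL) {
        ret |= toy_tree_pass(t->left);
    }
    if (t->right != NULL) {
        ret |= toy_tree_pass(t->right);
    }
    ret |= toy_node_pass(t);
    return ret;
}

/* Caller-side check: nonzero if the accumulated info says the
 * expression can never match. */
static int toy_is_impossible(long info)
{
    return (info & TOY_UIMPOSSIBLE) != 0;
}
```

With a version that returns only the root node's result, a flag set
on a child would be lost. Whether the real nfanode() reports
REG_UIMPOSSIBLE reliably per node is exactly the open question here.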


More investigation needed :-)