Tcl Source Code

View Ticket
Login
Ticket UUID: 1810038
Title: DoS via infinite loop in regex NFA optimization
Type: Bug Version: obsolete: 8.5b1
Submitter: mmaslano Created on: 2007-10-09 09:24:13
Subsystem: 43. Regexp Assigned To: dgp
Priority: 9 Immediate Severity:
Status: Closed Last Modified: 2007-12-02 03:49:57
Resolution: Fixed Closed By: dgp
    Closed on: 2007-11-15 22:01:36
Description:
Occure in version: 8.0 and higher, I try it also in devel version 8.5b1 and the problem is still here.

Reproducer:
TCL version:
  regexp '($|^X)*' 'foo'

The problem is in regular expressions code, which results in a non-terminating infinite loop during nondeterministic finite automata optimization. These loops remain even after the client disconnects and can be used to hog CPU and connections to the database until a full-fledged denial of service condition occurs.

I try to attach patch soon, but at the moment I've no idea, where in code is the problem.
User Comments: dgp added on 2007-11-16 05:01:36:
Logged In: YES 
user_id=80530
Originator: NO


Fix committed to both branches.

dgp added on 2007-11-16 04:32:24:

File Added - 254493: 1810038.patch

Logged In: YES 
user_id=80530
Originator: NO


Attached patch (for HEAD) is a fix.
File Added: 1810038.patch

dgp added on 2007-11-09 01:05:28:
Logged In: YES 
user_id=80530
Originator: NO


Possibly useful observations...

Changing the parens from capturing to
non-capturing stops the infinite loop
in the simplified test...

% regexp (?:$|^)* {}
1

...but not in the original test...

% regexp (?:$|^X)* {}
<hangs>

dgp added on 2007-11-07 03:01:16:
Logged In: YES 
user_id=80530
Originator: NO


Slightly simpler test that still
demos the problem:

% regexp ($|^)* {}

dgp added on 2007-11-07 00:42:01:
Logged In: YES 
user_id=80530
Originator: NO


Enabling the dumpnfa(), here's the
first few iterations:

Original:

pre 1, post 0, bos [2], bol [3], eos [4], eol [5]
0@      no out arcs
1>      ^0->4   ^1->4   [1]->4  [0]->4
4.      $0->0   $1->0   [1]->0  [0]->0  ^1->10
        $1->4
10.     [1]->4

After first loop:

pre 1, post 0, bos [2], bol [3], eos [4], eol [5]
0@      no out arcs
1>      ^0->4   ^1->4   [1]->4  [0]->4  ^1->10
4.      $0->0   $1->0   [1]->0  [0]->0  $1->4
        ^1->13
10.     [1]->4
13.     $1->10

After second loop:

0@      no out arcs
1>      ^0->4   ^1->4   [1]->4  [0]->4  ^1->10
        ^1->13
4.      $0->0   $1->0   [1]->0  [0]->0  $1->4
        ^1->15
10.     [1]->4
13.     $1->10
15.     $1->13

...and the pattern continues.

dgp added on 2007-11-06 23:35:21:
Logged In: YES 
user_id=80530
Originator: NO


Things get stuck in the do {...} while ()
loop inside the pullback() routine at
about line 750 in regc_nfa.c .

The stuck loop starts with nfa->nstates == 12
and each iteration adds 2 to nfa->nstates,
and sets progress to 1.

dgp added on 2007-11-06 04:29:42:
Logged In: YES 
user_id=80530
Originator: NO


Just to correct the report, this
bug does not appear in Tcl 8.0.5:

% info patch
8.0.5
% regexp ($|^X)* foo
couldn't compile regular expression pattern: *+ operand could be empty

The reported bug arrived in Tcl 8.1.0
with the new regexp engine.

Attachments: