Ticket UUID: | 1810038 | |||
Title: | DoS via infinite loop in regex NFA optimization | |||
Type: | Bug | Version: | obsolete: 8.5b1 | |
Submitter: | mmaslano | Created on: | 2007-10-09 09:24:13 | |
Subsystem: | 43. Regexp | Assigned To: | dgp | |
Priority: | 9 Immediate | Severity: | ||
Status: | Closed | Last Modified: | 2007-12-02 03:49:57 | |
Resolution: | Fixed | Closed By: | dgp | |
Closed on: | 2007-11-15 22:01:36 | |||
Description: |
Occure in version: 8.0 and higher, I try it also in devel version 8.5b1 and the problem is still here. Reproducer: TCL version: regexp '($|^X)*' 'foo' The problem is in regular expressions code, which results in a non-terminating infinite loop during nondeterministic finite automata optimization. These loops remain even after the client disconnects and can be used to hog CPU and connections to the database until a full-fledged denial of service condition occurs. I try to attach patch soon, but at the moment I've no idea, where in code is the problem. | |||
User Comments: |
dgp added on 2007-11-16 05:01:36:
Logged In: YES user_id=80530 Originator: NO Fix committed to both branches. dgp added on 2007-11-16 04:32:24: File Added - 254493: 1810038.patch Logged In: YES user_id=80530 Originator: NO Attached patch (for HEAD) is a fix. File Added: 1810038.patch dgp added on 2007-11-09 01:05:28: Logged In: YES user_id=80530 Originator: NO Possibly useful observations... Changing the parens from capturing to non-capturing stops the infinite loop in the simplified test... % regexp (?:$|^)* {} 1 ...but not in the original test... % regexp (?:$|^X)* {} <hangs> dgp added on 2007-11-07 03:01:16: Logged In: YES user_id=80530 Originator: NO Slightly simpler test that still demos the problem: % regexp ($|^)* {} dgp added on 2007-11-07 00:42:01: Logged In: YES user_id=80530 Originator: NO Enabling the dumpnfa(), here's the first few iterations: Original: pre 1, post 0, bos [2], bol [3], eos [4], eol [5] 0@ no out arcs 1> ^0->4 ^1->4 [1]->4 [0]->4 4. $0->0 $1->0 [1]->0 [0]->0 ^1->10 $1->4 10. [1]->4 After first loop: pre 1, post 0, bos [2], bol [3], eos [4], eol [5] 0@ no out arcs 1> ^0->4 ^1->4 [1]->4 [0]->4 ^1->10 4. $0->0 $1->0 [1]->0 [0]->0 $1->4 ^1->13 10. [1]->4 13. $1->10 After second loop: 0@ no out arcs 1> ^0->4 ^1->4 [1]->4 [0]->4 ^1->10 ^1->13 4. $0->0 $1->0 [1]->0 [0]->0 $1->4 ^1->15 10. [1]->4 13. $1->10 15. $1->13 ...and the pattern continues. dgp added on 2007-11-06 23:35:21: Logged In: YES user_id=80530 Originator: NO Things get stuck in the do {...} while () loop inside the pullback() routine at about line 750 in regc_nfa.c . The stuck loop starts with nfa->nstates == 12 and each iteration adds 2 to nfa->nstates, and sets progress to 1. dgp added on 2007-11-06 04:29:42: Logged In: YES user_id=80530 Originator: NO Just to correct the report, this bug does not appear in Tcl 8.0.5: % info patch 8.0.5 % regexp ($|^X)* foo couldn't compile regular expression pattern: *+ operand could be empty The reported bug arrived in Tcl 8.1.0 with the new regexp engine. |
Attachments:
- 1810038.patch [download] added by dgp on 2007-11-16 04:32:23. [details]