ReDoS Explained: Catastrophic Backtracking, Vulnerable Patterns & Safe Regex (2026)
Regular Expression Denial of Service (ReDoS) is one of the few vulnerability classes where a single short string — often under 50 bytes — can pin a CPU core at 100% for seconds, minutes, or effectively forever. The root cause is not a bug in the regex engine itself but a property of how most backtracking engines evaluate certain patterns: under specific input shapes, the number of paths the engine explores grows exponentially with the input length. A pattern that matches a benign string in microseconds can take longer than the age of the universe on a crafted 30-character string.
ReDoS matters because vulnerable patterns hide everywhere: input validators (email, URL, phone), log parsers, WAF rules, search highlighters, Markdown renderers, and dependency code you never wrote. Unlike injection bugs, ReDoS needs no authentication bypass and no special privileges — it just needs a request parameter that flows into a fragile regex. This guide covers the engine internals, how to find and confirm vulnerable patterns, how to build payloads, and how to rewrite regex to be provably linear. Everything here assumes you are testing systems you are authorized to assess.
How Catastrophic Backtracking Works
Most mainstream engines — PCRE, Java's java.util.regex, Python's re, JavaScript's RegExp, .NET, Ruby — use a backtracking NFA. When a quantifier like * or + can match in more than one way, the engine remembers each choice point and, on failure, rewinds to try the next combination. For most patterns this is fine. The problem appears when two quantifiers overlap on the same characters so that the engine has to try every possible way to partition the input before it can declare a non-match.
The canonical example is "nested quantifiers" — a repeated group that itself repeats:
^(a+)+$ # nested quantifier
^(a|a)+$ # alternation with overlapping branches
^(\w+\s?)*$ # ambiguous group, very common in real code
^(.*a){20}$ # repetition of a greedy match
Feed ^(a+)+$ the string "aaaaaaaaaaaaaaaaaaaaaaaaX". The trailing X guarantees the overall match fails. But before failing, the engine must try every way to split the run of as between the inner a+ and the outer (...)+. There are 2^n such partitions for n characters, so the work doubles with every extra a. At ~30 characters you are already into seconds; at ~40 you have a hung worker. This is exponential ReDoS.
A milder but still dangerous class is polynomial backtracking, where two adjacent greedy tokens can consume the same characters, e.g. \d+\d+$ or .*.*=.*. Cost grows as O(n²) or O(n³) — not exponential, but a 100KB input against an O(n²) pattern is still ten billion operations.
Finding Vulnerable Patterns
Start by hunting the regex itself, then reason about whether attacker input can reach it. Source review is the highest-signal approach. Grep the codebase for risky shapes — nested quantifiers, overlapping alternations, and greedy tokens before optional separators:
# Find candidate ReDoS patterns in source
grep -rnE '\(\S*[+*]\S*\)[+*]' --include='*.js' --include='*.ts' .
grep -rnE '\([^)]*\|[^)]*\)[+*]' --include='*.py' .
grep -rnE '\\w[+*]\\s[?*][)]?[+*]' .
# Node.js projects: scan dependencies for known-bad regexes
npx eslint --no-eslintrc --rule '{"redos":"error"}' \
--plugin redos src/ # eslint-plugin-redos / eslint-plugin-security
For a quick mental checklist, a pattern is suspicious when it contains any of:
- A quantified group whose body is also quantified:
(X+)+,(X*)*,(X+)*. - Alternation where branches can match the same string, then quantified:
(a|ab)+,(\d|\d\d)+. - Two adjacent open-ended tokens matching overlapping character classes:
\w+\w+,.*.*,[\s\S]*[\s\S]*. - A greedy
.*or\s*followed by an optional element and then an anchor or required literal.
You can validate a candidate quickly in a sandbox before ever touching the target. The Regex Tester lets you paste a pattern and watch match behaviour and capture groups in real time, which is a fast way to eyeball whether two quantifiers overlap. For a curated list of dangerous constructs and their safe rewrites, the regex security cheatsheet is a useful reference while you triage.
Static Analysis with Specialized Tools
Manual review scales poorly, so lean on tooling that models the engine. recheck (JS/TS) and rxxr2 perform automata analysis and report exponential or polynomial complexity with a witness string. The Node ecosystem ships redos-detector and the RegExp Static Analyzer behind eslint-plugin-redos:
# Analyze a single pattern with recheck (returns vulnerable + attack string)
npx recheck check '^(a+)+$' ''
# => status: vulnerable, complexity: exponential, attack: "aaaa...\x00"
# Python: detect with the 'dangerous' regex linter
pip install regex-redos && python -m regex_redos "(\w+\s?)*"
Crafting ReDoS Payloads
A working payload has three parts: a prefix that the pattern accepts and starts matching, a pump string repeated to inflate the search space, and a suffix that forces the match to fail so the engine backtracks through every combination. The failing suffix is what turns a fast match into a hang — a string that matches cleanly returns immediately.
# Pattern: ^(a+)+$ (exponential)
# pump = "a", suffix = "!" (any char outside [a])
python3 -c 'print("a"*40 + "!")' # ~seconds to minutes
# Pattern: ^(\d+)*$ (exponential)
python3 -c 'print("1"*45 + "x")'
# Pattern: ([a-zA-Z]+)*$ on a long alpha run
python3 -c 'print("z"*38 + "1")'
# Polynomial: .*foo.*bar -> long string lacking 'bar'
python3 -c 'print("foo " * 5000 + "no match")'
To confirm impact safely, measure timing locally against the exact engine and pattern before sending anything to a live system. Bisect the input length to find where time blows up — this both proves the vulnerability and avoids accidentally hard-hanging a shared service:
for n in 20 24 28 32 36; do
payload=$(python3 -c "print('a'*$n + '!')")
/usr/bin/time -f "n=$n %es" node -e \
'const s=process.argv[1];/^(a+)+$/.test(s)' "$payload" 2>&1
done
# Look for the doubling: each +4 chars ~ 16x the time.
When the regex runs server-side on a request field, delivery is just normal traffic — a query string, JSON body, header, or filename — but you should rate-limit yourself to one probe and watch response latency rather than firing a flood. A single request that takes 8 seconds when the baseline is 40ms is sufficient evidence; you do not need to actually exhaust the target's worker pool to demonstrate the finding in a report.
Where ReDoS Hides in Real Apps
The most common sinks in web applications:
- Input validation. Hand-rolled email and URL validators are notorious. The widely copied "RFC-ish" email regex with
(\.[a-zA-Z0-9-]+)*segments is exponential on long subdomains. - User-Agent and header parsing. Analytics and bot-detection libraries parse
User-Agentwith greedy patterns; an attacker fully controls that header. - Markdown / BBCode / template rendering. Link and emphasis matching often uses nested quantifiers across the whole document body.
- Search and highlight features. Patterns built from user search terms, or that scan large result text.
- Log ingestion and SIEM rules. A malicious log line can ReDoS the parser that is supposed to be watching for attacks.
- WAF rules themselves. Ironically, a poorly written WAF signature can be turned into a DoS against the WAF; if you study evasion you will see how brittle some vendor rules are in the WAF Evasion Studio.
Remediation and Defenses
ReDoS is fixable for good — the goal is to make every regex run in time linear to its input. The defenses, roughly in order of preference:
1. Rewrite the pattern to remove ambiguity. Eliminate nested quantifiers and overlapping alternation. Make the inner and outer repetitions match disjoint characters so there is exactly one way to partition any input:
# Vulnerable Safe rewrite
^(a+)+$ -> ^a+$
^(\w+\s?)*$ -> ^(?:\w+\s)*\w*$ # separator is required between repeats
(.*)* -> .*
^(\d+)*$ -> ^\d*$
2. Use atomic groups and possessive quantifiers. In PCRE, Java, Ruby, and .NET, atomic groups (?>...) and possessive quantifiers a++, \w*+ tell the engine "once you match this, never give it back." That removes the backtracking points entirely: ^(?>a+)+$ and ^a++$ cannot blow up.
3. Anchor and bound everything. Add ^ and $ so the engine fails fast, and cap repetition with explicit bounds — {1,64} instead of + — so the worst case is finite and small.
4. Switch to a non-backtracking engine. Google's RE2 (and the re2 bindings for Node, Python, Ruby) guarantees linear-time matching by using a real DFA/NFA simulation. It cannot ReDoS by construction. The trade-off is no backreferences or lookaround. For untrusted input this is usually the right call:
// Node.js: swap RegExp for RE2 on any pattern touching user input
const RE2 = require('re2');
const re = new RE2('^(a+)+$'); // safe — RE2 runs in O(n), no backtracking
re.test('a'.repeat(100000) + '!'); // returns instantly
5. Add a timeout / budget. Defense in depth for code you cannot rewrite. .NET supports new Regex(pattern, options, TimeSpan.FromMilliseconds(100)). In Node, run regex in a worker with a kill timer; in Java 9+ no native timeout exists, so isolate untrusted matching in an interruptible thread.
6. Validate input before it reaches the regex. Cap field length aggressively (a 64KB header has no business hitting an O(n²) parser) and reject obviously hostile shapes early.
Bake detection into CI: run recheck or eslint-plugin-redos on every commit and audit dependencies, since most real-world ReDoS lands through a transitive package, not first-party code. Treat any regex that touches attacker-controlled input as a security boundary. If you cannot prove it is linear, either rewrite it or move it to RE2.
Level up your security testing
Install the CLI
npx payload-playgroundExplore All Tools
Encoding, hashing, JWT & more
Browse Cheat Sheets
Quick-reference payload guides