URL Encoding Explained: Special Characters and How to Handle Them
Every character in a URL has a meaning. Spaces, ampersands, question marks, and non-ASCII characters must be encoded or they'll break your links. This guide explains how URL encoding works and when to use it.
Why URLs Need Encoding
URLs were designed to be read by computers, not humans. A URL like https://example.com/search?q=hello world contains a space, which is technically illegal in URLs. Browsers display it for convenience, but the actual encoded URL is https://example.com/search?q=hello%20world.
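The %20 is URL encoding, also called percent-encoding. Every character outside the "safe" set gets replaced with %XX, where XX is the hexadecimal value of the character's byte. For ASCII characters that is the ASCII code (a space is 0x20). Non-ASCII characters are first converted to UTF-8, and each resulting byte becomes its own %XX.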
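As a small illustration of the %XX rule, here is a sketch using Python's standard urllib.parse module. The variable names are only for this example.

```python
from urllib.parse import quote, unquote

raw = "hello world"

# safe="" asks quote() to encode everything outside the unreserved set
encoded = quote(raw, safe="")
print(encoded)                 # hello%20world

# The two hex digits after % are the byte value of the character
print(f"{ord(' '):02X}")       # 20

# Decoding reverses the process
print(unquote(encoded))        # hello world
```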
The Safe Character Set
These characters don't need encoding in URLs:
A-Z a-z 0-9 - _ . ~
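Everything else (spaces, punctuation, non-ASCII characters, special symbols) must be percent-encoded when it appears as data. Reserved characters such as /, ?, &, and = stay unencoded only where they act as URL delimiters.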
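To make the rule concrete, the following is a hand-written sketch of a percent-encoder that keeps only the safe set listed above. The names UNRESERVED and percent_encode are hypothetical, chosen for this example. In real code a library function such as urllib.parse.quote is usually the better choice.

```python
import string

# The safe set: A-Z a-z 0-9 - _ . ~
UNRESERVED = set(string.ascii_letters + string.digits + "-_.~")

def percent_encode(text: str) -> str:
    """Percent-encode every byte of text that is not in the safe set."""
    out = []
    for byte in text.encode("utf-8"):   # work on UTF-8 bytes, not characters
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)              # safe characters pass through unchanged
        else:
            out.append(f"%{byte:02X}")  # everything else becomes %XX
    return "".join(out)

print(percent_encode("a b&c~"))   # a%20b%26c~
print(percent_encode("é"))        # %C3%A9
```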
Common Encodings You Should Know
space → %20    ! → %21    " → %22    # → %23
$ → %24    % → %25    & → %26    ' → %27    ( → %28
) → %29    * → %2A    + → %2B    , → %2C    / → %2F
: → %3A    ; → %3B    = → %3D    ? → %3F    @ → %40
[ → %5B    ] → %5D    { → %7B    } → %7D    | → %7C
Query Parameters vs Path Segments
Different parts of a URL have different encoding rules:
- Path segments (e.g., /blog/my post): encode everything except unreserved characters
- Query string keys and values (e.g., ?q=hello): encode using application/x-www-form-urlencoded
- Query string separators: ?, &, and = are reserved and should not be encoded when they're serving as separators
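The difference between the parts can be sketched in Python as follows. The host, path, and parameter names are placeholders for this example. Note that urlencode() follows form encoding and writes a space as +. Passing quote_via=quote makes it emit %20 instead.

```python
from urllib.parse import quote, urlencode

# Path segment: encode everything except unreserved characters
segment = quote("my post", safe="")
print(segment)    # my%20post

# Query string: urlencode() encodes keys and values and inserts
# the & and = separators itself, so they stay unencoded
query = urlencode({"q": "hello world", "tag": "a&b"})
print(query)      # q=hello+world&tag=a%26b

url = f"https://example.com/blog/{segment}?{query}"
print(url)        # https://example.com/blog/my%20post?q=hello+world&tag=a%26b

# Same query with %20 for spaces instead of +
query_pct = urlencode({"q": "hello world"}, quote_via=quote)
print(query_pct)  # q=hello%20world
```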
Common Mistakes to Avoid
Mistake 1: Double Encoding
If a parameter value is already percent-encoded and you encode it again, the % itself becomes %25 and you get double encoding:
Raw value: "hello world"
Wrong (encoded twice): "hello%2520world"
Right (encoded once): "hello%20world"
Use your framework's built-in URL encoding functions rather than manual string replacement.
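A short sketch of how double encoding arises, using Python's urllib.parse. The variable names are illustrative only.

```python
from urllib.parse import quote, unquote

raw = "hello world"

once = quote(raw, safe="")     # encode the raw value one time
twice = quote(once, safe="")   # encoding the result again mangles the %

print(once)    # hello%20world
print(twice)   # hello%2520world

# A server that decodes once now sees the literal text "hello%20world"
print(unquote(twice))   # hello%20world
```

Encoding each raw value exactly once, at the point where the URL is assembled, avoids the problem.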
Mistake 2: Encoding the Entire URL
Never encode a full URL in one pass. Encode only the dynamic parts (query values, path segments with user content). The protocol, host, and structural characters such as ://, /, ?, and & must remain unencoded.
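The contrast can be seen in a small Python sketch. The base URL and search term are made up for this example.

```python
from urllib.parse import quote

base = "https://example.com/search"
term = "cats & dogs"

# Wrong: encoding the whole URL also encodes ://, /, ? and =
broken = quote(f"{base}?q={term}", safe="")
print(broken)
# https%3A%2F%2Fexample.com%2Fsearch%3Fq%3Dcats%20%26%20dogs

# Right: encode only the dynamic value, leave the structure alone
correct = f"{base}?q={quote(term, safe='')}"
print(correct)
# https://example.com/search?q=cats%20%26%20dogs
```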
Mistake 3: Forgetting Non-ASCII Characters
Chinese characters, emoji, accented letters: all must be UTF-8 encoded, then percent-encoded:
"你好" → UTF-8 bytes → %E4%BD%A0%E5%A5%BD
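JavaScript's encodeURIComponent() handles this correctly. encodeURI() also percent-encodes non-ASCII characters, but it leaves reserved characters such as &, =, ?, / and # untouched, so it is not suitable for encoding individual values.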
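The two steps (UTF-8 first, then one %XX per byte) can be followed in Python. This is a minimal sketch and the variable names are only for illustration.

```python
from urllib.parse import quote, unquote

text = "你好"

# Step 1: the characters become UTF-8 bytes
utf8_bytes = text.encode("utf-8")
print(utf8_bytes.hex(" ").upper())   # E4 BD A0 E5 A5 BD

# Step 2: each byte becomes %XX (quote() does both steps at once)
encoded = quote(text, safe="")
print(encoded)                       # %E4%BD%A0%E5%A5%BD

# Accented letters and emoji follow the same rule
print(quote("é", safe=""))           # %C3%A9
print(quote("😀", safe=""))          # %F0%9F%98%80

print(unquote(encoded))              # 你好
```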
URL Encoding in Different Languages
JavaScript: encodeURIComponent(str)        // for query values and path segments
            encodeURI(str)                 // for full URLs
Python:     urllib.parse.quote(s)          # RFC 3986; leaves "/" unencoded unless you pass safe=""
            urllib.parse.urlencode(dict)   # query string
Node.js:    encodeURIComponent(str)        // same as JS
            qs.stringify(obj)              // for query objects (qs npm package)
Base64URL: URL-Safe Base64
Standard Base64 uses + and /, which are unsafe in URLs. Base64URL replaces these:
Base64:     + → +    / → /    = (padding)
Base64URL:  + → -    / → _    = (removed)
Used in JWT tokens and URL-safe data transmission.
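As a sketch of the substitution, Python's base64 module provides urlsafe_b64encode and urlsafe_b64decode. The helper names b64url_encode and b64url_decode below are hypothetical wrappers written for this example. They strip the padding on encode and restore it on decode.

```python
import base64

def b64url_encode(data: bytes) -> str:
    """Base64URL-encode data and drop the trailing = padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Restore the padding, then decode Base64URL text."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded)

data = bytes([0xFB, 0xFF])   # chosen so the output contains + and /

print(base64.b64encode(data).decode("ascii"))   # +/8=
print(b64url_encode(data))                      # -_8
print(b64url_decode("-_8") == data)             # True
```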
Try It Yourself
Use our URL Encoder tool to encode or decode any URL component. The tool automatically handles special characters, query strings, and non-ASCII text.