URL Encoding Explained: Special Characters and How to Handle Them

Every character in a URL has a meaning. Spaces, ampersands, question marks, and non-ASCII characters must be encoded or they'll break your links. This guide explains how URL encoding works and when to use it.

Why URLs Need Encoding

URLs were designed to be read by computers, not humans. A URL like https://example.com/search?q=hello world contains a space, which is not a valid character in a URL. Browsers may display the space for convenience, but the URL actually sent is https://example.com/search?q=hello%20world.

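The %20 is URL encoding, also called percent-encoding. A character outside the "safe" set is replaced with %XX, where XX is the hexadecimal value of its byte. For ASCII characters that is the ASCII code; a non-ASCII character becomes several %XX sequences, one per UTF-8 byte.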
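To make the %XX rule concrete, here is a minimal Python sketch of the arithmetic. The helper name percent_encode_byte is made up for this illustration; real code should rely on a library encoder rather than hand-rolled formatting.

```python
def percent_encode_byte(byte: int) -> str:
    """Format one byte value as %XX with two uppercase hex digits."""
    return f"%{byte:02X}"

# ord() gives the code point, which equals the byte value for ASCII characters.
for ch in " &?":
    print(repr(ch), ord(ch), percent_encode_byte(ord(ch)))
# ' ' 32 %20
# '&' 38 %26
# '?' 63 %3F
```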

The Safe Character Set

These characters don't need encoding in URLs:

A-Z a-z 0-9 - _ . ~

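Everything else (spaces, most punctuation, non-ASCII characters, special symbols) must be percent-encoded when it appears as data. Reserved characters such as ?, &, =, and / stay unencoded only where they act as URL delimiters.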
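As a quick check of the safe set, the sketch below assumes Python 3.7 or later, where urllib.parse.quote never encodes letters, digits, or the four symbols listed above. The names UNRESERVED and needs_encoding are illustrative only.

```python
import string
from urllib.parse import quote

# The unreserved ("safe") set: letters, digits, and - _ . ~
UNRESERVED = string.ascii_letters + string.digits + "-_.~"

def needs_encoding(ch: str) -> bool:
    """Return True if a single character falls outside the unreserved set."""
    return ch not in UNRESERVED

# safe="" tells quote() to encode everything except the unreserved set.
print(quote(UNRESERVED, safe="") == UNRESERVED)  # True
print(quote("a b&c", safe=""))                   # a%20b%26c
```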

Common Encodings You Should Know

space → %20    ! → %21    " → %22    # → %23
$ → %24    % → %25    & → %26    ' → %27    ( → %28
) → %29    * → %2A    + → %2B    , → %2C    / → %2F
: → %3A    ; → %3B    = → %3D    ? → %3F    @ → %40
[ → %5B    ] → %5D    { → %7B    } → %7D    | → %7C
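A table like this can be regenerated rather than memorized. The following Python sketch uses urllib.parse.quote with safe="" so that every listed character is encoded; percent_encode_char is a hypothetical helper name.

```python
from urllib.parse import quote

def percent_encode_char(ch: str) -> str:
    """Percent-encode one character, leaving only unreserved characters as-is."""
    return quote(ch, safe="")

# The characters from the table above, in the same order.
for ch in " !\"#$%&'()*+,/:;=?@[]{}|":
    print(f"{ch!r} -> {percent_encode_char(ch)}")
```

The braces and the pipe are easy to mix up: { is %7B, | is %7C, and } is %7D.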

Query Parameters vs Path Segments

Different parts of a URL have different encoding rules:

  • Path segments (e.g., /blog/my post): encode everything except unreserved characters
  • Query string keys and values (e.g., ?q=hello): encode using application/x-www-form-urlencoded, where a space becomes + and a literal + becomes %2B
  • Query string separators: ?, &, and = are reserved and should not be encoded when they're serving as separators
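The difference between the two rules is easiest to see on a single value. In this Python sketch, quote with safe="" stands in for path-segment encoding and quote_plus for form encoding; the sample value is made up.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

value = "my post & notes"

# Path-segment style: spaces become %20.
as_path = quote(value, safe="")
print(as_path)  # my%20post%20%26%20notes

# Form (application/x-www-form-urlencoded) style: spaces become +.
as_form = quote_plus(value)
print(as_form)  # my+post+%26+notes

# Each form must be decoded with its matching function.
print(unquote(as_path) == value)       # True
print(unquote_plus(as_form) == value)  # True
```

Decoding a form-encoded value with plain unquote would leave the + signs in place, which is a common source of stray plus signs in search boxes.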

Common Mistakes to Avoid

Mistake 1: Double Encoding

If a value has already been encoded and you encode it again, the % signs themselves get encoded (% becomes %25), producing double encoding:

Already-encoded value: "hello%20world"
Wrong (encoded again): "hello%2520world"
Right (left as is):    "hello%20world"

Use your framework's built-in URL encoding functions rather than manual string replacement.
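Double encoding is easy to reproduce, which also makes it easy to recognize. A small Python sketch, using urllib.parse.quote as the "built-in" encoder:

```python
from urllib.parse import quote, unquote

raw = "hello world"

once = quote(raw, safe="")    # encode the raw value exactly once
twice = quote(once, safe="")  # encoding the result again also encodes the % sign

print(once)   # hello%20world
print(twice)  # hello%2520world

# One decode of the double-encoded value only gets back to the encoded form.
print(unquote(twice))  # hello%20world
print(unquote(once))   # hello world
```

A %25 followed by two hex digits in a URL, such as %2520, is the usual sign that a value was encoded twice somewhere along the way.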

Mistake 2: Encoding the Whole URL

Never run a complete URL through a component encoder such as encodeURIComponent(). Encode only the dynamic parts (query values, path segments with user content). The protocol, host, and structural characters must remain unencoded.
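The sketch below contrasts the two approaches in Python. build_url is a hypothetical helper written for this example (it assumes Python 3.9+ for the type hints), not part of any library.

```python
from urllib.parse import quote, urlencode

# Wrong: component-encoding the whole URL destroys its structure.
broken = quote("https://example.com/search?q=hello world", safe="")
print(broken)  # https%3A%2F%2Fexample.com%2Fsearch%3Fq%3Dhello%20world

def build_url(base: str, segments: list[str], params: dict[str, str]) -> str:
    """Join a fixed base with individually encoded path segments and query values."""
    path = "/".join(quote(segment, safe="") for segment in segments)
    query = urlencode(params)
    url = f"{base}/{path}"
    return f"{url}?{query}" if query else url

# Right: only the dynamic parts are encoded; ://, /, ? and = stay structural.
print(build_url("https://example.com", ["blog", "my post"], {"q": "a&b"}))
# https://example.com/blog/my%20post?q=a%26b
```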

Mistake 3: Forgetting Non-ASCII Characters

Chinese characters, emoji, accented letters: all must be UTF-8 encoded, then percent-encoded:

"你好" → UTF-8 bytes → %E4%BD%A0%E5%A5%BD

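JavaScript's encodeURIComponent() handles this correctly. encodeURI() also encodes non-ASCII characters as UTF-8, but it leaves reserved characters such as &, =, ?, /, and # untouched, so it is not suitable for individual values.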
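The same two steps can be spelled out in Python. This sketch does the UTF-8 step by hand and compares the result with urllib.parse.quote, which uses UTF-8 by default.

```python
from urllib.parse import quote, unquote

text = "你好"

# Step 1: turn the string into UTF-8 bytes.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)  # b'\xe4\xbd\xa0\xe5\xa5\xbd'

# Step 2: write each byte as %XX.
manual = "".join(f"%{byte:02X}" for byte in utf8_bytes)
print(manual)  # %E4%BD%A0%E5%A5%BD

# quote() performs both steps in one call.
print(quote(text, safe="") == manual)  # True
print(unquote(manual) == text)         # True
```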

URL Encoding in Different Languages

JavaScript:  encodeURIComponent(str)           // for query values and path segments
             encodeURI(str)                    // for full URLs

Python:      urllib.parse.quote(s)             # RFC 3986; leaves "/" unencoded unless safe=""
             urllib.parse.urlencode(params)    # query string from a dict

Node.js:     encodeURIComponent(str)           // same as JS
             qs.stringify(obj)                 // for query objects (third-party qs package)

Base64URL: URL-Safe Base64

Standard Base64 uses + and /, both of which have special meaning in URLs, plus = for padding. Base64URL replaces them:

Base64:     +    /    = (padding)
Base64URL:  -    _    (padding usually omitted)

Base64URL is used in JSON Web Tokens (JWTs) and wherever binary data has to travel inside a URL.
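Python's standard base64 module covers both alphabets. The sketch below strips and restores the padding by hand, since urlsafe_b64encode keeps the = characters; b64url_encode and b64url_decode are illustrative helper names.

```python
import base64

def b64url_encode(data: bytes) -> str:
    """Encode bytes with the URL-safe alphabet and drop the = padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Restore the padding to a multiple of 4 characters, then decode."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)

data = b"\xfb\xff"  # chosen so that standard Base64 produces both + and /

print(base64.b64encode(data).decode("ascii"))  # +/8=
print(b64url_encode(data))                     # -_8
print(b64url_decode("-_8") == data)            # True
```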

Try It Yourself

Use our URL Encoder tool to encode or decode any URL component. The tool automatically handles special characters, query strings, and non-ASCII text.
