URL Percent-Encoding and Form Data: Why + and %20 Are Not the Same

If you've ever built an API client, debugged a broken search query, or wondered why curl output looks different from what your browser sends, you've bumped into URL percent-encoding. The confusing part isn't the encoding itself; it's that there are two systems that look nearly identical but produce different output. Mix them up and your server receives data it silently misreads.

What Percent-Encoding Actually Does

Percent-encoding replaces any byte that isn't "safe" in a URL with a % followed by two hex digits representing that byte's value. RFC 3986 recommends uppercase digits, and decoders accept either case. The letter é in UTF-8 is two bytes: 0xC3 and 0xA9, so it encodes to %C3%A9. A space is 0x20, so it encodes to %20.
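To make the byte-level rule concrete, here is a minimal Python sketch that encodes a string by hand and compares the result with the standard library's urllib.parse.quote. The helper name percent_encode is made up for illustration; in real code you would call quote directly.

```python
from urllib.parse import quote

# Characters RFC 3986 calls "unreserved": these never need encoding.
UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)


def percent_encode(text: str) -> str:
    """Hypothetical helper: encode every UTF-8 byte outside the unreserved set."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # One byte becomes % plus two uppercase hex digits.
            out.append(f"%{byte:02X}")
    return "".join(out)


print(percent_encode("é"))              # %C3%A9
print(percent_encode("a b"))            # a%20b
print(quote("café au lait", safe=""))   # caf%C3%A9%20au%20lait
```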

RFC 3986 (the current URI syntax standard) sorts the characters that may appear literally in a URL into two groups:

Unreserved: A–Z, a–z, 0–9, -, _, ., ~ — always safe, never encoded.
Reserved: : / ? # [ ] @ ! $ & ' ( ) * + , ; = — these have structural meaning in URLs. They must be encoded when they appear as data values, not as URL syntax.

A common mistake is forgetting that & and = are reserved. If a query parameter value contains a literal &, you must encode it as %26. Otherwise the URL parser splits your value at the & and treats the rest as a new parameter.

Example:

Input value: price=10&currency=USD

Correct query string: ?filter=price%3D10%26currency%3DUSD

Broken query string: ?filter=price=10&currency=USD ← the server parses this as two separate parameters: filter with the value price=10 (most parsers split each pair at the first =) and currency with the value USD.

I've seen this exact bug in production ETL pipelines where a filter string was concatenated directly into a URL without encoding. The API returned wrong data for months before anyone noticed.
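The filter example can be reproduced with Python's standard library. This is only a sketch of how one common parser behaves: urllib.parse.parse_qs splits pairs at & and each pair at the first =, and other servers may differ in the details.

```python
from urllib.parse import parse_qs, quote

value = "price=10&currency=USD"

# Unencoded: the & inside the value is read as a parameter separator.
broken = "filter=" + value
print(parse_qs(broken))
# {'filter': ['price=10'], 'currency': ['USD']}

# Encoded: = becomes %3D and & becomes %26, so the value survives intact.
correct = "filter=" + quote(value, safe="")
print(correct)            # filter=price%3D10%26currency%3DUSD
print(parse_qs(correct))  # {'filter': ['price=10&currency=USD']}
```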

The Two Encoding Contexts: Component vs. Form

Here's where the real confusion starts. There are two different percent-encoding flavors, and the difference that bites most often comes down to a single character: the space.

RFC 3986 component encoding (`encodeURIComponent` in JavaScript)

Spaces become %20. Reserved characters in values are percent-encoded, with one caveat: encodeURIComponent leaves ! ' ( ) * untouched. This is the correct encoding to use for path segments, fragment identifiers, and modern API query parameters.

`application/x-www-form-urlencoded` (HTML form encoding)

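Spaces become +. The + sign itself must then be encoded as %2B to avoid ambiguity. This encoding is what browsers use when submitting a <form method="post"> with the default enctype. curl -d labels its body with this content type as well, but it sends the data exactly as given; use --data-urlencode if you want curl to do the encoding. The format goes back to the HTML 2.0 specification from 1995 and is maintained today in the WHATWG URL Standard, but the + convention has never changed.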

Real input/output comparison:

| Input string | encodeURIComponent | Form encoding |
|---|---|---|
| hello world | hello%20world | hello+world |
| price: 10+tax | price%3A%2010%2Btax | price%3A+10%2Btax |
| C# | C%23 | C%23 |
| café | caf%C3%A9 | caf%C3%A9 |

The space character is the only difference in most cases — but it's enough to break a server that decodes one way and receives the other.
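The same comparison can be run in Python as a rough check. Note the assumption: urllib.parse.quote with safe="" stands in for encodeURIComponent here. The two are not identical (quote also encodes ! ' ( ) *), but they agree on every input in the table.

```python
from urllib.parse import quote, quote_plus

samples = ["hello world", "price: 10+tax", "C#", "café"]

for s in samples:
    component = quote(s, safe="")  # RFC 3986 style: space -> %20
    form = quote_plus(s)           # form style: space -> +
    print(f"{s!r:18} {component:24} {form}")
```

Only the rows that contain a space print different values in the two columns.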

Where Things Go Wrong in Practice

I tested this with a simple FastAPI endpoint that reads a query parameter q and logs the raw value. Sent in a URL query string (not a form body), q=hello+world reaches the server with the + intact, and what happens next depends on the decoder: a form-style decoder turns it into hello world, while a strict RFC 3986 decoder keeps the literal hello+world. With q=hello%20world, both kinds of decoder produce hello world.

This matters when:

You copy a query string from a URL bar and paste it into a fetch() call.
A third-party library builds form bodies but you're hitting a REST API that expects RFC 3986.
You serialize a JSON payload as URL params and a value contains a + (common in Base64).
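The Base64 case is easy to reproduce. The sketch below uses a hand-picked three-byte input whose Base64 form consists entirely of + characters, and Python's parse_qs as an example of a form-style decoder; the parameter name token is arbitrary.

```python
import base64
from urllib.parse import parse_qs, quote

token = base64.b64encode(b"\xfb\xef\xbe").decode("ascii")
print(token)  # ++++

# Pasted into a query string without encoding, then form-decoded:
print(parse_qs("token=" + token))
# {'token': ['    ']}  (every + became a space)

# Percent-encoded first: the + signs survive the round trip.
safe_qs = "token=" + quote(token, safe="")
print(safe_qs)            # token=%2B%2B%2B%2B
print(parse_qs(safe_qs))  # {'token': ['++++']}
```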

According to the WHATWG URL Standard (a living standard), URLSearchParams in JavaScript uses form encoding, with + for spaces, while calling encodeURIComponent yourself produces %20. That means new URLSearchParams({q: "hello world"}).toString() gives q=hello+world, while the template literal `?q=${encodeURIComponent("hello world")}` gives ?q=hello%20world. Both look like valid query strings to a human but will behave differently server-side.
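Python has the same split, which makes for a convenient side-by-side. In this sketch, urlencode with its defaults plays the role of URLSearchParams, and passing quote_via=quote plays the role of encodeURIComponent. That pairing is an analogy for this input, not an exact equivalence between the two languages.

```python
from urllib.parse import quote, urlencode

params = {"q": "hello world"}

# Default: form encoding via quote_plus, so the space becomes +.
print(urlencode(params))                   # q=hello+world

# quote_via=quote switches to percent-encoding, so the space becomes %20.
print(urlencode(params, quote_via=quote))  # q=hello%20world
```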

Reserved Characters You Must Not Forget

Beyond the space issue, these reserved characters trip up developers regularly:

| Character | Meaning in URL | Encoded form |
|---|---|---|
| # | Fragment separator | %23 |
| ? | Query start | %3F |
| & | Parameter separator | %26 |
| = | Key–value separator | %3D |
| + | Space in form encoding | %2B |
| / | Path separator | %2F |

The # character is especially dangerous. If your parameter value contains a hash — say, a color hex like #ff0000 — and you don't encode it, the browser stops processing the URL at # and treats everything after it as a fragment. The server never sees the hash or anything following it.
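You can see the fragment split without a browser. The sketch below uses urllib.parse.urlsplit, which applies the same rule when it breaks a URL into components; the example.com URL and the color parameter are placeholders.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Unencoded #: everything after it is parsed as the fragment.
raw = urlsplit("https://example.com/search?color=#ff0000")
print(repr(raw.query))     # 'color='
print(repr(raw.fragment))  # 'ff0000'

# Encoded as %23: the value stays inside the query string.
encoded_url = "https://example.com/search?color=" + quote("#ff0000", safe="")
enc = urlsplit(encoded_url)
print(enc.query)            # color=%23ff0000
print(parse_qs(enc.query))  # {'color': ['#ff0000']}
```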

For a quick reference on which characters need encoding in which URL component, the URL Encoder / Decoder tool lets you paste any string and see the correct RFC 3986 output instantly, including which specific characters were encoded and why.

Decoding Incoming Data Correctly

When a server receives a request, it has to know which encoding the sender used before it can decode the data correctly. HTTP doesn't carry a flag saying "I used form encoding" vs. "I used RFC 3986". The convention is:

A Content-Type: application/x-www-form-urlencoded body → form decoding (+ = space).
A query string appended to a URL in JavaScript code → depends on how it was built. If built with URLSearchParams, it's form encoding. If built with template literals and encodeURIComponent, it's RFC 3986.

Most frameworks handle this transparently for standard form submissions. The problems arise when you build URLs programmatically or chain services together. If service A uses URLSearchParams to forward a value to service B, and service B calls a function that treats + as a literal character, you've introduced a silent data corruption.
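The service A to service B scenario can be simulated in a few lines. This is a sketch under stated assumptions: urlencode stands in for whatever service A uses to build the query string, and unquote versus unquote_plus stand in for the two kinds of decoder service B might call.

```python
from urllib.parse import unquote, unquote_plus, urlencode

# Service A builds the query string with form encoding.
forwarded = urlencode({"q": "hello world"})
print(forwarded)  # q=hello+world

# Take the raw, still-encoded value after the first "=".
raw_value = forwarded.split("=", 1)[1]

# Service B with strict RFC 3986 decoding: + stays a literal plus.
print(unquote(raw_value))       # hello+world

# Service B with form decoding: + is turned back into a space.
print(unquote_plus(raw_value))  # hello world
```

Neither call raises an error, which is why this kind of mismatch tends to go unnoticed until someone inspects the stored values.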

To check how your API actually receives values, you can use the JSON to Query String converter to build a query string from structured data and verify the encoded output matches what your server expects.

One Rule That Covers 90% of Cases

If you're writing modern JavaScript or Python:

Use encodeURIComponent (JS) or urllib.parse.quote with safe='' (Python) for individual query parameter values.
Use URLSearchParams (JS) or urllib.parse.urlencode (Python) for form-encoded bodies when the API explicitly expects application/x-www-form-urlencoded.
Never construct a URL by concatenating raw user input without encoding — SQL-injection-style bugs exist in URLs too.
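Put together, the three rules might look like the following in Python. The helper build_url, the api.example.com host, and the parameter names are all hypothetical; treat this as one possible shape rather than a recommended library API.

```python
from urllib.parse import quote, urlencode


def build_url(base: str, segment: str, params: dict[str, str]) -> str:
    """Hypothetical helper: encode one path segment and all query values."""
    path = quote(segment, safe="")              # a "/" in data becomes %2F
    query = urlencode(params, quote_via=quote)  # spaces become %20
    return f"{base}/{path}?{query}"


url = build_url("https://api.example.com/items", "a/b c", {"q": "1+1=2 & more"})
print(url)
# https://api.example.com/items/a%2Fb%20c?q=1%2B1%3D2%20%26%20more

# Form body for an API that expects application/x-www-form-urlencoded.
body = urlencode({"q": "1+1=2 & more"})
print(body)  # q=1%2B1%3D2+%26+more
```

The raw input never touches the URL string directly: every piece of data passes through quote or urlencode first.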

The RFC 3986 vs. form-encoding distinction is a 30-year-old design quirk that the web never cleaned up. Understanding it takes five minutes; not understanding it costs hours of debugging wrong decoded values in server logs.

Made by Toolora · Updated 2026-06-27