URL Percent-Encoding: Which Characters to Encode, encodeURI vs encodeURIComponent, and Query-String Best Practices

A practical guide to URL percent-encoding: the RFC 3986 safe character set, when to use encodeURI versus encodeURIComponent, and how to build clean query strings without double-encoding bugs.

#url #encoding #javascript #web-development

Every URL you build in JavaScript eventually hits the same question: do I need to encode this character, and which function do I use? Get it wrong and you end up with double-encoded %2520 nightmares or a server that never receives the & your query string needed. This guide walks through the exact character set, the two JavaScript encoding functions, and the query-string patterns that cause the most production bugs.

The RFC 3986 Character Set: What Never Needs Encoding

RFC 3986 — the document that defines how URLs are structured — lists exactly 66 unreserved characters that are always safe to use in a URL without encoding: the 26 uppercase letters (A–Z), 26 lowercase letters (a–z), 10 digits (0–9), and four punctuation marks: - (hyphen), _ (underscore), . (period), and ~ (tilde).
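As a quick sanity check, the unreserved set can be written as a single regular expression and counted. This is only an illustrative sketch; the names `UNRESERVED` and `isUnreserved` are made up for this example and are not part of any standard API.

```javascript
// RFC 3986 unreserved characters: A-Z, a-z, 0-9, hyphen, underscore, period, tilde.
const UNRESERVED = /^[A-Za-z0-9\-_.~]$/;

// Hypothetical helper: true if a single character never needs percent-encoding.
function isUnreserved(ch) {
  return UNRESERVED.test(ch);
}

// Count how many ASCII characters fall in the unreserved set.
let count = 0;
for (let code = 0; code < 128; code++) {
  if (isUnreserved(String.fromCharCode(code))) count++;
}

console.log(count);             // 66
console.log(isUnreserved('~')); // true
console.log(isUnreserved('&')); // false
```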

Every other character is either a reserved character (like /, ?, #, &, =, :) that carries structural meaning, or an unsafe character (like spaces, angle brackets, curly braces, non-ASCII bytes) that must be encoded as % followed by two hex digits.

The percent sign itself (%) is special because it introduces every %XX sequence: a literal percent sign in a value must be encoded as %25, and a %XX sequence that is already an encoding must not be encoded a second time. Missing either half of this rule is the most common cause of double-encoding bugs.

A few characters worth memorising (all reserved except the space, which is simply not allowed unencoded):

| Character | Encoded form | Common role |
|-----------|--------------|-------------|
| Space | %20 | Path or value separator |
| & | %26 | Query-string delimiter |
| = | %3D | Key-value separator |
| + | %2B | Literal plus sign (not a space) |
| # | %23 | Fragment identifier |
| ? | %3F | Query start |

Note the + row carefully. In the application/x-www-form-urlencoded format used by HTML forms, + is an alternative encoding of a space — but only in that specific format. In a bare URL, + means a literal plus sign. Mixing these up sends wrong data silently.
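The difference is easy to see by decoding the same string under both sets of rules. The sketch below uses only built-in globals; the sample strings are arbitrary.

```javascript
// decodeURIComponent follows generic URI rules: "+" stays a literal plus.
const viaUriRules = decodeURIComponent('rock+roll%20band');

// URLSearchParams follows application/x-www-form-urlencoded rules: "+" decodes to a space.
const viaFormRules = new URLSearchParams('q=rock+roll%20band').get('q');

// A literal plus sign survives form decoding only when it is sent as %2B.
const literalPlus = new URLSearchParams('q=1%2B1').get('q');

console.log(viaUriRules);  // rock+roll band
console.log(viaFormRules); // rock roll band
console.log(literalPlus);  // 1+1
```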

encodeURI vs encodeURIComponent: The Practical Difference

JavaScript ships two encoding functions, and the choice between them is not arbitrary.

encodeURI(uri) encodes a full URL. It leaves the reserved structural characters untouched because they need to stay functional. Given encodeURI('https://example.com/path?q=hello world'), the space becomes %20 but the ://, /, ?, and = remain as-is. Use this when you have a complete URL and want to make it safe to embed or log.

encodeURIComponent(value) encodes a single value within a URL, such as a query parameter value, a path segment, or a hash fragment. It encodes everything except the 66 unreserved characters and five more that it leaves alone: !, *, ', (, and ). Crucially, it encodes /, ?, &, =, and #, which is exactly what you need when those characters are data rather than structure.

The table below shows how both functions treat the string café & tea:

| Input | encodeURI | encodeURIComponent |
|-------|-----------|--------------------|
| café & tea | caf%C3%A9%20&%20tea | caf%C3%A9%20%26%20tea |

Notice that encodeURI leaves & unencoded. If you pass café & tea as a query-parameter value using encodeURI, the & splits your query string into two parameters: the server reads q as café (plus a trailing space) and a mysterious second key, tea (with a leading space), with no value. encodeURIComponent encodes & to %26, keeping it as data.

The rule is simple: never use encodeURI on a value; only use it on a complete URL.
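To see the rule in action, here is a small sketch that builds the same query both ways and then parses the result with the built-in URL class. The example.com endpoint is a placeholder.

```javascript
const value = 'café & tea';

// Wrong: encodeURI treats "&" as structure and leaves it alone.
const broken = 'https://example.com/search?q=' + encodeURI(value);

// Right: encodeURIComponent treats "&" as data and escapes it.
const correct = 'https://example.com/search?q=' + encodeURIComponent(value);

console.log(broken);  // https://example.com/search?q=caf%C3%A9%20&%20tea
console.log(correct); // https://example.com/search?q=caf%C3%A9%20%26%20tea

// Parsing shows the damage: the broken URL now carries two parameters.
const brokenKeys = [...new URL(broken).searchParams.keys()];
console.log(JSON.stringify(brokenKeys));             // ["q"," tea"]
console.log(new URL(correct).searchParams.get('q')); // café & tea
```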

A Real Encoding Walkthrough

I was building a search page that accepts a user-supplied phrase and passes it to an API. The phrase was Søren Kierkegaard & philosophy. Here is exactly what each encoding step produces:

Input:
  Søren Kierkegaard & philosophy

encodeURIComponent(input):
  S%C3%B8ren%20Kierkegaard%20%26%20philosophy

Full URL assembled:
  https://api.example.com/search?q=S%C3%B8ren%20Kierkegaard%20%26%20philosophy&lang=en

The non-ASCII ø becomes %C3%B8 (its UTF-8 bytes), the spaces become %20, and the & becomes %26. The server's query-string parser sees q = Søren Kierkegaard & philosophy and lang = en — exactly what was intended.

I tested the round-trip with decodeURIComponent('S%C3%B8ren%20Kierkegaard%20%26%20philosophy') and got back Søren Kierkegaard & philosophy with no data loss. If you want to verify your own strings without writing code, the URL Encoder / Decoder on Toolora handles both directions instantly in-browser.
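The same walkthrough can be reproduced in a few lines. This sketch assumes the placeholder api.example.com endpoint from the example and uses the built-in URL class to stand in for the server's query-string parser.

```javascript
const phrase = 'Søren Kierkegaard & philosophy';
const encoded = encodeURIComponent(phrase);
const fullUrl = 'https://api.example.com/search?q=' + encoded + '&lang=en';

// Parse the assembled URL roughly the way a server-side query parser would.
const parsed = new URL(fullUrl).searchParams;

console.log(encoded);                                // S%C3%B8ren%20Kierkegaard%20%26%20philosophy
console.log(parsed.get('q') === phrase);             // true
console.log(parsed.get('lang'));                     // en
console.log(decodeURIComponent(encoded) === phrase); // true
```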

Query-String Best Practices

Build queries programmatically, not by string concatenation. The correct pattern in the browser is URLSearchParams:

const params = new URLSearchParams();
params.set('q', 'Søren Kierkegaard & philosophy');
params.set('lang', 'en');
const url = 'https://api.example.com/search?' + params.toString();
// → https://api.example.com/search?q=S%C3%B8ren+Kierkegaard+%26+philosophy&lang=en

You will notice URLSearchParams uses + for spaces rather than %20. Both are valid in the application/x-www-form-urlencoded format, and most servers handle them interchangeably. If your backend is strict about %20, call params.toString().replace(/\+/g, '%20') after.
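The replacement is safe because URLSearchParams has already escaped any literal plus sign as %2B, so every + left in the output stands for a space. A short sketch, with an arbitrary sample value:

```javascript
const searchParams = new URLSearchParams({ q: 'C++ tips & tricks', lang: 'en' });

const formStyle = searchParams.toString();
// Literal plus signs are already %2B here, so any remaining "+" is a space.
const percentStyle = formStyle.replace(/\+/g, '%20');

console.log(formStyle);    // q=C%2B%2B+tips+%26+tricks&lang=en
console.log(percentStyle); // q=C%2B%2B%20tips%20%26%20tricks&lang=en
```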

For complex nested data in a query string — objects, arrays, or deeply structured filters — consider serialising to JSON and encoding the result as a single parameter value, or use a library like qs. The URL Query String Parser & Builder lets you paste a full URL and inspect every decoded parameter in a table, which is helpful for debugging what your serialiser actually produces.
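One way the JSON approach might look is sketched below. The `filter` object and its field names are invented for illustration, and the receiving side is simulated with the built-in URL class rather than a real server.

```javascript
// Hypothetical nested filter object; the field names are placeholders.
const filter = { tags: ['url', 'encoding'], range: { from: 2020, to: 2024 } };

const filterParams = new URLSearchParams();
filterParams.set('filter', JSON.stringify(filter)); // URLSearchParams encodes the JSON text

const filterUrl = 'https://api.example.com/search?' + filterParams.toString();

// Receiving side: read the single parameter back and parse it.
const received = JSON.parse(new URL(filterUrl).searchParams.get('filter'));

console.log(received.tags[1]);  // encoding
console.log(received.range.to); // 2024
```

The trade-off is readability: the resulting URL is opaque to humans, so this suits machine-to-machine calls better than links people will share or edit.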

Never double-encode. If you receive a value that is already encoded (for example, a URL pulled from a database that was stored encoded), decode it first with decodeURIComponent, then re-encode if needed. Encoding %20 a second time produces %2520, that is, %25 (the encoding of %) followed by 20, which servers will see as a literal %20 in the data, not a space.
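The decode-then-encode pattern can be sketched as a small helper. `encodeOnce` is a hypothetical name, and the fallback in the catch block is an assumption: a string that fails to decode is treated as raw, unencoded data.

```javascript
const once = encodeURIComponent('hello world'); // hello%20world
const twice = encodeURIComponent(once);         // hello%2520world

// Hypothetical helper: normalise a value that may or may not be encoded already.
// Assumption: a string that fails to decode is treated as raw, unencoded data.
function encodeOnce(value) {
  let raw;
  try {
    raw = decodeURIComponent(value);
  } catch (err) {
    raw = value; // e.g. "100%" throws URIError: "%" is not followed by two hex digits
  }
  return encodeURIComponent(raw);
}

console.log(twice);                       // hello%2520world
console.log(encodeOnce('hello%20world')); // hello%20world
console.log(encodeOnce('hello world'));   // hello%20world
console.log(encodeOnce('100%'));          // 100%25
```

This is a heuristic, not a guarantee: raw data that happens to contain a literal %20 is indistinguishable from encoded data, so tracking whether a value is encoded at the point where it is stored remains the more reliable fix.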

Path segments need encoding too. A filename like Q1 Report (Final).pdf in a URL path must be encoded: encodeURIComponent produces Q1%20Report%20(Final).pdf. Parentheses are allowed unencoded in paths by RFC 3986, and encodeURIComponent leaves them alone; encoding them as %28 and %29 avoids ambiguity in older parsers. Spaces in paths are never safe.
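If you do want the parentheses escaped, a thin wrapper around encodeURIComponent is enough. `encodePathSegment` is a hypothetical helper name, and the directory names in the usage example are placeholders.

```javascript
// Hypothetical helper: encode one path segment, including the characters
// that encodeURIComponent leaves alone: ! ' ( ) *
function encodePathSegment(segment) {
  return encodeURIComponent(segment).replace(
    /[!'()*]/g,
    (ch) => '%' + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

const filename = 'Q1 Report (Final).pdf';

console.log(encodeURIComponent(filename)); // Q1%20Report%20(Final).pdf
console.log(encodePathSegment(filename));  // Q1%20Report%20%28Final%29.pdf

// Encode each segment separately so the "/" separators stay structural.
const path = ['reports', '2024', filename].map(encodePathSegment).join('/');
console.log(path); // reports/2024/Q1%20Report%20%28Final%29.pdf
```

Both outputs decode back to the original filename with decodeURIComponent, so the stricter form costs nothing on the receiving side.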

Common Pitfalls Summary

  • Using encodeURI on a parameter value: the &, =, and # characters pass through unencoded and corrupt the query string.
  • Double-encoding: encoding an already-encoded string. Always decode before re-encoding.
  • Confusing + and %20: + decodes to a space in form data, but it is a literal plus in a raw URL path.
  • Ignoring non-ASCII in path segments: /résumé.pdf is not a valid URL; the é must be encoded as %C3%A9.
  • Forgetting the % character itself: a literal percent sign in data must be encoded as %25, exactly once.

Percent-encoding is one of those areas where the rules look trivial and the bugs are subtle. Encoding at the right layer — values with encodeURIComponent, full URLs with encodeURI, and assembled query strings with URLSearchParams — eliminates almost every class of URL bug before it reaches production.


Made by Toolora · Updated 2026-07-01