How to URL Encode Special Characters Without Breaking Your Links

A practical guide to URL encoding: component vs full-URL mode, which characters must be escaped, the + vs %20 trap, and how to spot a double-encoded string before it ships.

Published By 李雷
#url encode #percent encoding #encodeURIComponent #web development #query strings

The first time a URL silently broke on me, it took an hour to find. A campaign name with an ampersand, "Summer & Fall 2026", went straight into a utm_campaign query parameter. The link looked fine. But every click recorded the campaign as just "Summer", because the raw & told the query-string parser that a new parameter started. Nothing errored. The data was just quietly wrong for a week.

That is the whole reason URL encoding exists. A URL is not free-form text; certain characters carry structural meaning, and when your data contains those characters you have to escape them or the parser misreads where one piece ends and the next begins. This guide walks through how to URL encode correctly, where most people slip, and how to check your work.

What URL Encoding Actually Does

URL encoding — also called percent-encoding — replaces a character with a % followed by its byte value in hexadecimal. A space becomes %20, an ampersand becomes %26, an equals sign becomes %3D.

For anything outside the ASCII range, the character is first converted to its UTF-8 bytes, and each byte is then percent-encoded. The Chinese character 你, for example, is three UTF-8 bytes (E4 BD A0), so it encodes to %E4%BD%A0. Emoji work the same way; a typical emoji just spans four bytes.
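As a rough illustration of the byte-level rule, the sketch below percent-encodes every UTF-8 byte of a string. The helper name percentEncodeAllBytes is made up for this example, it assumes a runtime with a global TextEncoder (modern browsers and Node.js), and unlike a real encoder it escapes letters and digits too.

```javascript
// Hypothetical helper: percent-encode EVERY UTF-8 byte of the input.
// Real encoders skip unreserved characters; this only shows the mechanics.
function percentEncodeAllBytes(text) {
  const bytes = new TextEncoder().encode(text); // string -> UTF-8 bytes
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

const space = percentEncodeAllBytes(" "); // "%20"
const amp = percentEncodeAllBytes("&");   // "%26"
const ni = percentEncodeAllBytes("你");    // "%E4%BD%A0" (three bytes)

// The built-in encoder applies the same rule to non-ASCII input.
const emoji = encodeURIComponent("😀");   // "%F0%9F%98%80" (four bytes)
```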

The authoritative rules live in RFC 3986, the URI specification. It defines a small set of "unreserved" characters that never need escaping — letters, digits, and the four marks - _ . ~ — and a set of "reserved" characters like / ? # [ ] @ ! $ & ' ( ) * + , ; = that mean something structural and must be escaped when they appear as literal data rather than as delimiters.
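One way to see the two character sets in practice is to run them through JavaScript's built-in encodeURIComponent. The sketch below does that, and also shows a stricter wrapper for the five reserved marks the built-in leaves alone (! ' ( ) *). The wrapper name encodeRFC3986Component is an assumption for this example, not a standard function.

```javascript
const unreservedMarks = "-_.~";
const reserved = "/?#[]@!$&'()*+,;=";

// Unreserved marks pass through untouched.
const marksOut = encodeURIComponent(unreservedMarks); // "-_.~"

// Most reserved characters are escaped, but ! ' ( ) * are not.
const reservedOut = encodeURIComponent(reserved);
// "%2F%3F%23%5B%5D%40!%24%26'()*%2B%2C%3B%3D"

// Hypothetical stricter variant: also escape the five leftovers.
function encodeRFC3986Component(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

const strictOut = encodeRFC3986Component(reserved);
// "%2F%3F%23%5B%5D%40%21%24%26%27%28%29%2A%2B%2C%3B%3D"
```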

Component Mode vs Full-URL Mode

This is the single distinction that trips up most people, so it is worth being precise. There are two encoding jobs, and they are not interchangeable.

Full-URL mode (JavaScript's encodeURI) is for an entire URL. It leaves the structural characters & = # / ? : @ alone, because in a whole URL those characters are doing their job as separators. It only escapes things that are unambiguously unsafe, like spaces.

Component mode (JavaScript's encodeURIComponent) is for a single piece: one query parameter value, one path segment. It escapes nearly everything outside the unreserved set, including & = #, because inside a single value those characters are just data and must not be mistaken for delimiters. (JavaScript's implementation also leaves ! ' ( ) * unescaped.)

The classic mistake is encoding a whole URL with component mode. Run https://a.com/page through it and you get https%3A%2F%2Fa.com%2Fpage — the :// and slashes are gone, and the link is no longer clickable. Rule of thumb: full-URL mode for whole links, component mode for one param.
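A small side-by-side makes the difference concrete. The sample URL with a space in its query is an assumption added for illustration; the other inputs come from the text above.

```javascript
// Full-URL mode: separators survive, only the space is escaped.
const wholeUrl = "https://a.com/page?q=a b&x=1";
const asWhole = encodeURI(wholeUrl);
// "https://a.com/page?q=a%20b&x=1"

// Component mode on a whole URL: the structure is escaped away.
const mangled = encodeURIComponent("https://a.com/page");
// "https%3A%2F%2Fa.com%2Fpage"

// Component mode on one value: exactly what a query parameter needs.
const campaign = encodeURIComponent("Summer & Fall 2026");
// "Summer%20%26%20Fall%202026"
const link = "https://a.com/page?utm_campaign=" + campaign;
```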

A Real Encode Before and After

Here is the OAuth case, which is where encoding mode matters most. You need to pass a callback URL as a single query parameter:

redirect_uri=https://app.example.com/cb?next=/dashboard

That inner ? is the problem. If you leave it raw, the OAuth provider reads it as the start of a new query string and truncates everything after it — your next=/dashboard vanishes. Component-encode the value and it becomes:

redirect_uri=https%3A%2F%2Fapp.example.com%2Fcb%3Fnext%3D%2Fdashboard

Now the ?, /, and : are all %xx sequences, so the provider treats the entire callback as one opaque string and hands it back intact. You can run this exact transformation in the URL Encoder / Decoder — paste the value, pick component mode, and copy the result.
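In JavaScript the same transformation might look like the sketch below. The authorization endpoint https://auth.example.com/authorize is a placeholder invented for this example; the second half shows the URL API doing the encoding for you, which is an alternative to encoding by hand, not something to combine with it.

```javascript
const callback = "https://app.example.com/cb?next=/dashboard";

// Option 1: encode the single value by hand, once.
const param = "redirect_uri=" + encodeURIComponent(callback);
// "redirect_uri=https%3A%2F%2Fapp.example.com%2Fcb%3Fnext%3D%2Fdashboard"

// Option 2: let URLSearchParams encode it (placeholder endpoint).
const authUrl = new URL("https://auth.example.com/authorize");
authUrl.searchParams.set("redirect_uri", callback); // pass the RAW value

// Either way, the receiving side gets the callback back intact.
const received = authUrl.searchParams.get("redirect_uri");
```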

The + vs %20 Trap

Two encodings can represent a space, and the difference is a genuine footgun. In a query string built as application/x-www-form-urlencoded data — which is what HTML forms and most query strings use — a literal + means a space. So ?q=a+b is read by the server as "a b", not "a+b".

That has two consequences:

  • A space in a query string is often written as + rather than %20. Both decode to a space in form context, but %20 is the safer, universally correct choice.
  • A literal plus sign in a query value must be escaped to %2B, or it will be read as a space. Sending 1+1=2 as ?eq=1+1=2 arrives at the server as 1 1=2. Encode it to 1%2B1%3D2 and the plus survives.

This is why component mode turns + into %2B: it cannot know whether you meant a literal plus, so it escapes it to be safe. Full-URL mode leaves + alone, since the bare URL grammar permits it.
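The behavior is easy to reproduce with URLSearchParams, which parses and serializes in the form-urlencoded style. This is a sketch of how one standards-based parser behaves; server frameworks generally follow the same convention, but that is an assumption worth checking for your own stack.

```javascript
// Parsing: + and %20 both come out as a space.
const fromPlus = new URLSearchParams("q=a+b").get("q");      // "a b"
const fromPct = new URLSearchParams("q=a%20b").get("q");     // "a b"

// An unescaped plus in the data is lost.
const broken = new URLSearchParams("eq=1+1=2").get("eq");    // "1 1=2"

// Component-encoding the value first preserves it.
const encoded = encodeURIComponent("1+1=2");                 // "1%2B1%3D2"
const intact = new URLSearchParams("eq=" + encoded).get("eq"); // "1+1=2"

// Full-URL mode leaves the plus alone.
const fullMode = encodeURI("1+1=2");                         // "1+1=2"

// Serializing: URLSearchParams writes a space as +.
const serialized = new URLSearchParams({ q: "a b" }).toString(); // "q=a+b"
```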

The Double-Encoding Trap

The nastiest bug in this whole area is double encoding, because it produces a string that looks encoded and passes every casual glance. It happens when a value that already contains %xx sequences gets run through an encoder a second time.

The tell is %25. That is the percent-encoding of the % sign itself. So when you encode something twice, every %20 (a space) becomes %2520, every %26 becomes %2526, and so on. If you spot %2520 where a space should be, you are looking at a double-encoded value. The fix is to decode it exactly once, which brings it back to the singly encoded form: %2520 → %20, which the receiver then decodes normally to a space.

The prevention is a habit: encode a value exactly once, right before you assemble the final URL, and never re-encode a string that already contains %xx. If a framework or library is already encoding for you, do not also encode by hand.
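The sketch below reproduces the bug and adds a quick check for the %25 tell. The function looksDoubleEncoded is a hypothetical helper, and it is only a heuristic: a value whose original text legitimately contained something like "%20" produces the same pattern after a single encode.

```javascript
const raw = "a b&c";
const once = encodeURIComponent(raw);   // "a%20b%26c"
const twice = encodeURIComponent(once); // "a%2520b%2526c"

// Hypothetical heuristic: "%25" followed by two hex digits suggests
// that an already-encoded %xx sequence was encoded again.
function looksDoubleEncoded(value) {
  return /%25[0-9A-Fa-f]{2}/.test(value);
}

const flaggedOnce = looksDoubleEncoded(once);   // false
const flaggedTwice = looksDoubleEncoded(twice); // true

// Decoding exactly once undoes the extra layer.
const repaired = decodeURIComponent(twice);     // "a%20b%26c"
```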

Beyond Percent-Encoding

Percent-encoding is not the only escaping scheme a URL touches. Internationalized domain names use a separate algorithm — café.com is not percent-encoded in the host portion but converted to ASCII via Punycode (xn--caf-dma.com), which you can inspect with the Punycode Converter. And once a URL lands inside an HTML attribute, characters like & may also need HTML entity escaping, a different layer handled by the HTML Entities Encoder. Knowing which layer you are in — URL, DNS, or HTML — is half the battle when something looks escaped but still breaks.
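To make the three layers tangible, here is a minimal sketch. It assumes a runtime whose URL implementation converts internationalized hosts to ASCII (browsers and standard Node.js builds do), and escapeHtmlAttr is a hypothetical helper covering only the most common characters, not a complete HTML sanitizer.

```javascript
// DNS layer: the host is converted to Punycode, not percent-encoded.
const host = new URL("https://caf\u00e9.com/").hostname;
// "xn--caf-dma.com"

// URL layer: the query value is percent-encoded.
const query = "q=" + encodeURIComponent("caf\u00e9 & bar");
// "q=caf%C3%A9%20%26%20bar"

// HTML layer: hypothetical minimal escaper for attribute values.
function escapeHtmlAttr(value) {
  return value
    .replace(/&/g, "&amp;") // must run first
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const href = escapeHtmlAttr("https://a.com/?x=1&y=2");
// "https://a.com/?x=1&amp;y=2"
```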

Get the mode right, escape exactly once, and watch for %25. Do those three things and the silent broken-link hour never happens again.


Made by Toolora · Updated 2026-06-13