
Unicode Normalizer (NFC NFD NFKC NFKD)

Normalize text to NFC, NFD, NFKC or NFKD, see how the code-point and byte counts shift, spot which characters changed, and copy the result in one click, all in your browser.

  • Runs locally
  • Category: Text
  • Best for: removing repetitive cleanup work from everyday writing and operations.
Form

Canonical Composition (NFC): combines a base letter and its marks into a single code point where one exists (é stays one character). This is the form the W3C recommends for web content; use it for storage and comparison.


What this tool does

A free online Unicode normalizer that converts any text into NFC, NFD, NFKC or NFKD using your browser's built-in String.prototype.normalize() method. The same visible letter can be stored two ways: é can be one code point (U+00E9) or two (e followed by the combining acute accent U+0301). They look identical but are not equal as strings, so search, dedupe and database keys silently miss matches.

This tool collapses both into one canonical shape and shows the code-point count and the UTF-8 byte count before and after, so you can watch combining marks merge under NFC or split under NFD. NFC and NFD are canonical: they change only how a character is encoded, never what it means, and you can convert between the two freely. NFKC and NFKD add lossy compatibility folding: fullwidth ＡＢＣ becomes ABC, circled ① becomes 1, the ﬁ ligature (U+FB01) becomes fi.

An optional per-character view lists every code point and highlights the ones that changed. Everything runs client-side, with one-click copy and a shareable URL that reopens your exact text and form. Nothing is uploaded.
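The before-and-after counts described above can be reproduced in a few lines. This is a minimal sketch, not the tool's actual source: the helper names measure and compareForms are hypothetical, and it assumes an environment where TextEncoder is available (modern browsers and Node.js).

```javascript
// Hypothetical helper: count code points and UTF-8 bytes for one string.
function measure(text) {
  return {
    // Array.from iterates by code point, not by UTF-16 code unit.
    codePoints: Array.from(text).length,
    // TextEncoder always encodes to UTF-8.
    utf8Bytes: new TextEncoder().encode(text).length,
  };
}

// Hypothetical helper: normalize to all four forms and measure each result.
function compareForms(input) {
  const report = {};
  for (const form of ["NFC", "NFD", "NFKC", "NFKD"]) {
    const output = input.normalize(form);
    report[form] = { output, before: measure(input), after: measure(output) };
  }
  return report;
}

const decomposed = "e\u0301"; // e + combining acute accent
const report = compareForms(decomposed);
console.log(report.NFC.before); // { codePoints: 2, utf8Bytes: 3 }
console.log(report.NFC.after);  // { codePoints: 1, utf8Bytes: 2 }
```

Under NFC the two code points merge into U+00E9, so both counts drop. Under NFD the same input is already decomposed and the counts stay as they are.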

Tool details

Input
Text
The page exposes text boxes, numeric controls, file pickers, or structured inputs depending on the tool.
Output
Live result + Copy
The result area focuses on usable output, with copy, download, or preview actions when supported.
Privacy
Browser-side processing
The main tool logic does not call an external API, so inputs normally stay in the current tab.
Save / share
Shareable URL state
Key settings are encoded in the URL so another person can reopen the same setup.
Performance budget
Initial JS <= 9 KB
No WASM budget is declared, keeping the tool quick to open on mobile.
Best fit
Text · Developer
Category and role tags drive related tools, internal links, and quick fit checks.

How to use

  1. Input

    Paste or drop your content into the tool panel.

  2. Process

    Click the button. All processing is local in your browser.

  3. Copy / Download

    Copy the result or download to disk in one click.

How Unicode Normalizer fits into your work

Use it to clean, compare, reshape, or extract plain text before it goes into a document, CMS, spreadsheet, or prompt.

Text jobs

  • Removing repetitive cleanup work from everyday writing and operations.
  • Making text easier to compare, paste, publish, or feed into another tool.
  • Working with content locally when the text is private or unfinished.

Text checks

  • Scan for unintended whitespace, duplicate lines, and lost punctuation.
  • For long text, test the first few lines before applying the whole change.
  • Copy the final output only after checking the preview.

Good next steps

These links move the current task into a more complete workflow.

  1. Unicode Character Inspector. Inspect any text character by character: code points, UTF-8/UTF-16 bytes, HTML entities, JS escapes, names, and hidden zero-width / confusable glyphs.
  2. Text to Hex Converter. Text ⇄ hexadecimal by UTF-8 bytes. Chinese and emoji safe, lets you pick the separator and case, decodes messy pasted hex, and runs in your browser.
  3. Traditional ⇄ Simplified Chinese Converter. Traditional ⇄ Simplified Chinese: fast, character-level, no API.

Real-world use cases

  • Fix a search that misses accented names

    Your app stores José as it arrived from a Mac (file names and some copied text there are often NFD: e plus a combining accent), but the search box sends José typed on a Windows keyboard (NFC, one code point). The query never matches and the user swears the record is there. Paste both strings here, normalize each to NFC, and watch the code-point counts line up. Then normalize on input in your code so the mismatch can never happen again.

  • Dedupe a contact or product list

    A CSV merged from two sources holds Café twice, once precomposed (NFC) and once decomposed (NFD). The two entries look identical but compare as different rows, so your dedupe leaves both. Run the column through NFC (or NFKC if fullwidth or ligature variants also appear) and the visually equal entries become byte-equal, collapsing into one. The byte-count delta shown here tells you at a glance which rows were actually decomposed.

  • Clean fullwidth and ligature noise before indexing

    User-submitted text arrives with fullwidth ＡＢＣ, circled ①②③, and ﬁ ﬂ ligatures pasted from PDFs. None of that should change what a search matches. NFKC folds all of it to plain ABC, 123, fi fl in one pass, so your search index and your equality checks see the same characters a human reads, not the formatting variant underneath.

  • Teach or debug Unicode equivalence

    Explaining why "é" === "é" can be false in JavaScript lands better when the class can see it. Type the precomposed é, switch to NFD, and the code-point count jumps from 1 to 2 while the glyph stays the same. Toggle the per-character view to point at the combining acute accent U+0301 that appeared. Share the URL and the example reopens exactly as built.
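The classroom demonstration in the last use case can also be run in a console. The sketch below is an illustration under assumptions, not the tool's own per-character view: toCodePoints and describeChange are hypothetical names, and the changed flag only reports whether the string as a whole differs, rather than highlighting individual characters.

```javascript
// Hypothetical helper: list every code point of a string as U+XXXX.
function toCodePoints(text) {
  return Array.from(text, (ch) => {
    const hex = ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
    return `U+${hex}`;
  });
}

// Hypothetical helper: show the code points before and after normalization.
function describeChange(input, form) {
  const output = input.normalize(form);
  return {
    before: toCodePoints(input),
    after: toCodePoints(output),
    changed: input !== output,
  };
}

const precomposed = "\u00E9"; // é as a single code point
console.log(precomposed === "e\u0301"); // false: same glyph, different code points
console.log(describeChange(precomposed, "NFD"));
// { before: [ 'U+00E9' ], after: [ 'U+0065', 'U+0301' ], changed: true }
```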

Common pitfalls

  • Comparing strings without normalizing first. Two visually identical names can be one NFC code point versus a base letter plus a combining mark, so a plain equality test returns false. Normalize both sides to the same form before any compare, index or dedupe.

  • Reaching for NFKC when you only need NFC. NFKC is lossy because it folds fullwidth, ligatures and circled digits away for good. If you must preserve exact appearance (display text, a stored original), use NFC, which changes only how characters are encoded and not how they look, and keep NFKC for the search key only.

  • Assuming CJK is untouched. CJK unified ideographs have no canonical decomposition, but NFKC still folds fullwidth Latin and half-width kana inside mixed text, so a string that looks all-Chinese can still change. Check the byte-count delta before assuming nothing happened.
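The pitfalls above suggest a simple split: keep an NFC copy for display and derive an NFKC key only for matching. The following is a minimal sketch under that assumption. The helper names storageForm, searchKey and dedupe are hypothetical, and a real search key would often add steps such as case folding that are out of scope here.

```javascript
// Hypothetical helper: the form to store and display (appearance preserved).
function storageForm(text) {
  return text.normalize("NFC");
}

// Hypothetical helper: the form used only for comparing, indexing and deduping.
function searchKey(text) {
  return text.normalize("NFKC");
}

// Keep the first value seen for each search key, stored in NFC.
function dedupe(values) {
  const seen = new Map();
  for (const value of values) {
    const key = searchKey(value);
    if (!seen.has(key)) seen.set(key, storageForm(value));
  }
  return [...seen.values()];
}

const rows = [
  "Caf\u00E9",                 // precomposed é
  "Cafe\u0301",                // e + combining acute accent
  "\uFF23\uFF41\uFF46\u00E9",  // fullwidth Ｃａｆ followed by é
];
console.log(dedupe(rows).length); // 1

// Mixed CJK text: NFC leaves the fullwidth letter alone, NFKC folds it.
const mixed = "\u4E2D\u6587\uFF21"; // 中文Ａ
console.log(storageForm(mixed) === mixed); // true
console.log(searchKey(mixed) === mixed);   // false
```

Because the stored value goes through NFC only, the fullwidth and ligature variants survive in the display text while still matching their plain equivalents in search.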

Privacy

Normalization runs entirely in your browser tab through the standard String.prototype.normalize call, so your text is never uploaded and nothing is logged. The one caveat: the shareable link encodes your input and the chosen form in the URL query string. Anyone who opens that link sends the text to the web server as part of the request, where it can land in the access log, and chat apps that generate link previews may fetch it too. For anything private, use the copy button and paste the result rather than sharing the URL.
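To make the caveat concrete, here is a rough sketch of how input and form could end up in a query string. The parameter names text and form, the example.com address, and the helper names buildShareUrl and readShareUrl are all assumptions for illustration; the tool's real URL format is not documented here.

```javascript
// Hypothetical helper: encode the input text and chosen form into a URL.
function buildShareUrl(baseUrl, text, form) {
  const url = new URL(baseUrl);
  url.searchParams.set("text", text); // assumed parameter name
  url.searchParams.set("form", form); // assumed parameter name
  return url.toString();
}

// Hypothetical helper: recover the state from a shared link.
function readShareUrl(href) {
  const params = new URL(href).searchParams;
  return {
    text: params.get("text") || "",
    form: params.get("form") || "NFC",
  };
}

const link = buildShareUrl("https://example.com/unicode-normalizer", "Jos\u00E9", "NFD");
console.log(link);
// https://example.com/unicode-normalizer?text=Jos%C3%A9&form=NFD
console.log(readShareUrl(link)); // { text: 'José', form: 'NFD' }
```

The text is percent-encoded but fully readable by anything that sees the URL, which is why the copy button is the safer route for private content.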

Made by Toolora · 100% client-side · Updated 2026-06-13