Unicode Code Point Explorer — UTF-8, UTF-16, Category & Script

Inspect any character — code point, UTF-8 bytes, UTF-16 encoding, Unicode category, script, and block — instant, browser-only

  • Runs locally
  • Category: Encoding & Crypto
  • Best for: checking small payloads, tokens, hashes, and encoded values quickly.

What this tool does

Free online Unicode code point explorer. Paste any text or enter a code point like U+1F600 and instantly see the Unicode code point (U+XXXX), official character name, Unicode general category (Lu, Ll, Nd…), script (Latin, Han, Arabic…), Unicode block, UTF-8 byte sequence, UTF-16 encoding with surrogate pairs, HTML entity, JavaScript escape (\uXXXX or \u{...}), and CSS escape. Handles emoji, CJK ideographs, Arabic, Devanagari, and the rest of the Unicode codespace (U+0000 to U+10FFFF, 1,114,112 code points). 100% client-side: nothing is sent to any server.
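
As a rough illustration of how several of the fields listed above can be derived, here is a minimal sketch. It is not the tool's actual source; `describeChar` and its field names are hypothetical. It only uses standard JavaScript APIs (`codePointAt`, `String.fromCodePoint`, `TextEncoder`).

```javascript
// Hypothetical helper, not taken from the tool: derives a few of the
// fields described above for the first code point of a string.
function describeChar(ch) {
  const cp = ch.codePointAt(0);                 // numeric code point
  const hex = cp.toString(16).toUpperCase();
  const s = String.fromCodePoint(cp);

  // UTF-8 bytes as two-digit hex strings.
  const utf8 = Array.from(new TextEncoder().encode(s), (b) =>
    b.toString(16).toUpperCase().padStart(2, "0")
  );

  // UTF-16 code units: one unit for U+0000..U+FFFF, a surrogate pair above.
  const utf16 = [];
  for (let i = 0; i < s.length; i++) {
    utf16.push(s.charCodeAt(i).toString(16).toUpperCase().padStart(4, "0"));
  }

  return {
    codePoint: "U+" + hex.padStart(4, "0"),
    utf8,
    utf16,
    htmlEntity: "&#x" + hex + ";",
    // \uXXXX only reaches U+FFFF; higher code points need \u{...}.
    jsEscape: cp > 0xffff ? "\\u{" + hex + "}" : "\\u" + hex.padStart(4, "0"),
    // CSS escapes are a backslash plus hex digits; the trailing space
    // terminates the escape.
    cssEscape: "\\" + hex + " ",
  };
}

console.log(describeChar("\u{1F600}"));
console.log(describeChar("\u00E9"));
```

The character name, block, and script are not covered here because they require Unicode data tables or property lookups beyond these calls.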

Tool details

  • Input: Text + Numbers. The page exposes text boxes, numeric controls, file pickers, or structured inputs depending on the tool.
  • Output: Live result + Copy. The result area focuses on usable output, with copy, download, or preview actions when supported.
  • Privacy: Browser-side processing. The main tool logic does not call an external API, so inputs normally stay in the current tab.
  • Save / share: Shareable URL state. Key settings are encoded in the URL so another person can reopen the same setup.
  • Performance budget: Initial JS <= 28 KB. No WASM budget is declared, keeping the tool quick to open on mobile.
  • Best fit: Encoding & Crypto · Developer. Category and role tags drive related tools, internal links, and quick fit checks.

How to use

  1. Input

    Paste or drop your content into the tool panel.

  2. Process

    Click the button. All processing is local in your browser.

  3. Copy / Download

    Copy the result or download to disk in one click.

How Unicode Code Point Explorer fits into your work

Use it for quick browser-side encoding, decoding, hashing, token checks, and share-safe transformations.

Encoding jobs

  • Checking small payloads, tokens, hashes, and encoded values quickly.
  • Preparing values for APIs, URLs, docs, or support tickets.
  • Avoiding account-based tools when the input might be sensitive.

Encoding checks

  • Do not paste live secrets unless you are comfortable with local browser handling.
  • Confirm whether the operation is reversible before sharing the result.
  • For hashes, compare the exact algorithm and casing expected by the receiver.

Good next steps

These links move the current task into a more complete workflow.

  1. URL Encoder / Decoder: encode and decode URL-unsafe characters (query strings, path segments, full URLs), instant, browser-only.
  2. HTML Entities Encoder: encode/decode HTML entities (&amp; &lt; &gt; &quot; &#39; and all numeric refs), browser-only.
  3. Base64 Encoder & Decoder: encode or decode Base64 (text, files, and Data URLs). Runs entirely in your browser.

Real-world use cases

  • Debugging why a character breaks a JSON or SQL query

    A curly apostrophe (U+2019, RIGHT SINGLE QUOTATION MARK, UTF-8: E2 80 99) looks almost identical to an ASCII apostrophe (U+0027) but is not a string delimiter. SQL parsers expect U+0027 around string literals and JSON expects straight double quotes (U+0022), so curly quotes pasted from a word processor produce syntax errors. Paste the suspicious character into this tool, confirm the code point, then replace it with the correct ASCII equivalent, or use the HTML entity &#x2019; for HTML-safe rendering.

  • Understanding emoji encoding for mobile app development

    Emoji like 😀 (U+1F600) live in Unicode's supplementary planes and need a 4-byte UTF-8 sequence (F0 9F 98 80) and a UTF-16 surrogate pair (D83D DE00). Platforms count them differently: Swift's String counts user-perceived characters (grapheme clusters), while Kotlin and JavaScript strings are indexed by UTF-16 code units, so the same emoji has length 1 in Swift and 2 in the other two. Enter any emoji here to see the exact byte sequences and surrogate pair values you need for your target platform.

  • Verifying CJK character encoding in Chinese/Japanese/Korean text

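    Most Chinese, Japanese, and Korean characters sit in the Basic Multilingual Plane (for example CJK Unified Ideographs, U+4E00–U+9FFF) and take 3 bytes each in UTF-8; ideographs in the supplementary-plane extensions (U+20000 and above) take 4. If a MySQL column uses the latin1 character set instead of utf8mb4, CJK text cannot be stored correctly and comes back corrupted. Paste suspect characters here to see their exact UTF-8 encoding and confirm which character set your table must use.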
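
The use cases above all come down to locating non-ASCII characters and checking how many UTF-8 bytes each one needs. The sketch below shows one way to do that in plain JavaScript; `findNonAscii` is a hypothetical helper written for illustration, not something the tool exposes.

```javascript
// Hypothetical helper: lists every non-ASCII code point in a string,
// with its UTF-16 offset and UTF-8 byte length.
function findNonAscii(text) {
  const encoder = new TextEncoder();
  const hits = [];
  let index = 0; // offset in UTF-16 code units, as used by String indexing

  // for...of iterates by code point, so surrogate pairs stay together.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp > 0x7f) {
      hits.push({
        index,
        char: ch,
        codePoint: "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
        utf8Bytes: encoder.encode(ch).length,
      });
    }
    index += ch.length; // 1 for BMP characters, 2 for supplementary ones
  }
  return hits;
}

// "It's" with a curly apostrophe, a CJK ideograph, and an emoji.
const hits = findNonAscii("It\u2019s \u4E2D \u{1F600}");
for (const h of hits) {
  console.log(h.index, h.codePoint, h.utf8Bytes + " bytes");
}
// 2 U+2019 3 bytes
// 5 U+4E2D 3 bytes
// 7 U+1F600 4 bytes
```

A check like this is handy before sending text to a parser or database that only expects ASCII quotes or a single-byte character set.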

Common pitfalls

  • Confusing "Unicode code point" with "UTF-8 byte value". U+00E9 (é) is one code point but encodes as two UTF-8 bytes (0xC3 0xA9). Always check the byte sequence separately from the code point number.

  • Assuming every JavaScript string character is one code point. JS strings are UTF-16, so supplementary characters (U+10000+) have .length 2 (surrogate pair). Use for...of or Array.from to iterate by real code points.

  • Using \uXXXX escape for supplementary plane characters in JavaScript. \uXXXX only handles U+0000–U+FFFF. For emoji and other high code points, use \u{1F600} (the code point escape added in ES2015/ES6) or the explicit surrogate pair (\uD83D\uDE00).
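
The three pitfalls above can be checked directly in any modern JavaScript runtime. This is a small sketch using only built-in language features; the variable names and the `countCodePoints` helper are illustrative.

```javascript
// Pitfall 1: one code point, but two UTF-8 bytes.
const eAcuteBytes = new TextEncoder().encode("\u00E9"); // bytes C3 A9

// Pitfall 2: .length counts UTF-16 code units, not code points.
const grin = "\u{1F600}";                   // code point escape (ES2015+)
const units = grin.length;                  // 2 (a surrogate pair)
const codePoints = Array.from(grin).length; // 1

// Iterating with for...of also walks by code point.
function countCodePoints(str) {
  let n = 0;
  for (const _ of str) n++;
  return n;
}

// Pitfall 3: \uXXXX needs an explicit surrogate pair above U+FFFF.
const grinPair = "\uD83D\uDE00";
const sameString = grin === grinPair;       // true

console.log(eAcuteBytes.length, units, codePoints, sameString);
// 2 2 1 true
console.log(countCodePoints("a" + grin + "b")); // 3
```

Note that code points are still not the same as user-perceived characters: sequences such as flags or emoji with skin-tone modifiers span several code points.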

Privacy

All analysis runs entirely in your browser using the built-in TextEncoder API and JavaScript's Unicode property escapes. The text you paste or any code point you enter is never sent to any server and is not stored anywhere. The URL state encodes your input in the query string for shareability — avoid sharing links if your input contains sensitive identifiers.
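
To show what a lookup based on Unicode property escapes can look like, here is a minimal sketch. It is an assumption about the general approach, not the tool's real code; `generalCategory` and `isScript` are hypothetical helpers, and the results depend on the Unicode version built into the JavaScript engine.

```javascript
// The 30 two-letter General_Category values defined by Unicode.
const CATEGORIES = [
  "Lu", "Ll", "Lt", "Lm", "Lo", "Mn", "Mc", "Me", "Nd", "Nl", "No",
  "Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po", "Sm", "Sc", "Sk", "So",
  "Zs", "Zl", "Zp", "Cc", "Cf", "Cs", "Co", "Cn",
];

// Hypothetical helper: tests one code point against each category
// with a \p{gc=...} property escape (requires the "u" flag).
function generalCategory(ch) {
  for (const gc of CATEGORIES) {
    if (new RegExp("^\\p{gc=" + gc + "}$", "u").test(ch)) return gc;
  }
  return null; // not a single code point
}

// Hypothetical helper: checks the Script property, e.g. "Latin" or "Han".
function isScript(ch, script) {
  return new RegExp("^\\p{Script=" + script + "}$", "u").test(ch);
}

console.log(generalCategory("A"));         // Lu
console.log(generalCategory("5"));         // Nd
console.log(generalCategory("\u2019"));    // Pf
console.log(generalCategory("\u{1F600}")); // So
console.log(isScript("\u4E2D", "Han"));    // true
```

Everything here runs in the page itself, which is consistent with the claim that no text leaves the browser.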

Made by Toolora · 100% client-side · Updated 2026-07-01