UUID v4 vs v7: Which Format Your Database Actually Wants

A practical guide to UUID formats: when v4 random beats v7 time-ordered, the real collision math behind 122 random bits, why v1 has fallen out of favor, and how to pick keys that won't fragment your index.

Published By 李雷
#uuid #database #primary-keys #developer-tools

I once watched a write-heavy Postgres table slow to a crawl over three months, and the culprit was the primary key. Someone had reached for uuid_generate_v4() because "UUIDs are unique," and every insert was landing at a random spot in the B-tree index. The fix was not more hardware. It was choosing the right UUID version. This guide walks through the formats that matter — v4, v7, v1, NIL, and Short — so you pick the right one before it costs you an index rebuild.

The five formats worth knowing

A UUID is 128 bits, usually shown as 36 characters: 8-4-4-4-12 hex digits with hyphens. The version digit (the first character of the third group) tells you how those bits were filled.

  • v4 (random): 122 bits of cryptographic randomness, 6 bits reserved for version and variant. No structure, no order. This is what crypto.randomUUID() returns in the browser.
  • v7 (time-ordered): The first 48 bits are a big-endian millisecond Unix timestamp; the rest is random apart from the 6 version and variant bits. New values sort chronologically.
  • v1 (timestamp + MAC): Encodes a 100-nanosecond timestamp and the host's MAC address. Largely retired — more on that below.
  • NIL: All zeros, 00000000-0000-0000-0000-000000000000. An explicit "no value" sentinel.
  • Short: Not a spec version, just an 8-character hex string (32 bits) for compact demo IDs.
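
The formats in the list above can be sketched with Python's standard uuid module. This is a minimal illustration, not the generator's own code: version_digit is a hypothetical helper, and building a Short ID from the first 8 hex characters of a v4 is an assumption about how such IDs are typically made.

```python
import uuid

def version_digit(text: str) -> str:
    """Return the first character of the third hyphen-separated group."""
    return text.split("-")[2][0]

random_id = uuid.uuid4()          # v4: 122 random bits
nil_id = uuid.UUID(int=0)         # NIL: all 128 bits zero
short_id = uuid.uuid4().hex[:8]   # "Short" (assumed construction): 8 hex chars, 32 bits

print(random_id, "version digit:", version_digit(str(random_id)))
print(nil_id)
print(short_id)
```

The version digit printed for the v4 value is always 4, while the NIL value has no meaningful version at all.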

You can generate any of these in the UUID generator, batch up to 1000 at a time, and toggle hyphens off when you want a clean 32-character value.

v4 vs v7: the decision that affects your index

The short rule: use v4 when insertion order does not matter, use v7 when it does.

Reach for v4 for user IDs, API tokens, session identifiers, and idempotency keys — anything that lives in a hash map or a column you look up by exact match. The randomness is a feature: nobody can guess the next ID, and there is no timestamp leaking when the record was created.

Reach for v7 the moment a UUID becomes a primary key on a high-write table in PostgreSQL, MySQL, or SQLite. A B-tree index keeps entries sorted. When you insert random v4 values, each new row lands on an effectively random leaf page somewhere in the middle of the tree; that page has to be loaded and rewritten, and it splits once it fills up. This is write amplification, and it fragments the index over time, slowing both inserts and range scans. v7 values carry that millisecond timestamp up front, so each new row lands at or near the tail of the index, much like an auto-increment integer, but globally unique with no central counter. You get the sharding freedom of a UUID without surrendering the locality of a serial column.
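
To make the layout concrete, here is a rough sketch of how a v7-style value can be assembled by hand. The function name uuid7_sketch is hypothetical, and the code fills every non-timestamp bit with randomness; real libraries may instead use part of those bits as a counter to guarantee ordering within a single millisecond.

```python
import os
import time
import uuid
from typing import Optional

def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Assemble a v7-style UUID: 48-bit ms timestamp, version, variant, random bits."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")        # 80 random bits; 6 are overwritten below
    value = ((unix_ms & 0xFFFFFFFFFFFF) << 80) | rand   # timestamp occupies the top 48 bits
    value = (value & ~(0xF << 76)) | (0x7 << 76)        # version nibble = 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)        # variant bits = 10
    return uuid.UUID(int=value)

# IDs minted in later milliseconds always sort after earlier ones.
ids = [uuid7_sketch(1_700_000_000_000 + i) for i in range(5)]
print(ids == sorted(ids))
```

Because the timestamp sits in the most significant bits, sorting the values as integers, bytes, or hex strings gives the same chronological order, which is the property a B-tree benefits from.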

A neat bonus: because the first 48 bits of v7 are a real timestamp, you can read the creation time back out. Drop the hyphens, take the first 12 hex characters, and parse them as a hexadecimal integer. 0190b8e5-7c00-7000-8000-000000000000 starts with 0190b8e57c00, which is 1,721,088,965,632 milliseconds since the Unix epoch, or 2024-07-16 00:16:05 UTC. Handy for rough debugging when you don't have a created_at column. v4 carries none of this; those 48 bits are pure noise.
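
The extraction described above fits in a few lines. This is a minimal sketch; uuid7_unix_ms is a hypothetical helper name, and it assumes the input really is a v7 value (it does not check the version digit).

```python
from datetime import datetime, timezone

def uuid7_unix_ms(text: str) -> int:
    """Parse the first 48 bits (12 hex characters) as a millisecond Unix timestamp."""
    hex_digits = text.replace("-", "")
    return int(hex_digits[:12], 16)

ms = uuid7_unix_ms("0190b8e5-7c00-7000-8000-000000000000")
# Whole seconds are enough for rough debugging, so the milliseconds are dropped here.
created = datetime.fromtimestamp(ms // 1000, tz=timezone.utc)

print(ms)                    # 1721088965632
print(created.isoformat())   # 2024-07-16T00:16:05+00:00
```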

Why v1 fell out of favor

UUID v1 was the original time-based design, and it has two problems that aged badly. First, it embeds the generating machine's MAC address directly in the ID. That is a privacy and security leak: it can fingerprint the physical host, and it has been used in real investigations (the author of the Melissa worm was traced partly through a v1 UUID). Second, uniqueness depends on the generator handling its clock sequence correctly. If the clock steps backward, or two processes share a node ID without coordinating, UUIDs minted in the same 100-nanosecond tick can collide.
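
The MAC leak is easy to see with Python's standard uuid module. In this sketch the node value is a made-up address passed in explicitly, so the example does not expose a real one; by default uuid1() would try to use the host's actual hardware address.

```python
import uuid

# Hypothetical MAC address, supplied explicitly for illustration.
FAKE_MAC = 0x001122334455

leaky = uuid.uuid1(node=FAKE_MAC, clock_seq=0)

print(leaky)                  # the last group is 001122334455
print(f"{leaky.node:012x}")   # the node, recovered from the ID alone
```

Anyone who sees the ID can read the node back out of the last 12 hex characters, with no access to the generating machine.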

In 2024 the IETF published RFC 9562, which obsoletes the original RFC 4122, adds v6, v7, and v8, and recommends v7 over v1 and v6 wherever possible. v4 remains the standard choice for purely random IDs. If you are starting fresh today, there is little reason to pick v1.

Will UUIDs collide? The actual math

This is the question that makes people nervous about dropping auto-increment. The honest answer: collisions are not a practical concern, and the numbers are reassuring once you see them.

A v4 UUID has 122 bits of randomness (RFC 4122 spends the other 6 bits on the version and variant fields). The birthday bound says you need roughly 1.18 × 2^61 ≈ 2.7 × 10^18 UUIDs before the collision probability hits 50%. At a sustained one billion inserts per second, reaching that pile would take about 86 years. For v7, up to 74 bits are random within each millisecond window (62 if the implementation spends the 12-bit rand_a field on a counter or extra timestamp precision). Even in the 62-bit case you'd need about 2.5 billion v7 UUIDs inside a single millisecond before a collision became likely. No real workload comes close. Add a UNIQUE constraint as a seatbelt and move on.

The format that genuinely can collide is Short UUID. With only 32 bits, the birthday bound reaches 50% near 77,000 IDs — not billions. That is fine for a URL-shortener demo or a throwaway lookup key, but never use it as a primary key at scale.
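
The figures in this section come from the standard birthday approximation, n ≈ sqrt(2 · 2^bits · ln(1 / (1 − p))). Here is a small sketch for checking them yourself; the function name is hypothetical, and the v7 rows assume either all 74 non-timestamp bits are random or only the 62-bit tail is.

```python
import math

def ids_for_collision_probability(random_bits: int, p: float = 0.5) -> float:
    """Approximate count of IDs at which the collision probability reaches p."""
    return math.sqrt(2 * (2 ** random_bits) * math.log(1 / (1 - p)))

cases = [
    ("v4, 122 bits", 122),
    ("v7 per millisecond, 74 bits", 74),
    ("v7 per millisecond, 62 bits", 62),
    ("Short, 32 bits", 32),
]
for label, bits in cases:
    print(f"{label}: {ids_for_collision_probability(bits):.3g}")

# Time to reach the v4 bound at one billion inserts per second.
seconds = ids_for_collision_probability(122) / 1e9
print(f"years at 1e9 inserts/s: {seconds / (365.25 * 86400):.0f}")
```

The gap between the Short row and every other row is the whole argument: tens of thousands of IDs versus billions or more.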

A real example and storage tips

Here is a single v4 UUID straight out of the generator:

3f29c4d7-9b1e-4a6c-bf08-2d5e7a91c004

Notice the 4 opening the third group (the version) and the b opening the fourth (the variant). Strip the hyphens and you get a clean 3f29c4d79b1e4a6cbf082d5e7a91c004 — 32 characters, ideal for filenames, cache keys, and compact URLs.

On storage: the hyphens are display-only. In a native uuid column (PostgreSQL), the value is 16 bytes on disk whether or not you strip them. If you store UUIDs as text/varchar, the canonical 36-character form costs 36 bytes, the hyphen-free form 32, and a true BINARY(16) column just 16. For high-volume tables, store the raw 16 bytes and format on read.
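
The three sizes can be checked directly with Python's standard uuid module. This is only an illustration of the representations, not of any particular database driver.

```python
import uuid

u = uuid.UUID("3f29c4d7-9b1e-4a6c-bf08-2d5e7a91c004")

canonical = str(u)   # hyphenated text form
compact = u.hex      # hyphens stripped
raw = u.bytes        # raw bytes, what a binary column would hold

print(len(canonical), len(compact), len(raw))   # 36 32 16

# Every form round-trips to the same value, so formatting on read loses nothing.
print(uuid.UUID(bytes=raw) == u, uuid.UUID(compact) == u)
```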

When you need many at once — say 500 fixture IDs for a load test — batch-generate them, strip hyphens, and download as a .txt file rather than hand-rolling a uuidgen shell loop. If you want even shorter, non-spec IDs for client-facing slugs, compare against a nanoid generator, which trades the UUID layout for a tighter, URL-safe alphabet.

Pick v7 for ordered keys, v4 for everything random, and keep Short for demos. Get that one choice right and your index will thank you for years.


Made by Toolora · Updated 2026-06-13