How to Extract JWT Tokens From Logs, Headers, and Messy Text
A practical guide to extracting JWT tokens from logs, HAR files, and request dumps by their three-dot base64url shape, then deduping and decoding them safely.
When an auth bug lands on my desk, the token is almost never sitting neatly on its own line. It is buried in a 4,000-line request log, glued to an Authorization: Bearer prefix, wrapped inside a JSON body, or copied out of a browser HAR export with three layers of escaping around it. The first job is always the same: find the token in the noise, pull it out cleanly, and only then start decoding and reasoning about it.
That sounds tedious, and by hand it is. But a JWT has a very recognizable fingerprint, and once you know what that fingerprint looks like you can fish every token out of a wall of text in one pass.
The three-dot shape that gives a JWT away
A JSON Web Token in its usual signed form is three base64url segments joined by dots: header.payload.signature. That structure is what makes extraction reliable. The header and payload are base64url-encoded JSON, the third segment is a base64url-encoded MAC or digital signature, and the two dots between them are the giveaway. The base64url alphabet is A-Z, a-z, 0-9, plus - and _, and JWTs omit the trailing = padding, so a token reads as a long run of those characters interrupted by exactly two dots.
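As a minimal sketch of that shape in Python, a regular expression can describe three runs of base64url characters separated by two dots. The names SEGMENT, JWT_SHAPE, and find_jwt_shaped are illustrative choices, not the extractor's actual code.

```python
import re

# base64url alphabet: A-Z, a-z, 0-9, "-" and "_"; no "=" padding expected.
SEGMENT = r"[A-Za-z0-9_-]+"

# Three non-empty segments joined by two dots. The lookbehind stops a match
# from starting in the middle of a longer base64url run.
JWT_SHAPE = re.compile(rf"(?<![A-Za-z0-9_-]){SEGMENT}\.{SEGMENT}\.{SEGMENT}")


def find_jwt_shaped(text: str) -> list[str]:
    """Return every substring that has the three-segment shape."""
    return JWT_SHAPE.findall(text)


sample = "status=200 bearer=aaa.bbb.ccc latency=14ms"
print(find_jwt_shaped(sample))  # ['aaa.bbb.ccc']
```

Shape alone is permissive: a version string like 1.2.3 or a hostname with two dots also fits, so a shape match is a candidate rather than proof. Checking that the first two segments decode to JSON narrows it down.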
That is the pattern the JWT Token Extractor hunts for. It scans your pasted text or an uploaded local file, finds every chunk that matches the three-segment base64url shape, and pulls each one out while dropping the surrounding prose, timestamps, log levels, and markup. The result is a deduplicated table: one row per unique token, with the source line number, the normalized value, a validity flag, and a reason. A two-segment string or a chunk with an illegal character does not silently vanish; it gets listed with the reason so you can tell a real token from log garbage that only looked like one.
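To make the "flag it, don't drop it" idea concrete, here is a rough sketch of a scan that produces such rows. It is an assumption about how one could build this, not the JWT Token Extractor's real implementation: the Row fields, the CANDIDATE rule (anything starting with eyJ), and the reason strings are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical candidate rule: anything that starts like a base64url-encoded
# JSON object ("eyJ") and runs until whitespace, a quote, or a separator.
CANDIDATE = re.compile(r"(?<![A-Za-z0-9_-])eyJ[^\s\"'<>,;\\]*")
BASE64URL = re.compile(r"[A-Za-z0-9_-]+")


@dataclass
class Row:
    line: int    # 1-based line where the value first appeared
    value: str
    valid: bool
    reason: str


def classify(value: str) -> tuple[bool, str]:
    parts = value.split(".")
    if len(parts) != 3:
        return False, f"expected 3 segments, found {len(parts)}"
    for name, part in zip(("header", "payload", "signature"), parts):
        if not BASE64URL.fullmatch(part):
            return False, f"{name} is empty or has a non-base64url character"
    return True, "three base64url segments"


def extract(text: str) -> list[Row]:
    rows: dict[str, Row] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in CANDIDATE.finditer(line):
            value = match.group(0)
            if value not in rows:  # keep the first occurrence only
                rows[value] = Row(lineno, value, *classify(value))
    return list(rows.values())


log = "\n".join([
    "req=1 bearer=eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiI0MiJ9.c2lnbmF0dXJl",
    "req=2 bearer=eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiI0MiJ9",
    "req=3 bearer=eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiI0MiJ9.c2lnbmF0dXJl",
])
for row in extract(log):
    print(row.line, row.valid, row.reason)
```

The sample tokens are made up. The two-segment value on the second line stays in the output with its reason, and the repeat on the third line collapses into the first row.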
A worked example: one log line, one token
Here is a single line straight out of an access log:
2026-06-13T09:41:22Z INFO auth.middleware req_id=8c21 user=42 status=200 bearer=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiI0MiIsIm5hbWUiOiJMaSBMZWkiLCJpYXQiOjE3ODA5MTI0ODJ9.3Vq8mP_kU2nW7yR0fJ4tLb6cQxZ1aH9sD2gK5eN0oI latency=14ms
Everything in that line except the token is noise for this task. Feed the line in, and the extractor reduces it to exactly:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiI0MiIsIm5hbWUiOiJMaSBMZWkiLCJpYXQiOjE3ODA5MTI0ODJ9.3Vq8mP_kU2nW7yR0fJ4tLb6cQxZ1aH9sD2gK5eN0oI
The bearer= prefix, the request id, the latency, the timestamp all fall away. Paste the whole log file and you get every token at once, one per row, duplicates collapsed. Decode that payload segment and it reads {"sub":"42","name":"Li Lei","iat":1780912482}, which tells you the request was authenticated as user 42, exactly the kind of confirmation you want when chasing a 200 that should have been a 403.
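Decoding the header and payload needs nothing beyond the standard library. The sketch below uses the token from the log line above; decode_segment is an illustrative helper name, and the only subtlety is restoring the = padding that JWTs strip.

```python
import base64
import json


def decode_segment(segment: str) -> dict:
    """Decode one base64url JWT segment (header or payload) into a dict."""
    padded = segment + "=" * (-len(segment) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


token = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
    ".eyJzdWIiOiI0MiIsIm5hbWUiOiJMaSBMZWkiLCJpYXQiOjE3ODA5MTI0ODJ9"
    ".3Vq8mP_kU2nW7yR0fJ4tLb6cQxZ1aH9sD2gK5eN0oI"
)
header_b64, payload_b64, _signature = token.split(".")

print(decode_segment(header_b64))   # {'alg': 'HS256', 'typ': 'JWT'}
print(decode_segment(payload_b64))  # {'sub': '42', 'name': 'Li Lei', 'iat': 1780912482}
```

This only reads the claims. It does not verify the signature, so it says nothing about whether the token is genuine or still accepted by the server.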
Where the tokens actually live
In practice the tokens come from a handful of recurring sources, and the extractor is built for all of them:
- Auth middleware logs. A full day of access logs can carry thousands of Bearer tokens. You want them deduplicated, because the same session token repeats on every request.
- HAR exports. Open the network tab, save the session as a HAR, and every request header and JSON body is in one file. The tokens are wrapped in escaped JSON, but the three-dot shape survives the escaping, so they still extract cleanly.
- Request dumps and curl transcripts. Copied straight from a terminal, complete with -H 'Authorization: Bearer ...' and surrounding shell quoting.
- Support tickets and chat pastes. A user pastes "here's what I got back" and the token is sitting in the middle of a sentence.
The point is that you should not have to reformat any of these before extraction. Paste the raw artifact, let the parser do the shape-matching, and read off the clean list.
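A small sketch shows why no reformatting is needed: the same pattern pulls the token out of an escaped-JSON HAR fragment, a curl line, and a chat sentence. The artifacts, the example hostname, and the sample token are invented for illustration, and anchoring on eyJ assumes the header is a JSON object, which is the typical case.

```python
import re

# Assumption: the header is a JSON object, so its encoding starts with "eyJ".
# Anchoring on it keeps hostnames such as api.example.test out of the results.
TOKEN = re.compile(
    r"(?<![A-Za-z0-9_-])eyJ[A-Za-z0-9_-]*\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"
)

jwt = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiI0MiJ9.c2lnbmF0dXJl"  # made-up sample

artifacts = {
    # HAR-style body: JSON nested inside a JSON string, so quotes are escaped.
    "har": r'"text": "{\"access_token\":\"' + jwt + r'\"}"',
    # curl transcript with shell quoting and a URL that also contains dots.
    "curl": "curl -H 'Authorization: Bearer " + jwt + "' https://api.example.test/v1/me",
    # Token pasted into the middle of a sentence.
    "chat": "here's what I got back " + jwt + " and then it failed",
}

for source, raw in artifacts.items():
    print(source, TOKEN.findall(raw))
```

Backslashes, quotes, and spaces all fall outside the base64url alphabet, so they act as natural boundaries around the token.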
Dedupe, then decide what to keep
A real log re-sends the same token over and over, so the raw matches are full of duplicates. The extractor collapses identical tokens into a single row by default, which is usually what you want when you are asking "how many distinct sessions are in this log?" rather than "how many requests fired?" If you specifically need the count of occurrences or the full unfiltered list, you can keep every row instead.
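The two questions map onto two small operations. In this sketch the matches list is made-up data standing in for raw extraction output:

```python
from collections import Counter

matches = ["tok.a.1", "tok.b.2", "tok.a.1", "tok.a.1"]  # made-up raw matches

# Distinct tokens, first-seen order preserved (dicts keep insertion order).
unique = list(dict.fromkeys(matches))

# Occurrences per token, for the "how many requests fired?" question.
counts = Counter(matches)

print(unique)                 # ['tok.a.1', 'tok.b.2']
print(counts.most_common())   # [('tok.a.1', 3), ('tok.b.2', 1)]
```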
This is also where the supporting tools earn their place. Once you have a clean list, the JWT Token List Validator checks each token's structure in bulk so you can separate well-formed tokens from the near-misses, and the JWT Token Deduplicator is the right tool when deduping is the whole job and you do not need the surrounding extraction step. Extraction, validation, and deduplication are deliberately split so you can run only the stage you need.
Treat the output as sensitive
Here is the caution that matters more than any feature. A JWT is a credential. The signature segment is what makes it valid, and anyone holding the full token can replay it until it expires. That changes how you handle the extractor's output.
Decoding the header and payload locally to read the claims is safe and routine; that is just base64url, not decryption, and it reveals nothing the token holder did not already carry. What is not safe is letting that token leave your control. Do not paste a live production token into a random online decoder that round-trips it to a server. Do not drop the extracted list into a shared doc, a ticket comment, or a Slack channel without thinking about who can read it. If you need to attach a token to a bug report, redact the signature or pull an expired sample instead of a live one.
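Redacting the signature before sharing can be as small as the sketch below. The function name and the REDACTED placeholder are illustrative choices, not a standard.

```python
def redact_signature(token: str, placeholder: str = "REDACTED") -> str:
    """Replace the signature segment so the token can no longer be replayed.

    Raises ValueError if the input does not have exactly three segments.
    """
    header, payload, _signature = token.split(".")
    return f"{header}.{payload}.{placeholder}"


sample = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiI0MiJ9.c2lnbmF0dXJl"  # made-up token
print(redact_signature(sample))
# eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiI0MiJ9.REDACTED
```

The claims stay readable, which is the point for a bug report, so check that they contain nothing you would not share.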
This is exactly why the extraction runs entirely in your browser tab. The scan that fishes the three-segment tokens out of your log text never sends that text anywhere. Uploaded files are read locally with the File API, not posted to a server. Because tokens are such a sensitive category, the value is masked in the output while the validation signal stays visible, so you can confirm shape and structure without parading the secret across your screen. And because the work is local, the practical limit is your machine: a few megabytes scan almost instantly, and for a multi-gigabyte log you should grep the token lines out locally first and paste only those.
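For the multi-gigabyte case, the local pre-filter can be a streaming pass that never loads the whole file. This is a sketch under the assumption that token lines contain the eyJ prefix; token_lines is a hypothetical helper, and the in-memory log below stands in for a real file handle.

```python
import io
from typing import Iterable, Iterator


def token_lines(lines: Iterable[str], marker: str = "eyJ") -> Iterator[str]:
    """Yield only the lines that contain the marker, one line at a time."""
    for line in lines:
        if marker in line:
            yield line


# Stand-in for a large log; a real run would iterate over an open file
# object instead, which also yields one line at a time.
fake_log = io.StringIO(
    "INFO healthcheck ok\n"
    "INFO auth bearer=eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiI0MiJ9.c2lnbmF0dXJl\n"
    "WARN cache miss\n"
)
kept = list(token_lines(fake_log))
print(len(kept), "line(s) kept")  # 1 line(s) kept
```

Because the generator holds one line in memory at a time, the filtered output is usually small enough to paste.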
The workflow, start to finish
The loop I run for an auth investigation is short: grab the raw artifact (log slice, HAR, request dump), paste it into the extractor, read off the deduplicated token list, decode the one or two tokens that matter to inspect their claims, and then close the tab so nothing lingers. Knowing the three-dot base64url fingerprint is what makes the first step trustworthy: you are not eyeballing a thousand lines hoping to spot a token, you are letting a precise shape-match do it, with invalid near-misses flagged rather than dropped. The rest is just careful handling of something that happens to be a live credential.
Made by Toolora · Updated 2026-06-13