JSON, YAML, and TOML Conversion Guide: What Changes, What Breaks, and Why
A practical guide to converting config files between JSON, YAML, and TOML — real before-and-after examples, the gotchas that silently break conversions, and which format to start with.
Converting a config file from YAML to JSON sounds trivial — paste it in, hit convert, done. In practice, you'll hit at least one of three problems: comments disappear, datetime values quietly change type, or the output silently misrepresents the original structure. I've converted enough project configs between these three formats to catalogue the exact traps and the cases where "successful conversion" still produces wrong application behavior.
The Same Config in All Three Formats
Start concrete. Here is a database connection config written in each format so the differences are visible rather than abstract:
YAML (database.yaml):
# Production database — rotate credentials quarterly
database:
  host: db.prod.internal
  port: 5432
  name: myapp
  ssl: true
  connect_timeout: 5
updated_at: 2024-01-15T09:30:00Z
JSON (converted from the YAML above):
{
  "database": {
    "host": "db.prod.internal",
    "port": 5432,
    "name": "myapp",
    "ssl": true,
    "connect_timeout": 5
  },
  "updated_at": "2024-01-15T09:30:00Z"
}
TOML (the same config again):
# Production database — rotate credentials quarterly
updated_at = 2024-01-15T09:30:00Z
[database]
host = "db.prod.internal"
port = 5432
name = "myapp"
ssl = true
connect_timeout = 5
Three differences surface immediately. The JSON version lost the comment entirely — JSON has no comment syntax. The updated_at value is the string "2024-01-15T09:30:00Z" in JSON but a native datetime in TOML. And TOML requires top-level scalar keys to appear before their section tables, so updated_at had to move above [database].
What Gets Lost When You Convert to JSON
Comments. This is permanent. JSON has no comment syntax by design: Douglas Crockford removed comments from the format after seeing people use them to carry parsing directives, which would have undermined interoperability (a decision that every developer who has maintained a package.json has opinions about). When you run a YAML-to-JSON converter on a config with 20 lines of operational notes, such as "do not change this timeout without load testing" or "rotate this key every quarter", all of them vanish.
Native datetime types. In YAML, an unquoted 2024-01-15 is parsed as a datetime.date object by Python's PyYAML, and as an equivalent typed value by other loaders that implement the YAML 1.1 timestamp type. In JSON, it's the string "2024-01-15". If application code compares that field to datetime.date.today(), the JSON version breaks at runtime in a way that's hard to trace back to the config format migration.
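The failure mode looks roughly like this. A minimal sketch using only the standard library: the expires field, is_expired, DATE_FIELDS, and restore_dates are hypothetical names, and from_yaml is hand-built to match what a YAML loader such as PyYAML returns for an unquoted date.

```python
import datetime
import json

# What a YAML loader such as PyYAML hands back for `expires: 2024-01-15`.
from_yaml = {"expires": datetime.date(2024, 1, 15)}

# What json.loads returns once the same config has been converted to JSON.
from_json = json.loads('{"expires": "2024-01-15"}')

def is_expired(config, today):
    """Application code written against the YAML version of the config."""
    return config["expires"] < today

today = datetime.date(2024, 6, 1)
print(is_expired(from_yaml, today))  # True

try:
    is_expired(from_json, today)
except TypeError as exc:
    # str and date cannot be ordered against each other.
    print(f"JSON version fails: {exc}")

# One possible repair: re-type the known date fields right after loading.
DATE_FIELDS = {"expires"}  # hypothetical: whichever keys your app treats as dates

def restore_dates(config):
    """Return a copy of config with ISO date strings turned back into dates."""
    return {
        key: datetime.date.fromisoformat(value)
        if key in DATE_FIELDS and isinstance(value, str)
        else value
        for key, value in config.items()
    }

print(is_expired(restore_dates(from_json), today))  # True
```

The explicit DATE_FIELDS set is a deliberate choice in this sketch: guessing which strings "look like" dates would reintroduce the implicit typing that made the migration risky in the first place.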
YAML anchors. YAML lets you define a block once and reference it elsewhere:
defaults: &defaults
  pool_size: 5
  timeout: 30
production:
  <<: *defaults
  host: prod.db.internal
staging:
  <<: *defaults
  host: staging.db.internal
Converting this to JSON expands every anchor into its own copy. The JSON output will contain correct data, but the shared-defaults structure is gone. Future edits to pool_size must touch every copy (defaults, production, and staging) instead of one line, and the next person reading the JSON has no hint that the duplication was intentional.
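If all you have left is the expanded JSON, you can at least surface where the duplication lives. A rough sketch, assuming every top-level value is a section dict as in the example above; shared_settings is a hypothetical helper, and the JSON literal is what a YAML-to-JSON conversion of that snippet would typically produce.

```python
import json

# The anchor example after conversion: every merge has been expanded in place.
expanded = json.loads("""
{
  "defaults":   {"pool_size": 5, "timeout": 30},
  "production": {"pool_size": 5, "timeout": 30, "host": "prod.db.internal"},
  "staging":    {"pool_size": 5, "timeout": 30, "host": "staging.db.internal"}
}
""")

def shared_settings(config):
    """Return key/value pairs that appear, with equal values, in every section."""
    sections = [value for value in config.values() if isinstance(value, dict)]
    if not sections:
        return {}
    first, rest = sections[0], sections[1:]
    return {
        key: value
        for key, value in first.items()
        if all(key in other and other[key] == value for other in rest)
    }

print(shared_settings(expanded))  # {'pool_size': 5, 'timeout': 30}
```

Anything this reports is a candidate for a "keep in sync" note in the generated file, or a reason to keep the YAML as the source of truth.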
The YAML Security Problem (CVE-2017-18342)
One reason to prefer JSON for machine-generated configs is that YAML's flexibility is also an attack surface. In Python, the yaml.load() function — without an explicit Loader argument — allowed arbitrary code execution via YAML constructor tags. An attacker who could influence a config file parsed with the wrong call could run arbitrary Python. This earned CVE-2017-18342 and required every Python project using PyYAML below version 5.1 to update.
The fix is always yaml.safe_load(), but the unsafe version remained the default API for years. JSON parsers have no equivalent risk: json.loads() parses data, never code. TOML is similarly safe — it has no tag or callable constructor mechanism.
This does not mean YAML is unusable. It means every YAML consumer in your codebase needs to explicitly call safe_load or its equivalent, and that invariant needs to survive team turnover and library upgrades.
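One way to make that invariant survive turnover is to check it mechanically. The sketch below uses Python's ast module to flag yaml.load(...) calls that do not pass SafeLoader or CSafeLoader. It is an illustration under assumptions, not a complete linter: find_unsafe_yaml_loads is a hypothetical helper, and it only recognises the literal spelling yaml.load, so aliased imports slip through.

```python
import ast

SAFE_LOADERS = {"SafeLoader", "CSafeLoader"}

def _loader_name(node):
    """Best-effort name of a Loader expression such as yaml.SafeLoader."""
    if isinstance(node, ast.Attribute):
        return node.attr
    if isinstance(node, ast.Name):
        return node.id
    return None

def find_unsafe_yaml_loads(source):
    """Return sorted line numbers of yaml.load calls without a safe Loader."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "yaml"
        ):
            continue
        # Loader may be passed by keyword or as the second positional argument.
        loader = next((kw.value for kw in node.keywords if kw.arg == "Loader"), None)
        if loader is None and len(node.args) >= 2:
            loader = node.args[1]
        if _loader_name(loader) not in SAFE_LOADERS:
            flagged.append(node.lineno)
    return sorted(flagged)

SAMPLE = (
    "import yaml\n"
    "a = yaml.load(stream)\n"
    "b = yaml.safe_load(stream)\n"
    "c = yaml.load(stream, Loader=yaml.SafeLoader)\n"
    "d = yaml.load(stream, Loader=yaml.Loader)\n"
)

print(find_unsafe_yaml_loads(SAMPLE))  # [2, 5]
```

The source is only parsed, never executed, so a check like this can run in CI against files that import libraries the CI image does not have installed.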
Converting Without Losing Data
I ran this workflow on a 140-line Kubernetes values.yaml that used YAML anchors for shared replica and resource settings. A naive paste into a converter dropped 14 comments and expanded every anchor, turning a 140-line file into a 230-line JSON document where edits must now happen in five places instead of two. Technically valid; operationally worse.
The approach that works:
YAML → JSON: Accept that comments will go. Decide whether anchor expansion matters before converting — if the YAML used anchors for DRY, annotate the generated JSON file as "auto-generated, edit the source YAML." The YAML to JSON converter on Toolora shows the input and output side by side, so the structural differences are visible before you commit anything.
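JSON → TOML: This direction is usually the cleanest because JSON is the most restricted format: no comments, no tags, no anchors. Almost every JSON document has a TOML representation, with three edge cases. TOML has no null, so a field like "proxy": null must be dropped or given a sentinel value. The root of a TOML document is always a table, so a JSON file whose top level is an array or a scalar needs a wrapper key. And mixed-type arrays such as ["localhost", 5432] are legal since TOML 1.0 but rejected by parsers that still follow TOML 0.5, so splitting them into typed fields remains the portable choice. Use the JSON to TOML converter for one-shot conversion, then add comments and section headers for human readability.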
TOML → JSON: You lose comments and native datetime types. The TOML formatter and converter can lint your TOML first to catch malformed input before the conversion produces misleading output — TOML is strict about types and will surface errors at parse time rather than silently accepting wrong values.
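For the JSON → TOML direction, a pre-flight check on the parsed JSON catches the awkward cases before any converter runs. A minimal sketch: toml_blockers is a hypothetical helper, and it flags mixed-type arrays like the ["localhost", 5432] example along with two other things TOML cannot express directly, null values and a non-object root.

```python
import json

def toml_blockers(value, path="$"):
    """Yield (path, reason) pairs for JSON values that need attention before TOML."""
    if path == "$" and not isinstance(value, dict):
        yield path, "root must be an object (a TOML document is a table)"
    if value is None:
        yield path, "null has no TOML equivalent"
    elif isinstance(value, dict):
        for key, child in value.items():
            yield from toml_blockers(child, f"{path}.{key}")
    elif isinstance(value, list):
        # Compare element types by name; nulls are reported on their own below.
        kinds = {type(item).__name__ for item in value if item is not None}
        if len(kinds) > 1:
            yield path, "mixed-type array (rejected by TOML 0.5 parsers)"
        for index, child in enumerate(value):
            yield from toml_blockers(child, f"{path}[{index}]")

doc = json.loads('{"endpoint": ["localhost", 5432], "proxy": null, "ports": [80, 443]}')
for where, why in toml_blockers(doc):
    print(f"{where}: {why}")
# $.endpoint: mixed-type array (rejected by TOML 0.5 parsers)
# $.proxy: null has no TOML equivalent
```

An empty result does not prove the conversion is right; it only means none of these structural mismatches is present.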
Choosing the Starting Format for New Projects
Rather than convert later, pick correctly upfront:
JSON when the config is machine-generated (CI artifacts, API responses, build tool output), when you need JSON Schema validation, or when multiple languages consume the file and you want the widest parser support with a small syntax that has no implicit typing.
YAML when humans write and read the file regularly (GitHub Actions workflows, Docker Compose, Kubernetes manifests), when comments are essential documentation, and when you can guarantee safe_load in all consumers. YAML's indentation reads naturally for deeply nested config; the same structure in JSON requires significant braces-and-commas overhead.
TOML when the ecosystem expects it (Rust projects use Cargo.toml; Python projects use pyproject.toml), when you want native type support for dates and floats without any parser ambiguity, and when explicit section headers ([server], [database.pool]) make large configs easier to scan than equivalent YAML indentation.
Validate After Every Conversion
Automated conversion, such as a CI step that transforms config from one format to another, should always validate the output against your application's actual config loader, not just check that the conversion ran without error. A valid JSON document can still represent wrong data if the source used YAML anchors, native datetime fields, or comments that carried operational meaning.
Run the converted config through your application's config loading code in a test environment immediately after any format migration. The converter's job is syntactic correctness; semantic correctness is yours to verify.
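A cheap first pass before the full application test is to load both files and compare value types path by path. The sketch below assumes you already have the source and converted configs as Python objects; type_drift is a hypothetical helper, and the JSON round trip stands in for whatever converter you actually use.

```python
import datetime
import json

def type_drift(before, after, path="$"):
    """Return descriptions of paths where the converted value's type changed."""
    if type(before) is not type(after):
        return [f"{path}: {type(before).__name__} -> {type(after).__name__}"]
    drift = []
    if isinstance(before, dict):
        for key in sorted(before.keys() | after.keys(), key=str):
            if key not in before or key not in after:
                drift.append(f"{path}.{key}: present on one side only")
            else:
                drift.extend(type_drift(before[key], after[key], f"{path}.{key}"))
    elif isinstance(before, list):
        if len(before) != len(after):
            drift.append(f"{path}: length {len(before)} -> {len(after)}")
        for index, (old, new) in enumerate(zip(before, after)):
            drift.extend(type_drift(old, new, f"{path}[{index}]"))
    return drift

# Source config as a YAML or TOML loader would return it.
original = {
    "database": {"port": 5432, "ssl": True},
    "updated_at": datetime.datetime(2024, 1, 15, 9, 30, tzinfo=datetime.timezone.utc),
}

# Stand-in for the conversion: datetimes become ISO strings on the way to JSON.
converted = json.loads(json.dumps(original, default=lambda value: value.isoformat()))

print(type_drift(original, converted))  # ['$.updated_at: datetime -> str']
```

This only compares types and keys, not values, so it complements the application-level test described above and does not replace it.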
Made by Toolora · Updated 2026-07-01