Why another Elasticsearch cheat sheet?

Three reasons. (1) Each entry carries a real production pitfall: heap > 32GB falls off the compressed-oops cliff and GC pressure doubles; dynamic:true on user JSON causes mapping explosion and full GC; from + size deep paging at 10000+ pulls (from + size) docs from every shard and blows heap; default 1s refresh on a log workload spends most CPU on segment creation; disk over 95% locks every index on the node to read-only; a term query on a text field returns 0 hits because text was lowercased + tokenized at index time; fielddata:true on text fields OOMs nodes; two master-eligible nodes is the worst possible topology. (2) Search runs across command + description + example + pitfall in one box, so typing "search_after" pulls every deep-paging entry, and "聚合" (aggregation) / "分词" (tokenization) works from the Chinese side. (3) Nine categories cover everything from index lifecycle to Query DSL to aggregations to analyzers to _cat APIs to snapshots and ILM, with a dedicated pitfalls chip for the nine ways teams lose data, lose latency, or fill up disk. EN and ZH copy are independently written, not machine translated.

I am brand new to Elasticsearch — where do I start?

Filter to "Index management" first and learn six things: PUT /<index> with an explicit shard count (do not rely on the default), GET /<index> to inspect mapping + settings, POST /_aliases for zero-downtime reindex, _reindex with ?wait_for_completion=false for anything > 100k docs, _refresh and refresh_interval (the default 1s is wrong for log workloads; raise it to 30s), and PUT /_index_template so daily / weekly indices share the same mapping. Then walk "Mapping" to learn text vs keyword (string fields almost always want both via a multi-field), nested vs flattened object (use nested when "all conditions on the same array element" matters), and dynamic:strict (always, on any user-supplied JSON). After that, go to "Query DSL" for match (full-text), term (exact, on keyword), bool with filter for cacheable WHERE clauses, range with date math for time windows, and search_after for paging past 10000. Save aggregations, analyzers, and ILM / SLM for after you have shipped something with the basics.

How do I avoid the "term query returns nothing" trap?

term is an exact, NOT-analyzed query, and text fields are lowercased + tokenized at index time. So term: {"name":"Apple"} on a text field finds nothing, because the index only holds the token "apple". Three correct patterns. (1) Map strings as text with a keyword sub-field; the standard recipe is {"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}. Then query exact on name.keyword and full-text on name. (2) If you need full-text matching, use match (not term), which analyzes the query with the same analyzer as the field. (3) When unsure, run GET /<index>/_mapping to see what each field is actually mapped as, and POST /_analyze with the same analyzer to see how your query string gets tokenized. These three together prevent 90% of "elasticsearch query returns nothing" bug reports.
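The mismatch is easy to reproduce without a cluster. The sketch below uses a deliberately simplified stand-in for the standard analyzer (lowercase, then split on anything that is not an ASCII letter or digit); the real analyzer follows Unicode segmentation rules, and all function names here are hypothetical.

```python
import re

def analyze_text(value):
    """Rough approximation of what a text field indexes: lowercased word tokens."""
    return [tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok]

def term_matches(indexed_terms, query_value):
    """A term query compares the raw query value against the indexed terms."""
    return query_value in indexed_terms

def match_matches(indexed_terms, query_value):
    """A match query analyzes the query string first, then matches any token."""
    return any(tok in indexed_terms for tok in analyze_text(query_value))

text_terms = analyze_text("Apple iPhone 15")   # what the text field holds
keyword_terms = ["Apple iPhone 15"]            # what a keyword sub-field holds

print(term_matches(text_terms, "Apple"))       # exact value vs lowercased tokens
print(match_matches(text_terms, "Apple"))      # query is analyzed to "apple" first
print(term_matches(keyword_terms, "Apple iPhone 15"))
```

The keyword sub-field only matches the complete original string, which is why exact filters belong on name.keyword and free-text search on name.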

Why is my cluster always yellow?

Yellow means primaries are OK but some replicas are unassigned. Three common causes. (1) Single-node cluster with replicas > 0: replicas cannot live on the same node as their primary, so a one-node setup with the default number_of_replicas=1 is permanently yellow. Either set replicas to 0 (test only) or add a second node. (2) Disk watermark: at 85% full ES stops allocating new shards to a node, and existing unallocated replicas stay yellow. Free disk or raise the watermark for the cluster. (3) Allocation filtering / awareness: node attributes like rack / zone / hot/warm/cold can refuse a replica anywhere it would violate the rule. The diagnosis command is always GET /_cluster/allocation/explain, which tells you which shard, which node, and the exact reason in plain English. Combine with GET /_cat/shards?h=index,shard,prirep,state,unassigned.reason&v for a sweep.
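As a quick mental model for cause (2), the sketch below classifies a node by disk usage. It assumes the stock thresholds: 85% (low) and 95% (flood stage) come from this page, and 90% is the default high watermark at which ES starts moving shards off the node. Clusters can override all three, and the function names are hypothetical.

```python
# Default disk watermarks in percent used; a cluster may override them.
LOW_PCT, HIGH_PCT, FLOOD_PCT = 85.0, 90.0, 95.0

def watermark_state(disk_used_pct):
    """Return which watermark a node has crossed, highest first."""
    if disk_used_pct >= FLOOD_PCT:
        return "flood_stage"   # indices with a shard here become read-only
    if disk_used_pct >= HIGH_PCT:
        return "high"          # ES tries to relocate shards away
    if disk_used_pct >= LOW_PCT:
        return "low"           # no new shards are allocated to this node
    return "ok"

def can_receive_replica(disk_used_pct):
    """A replica can only be assigned to a node below the low watermark."""
    return watermark_state(disk_used_pct) == "ok"

for pct in (70.0, 86.0, 92.0, 96.0):
    print(pct, watermark_state(pct), can_receive_replica(pct))
```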

How do I do zero-downtime reindex when I need to change a field type?

Once a field is mapped (say as keyword), you cannot change its type in place. The canonical pattern: (1) Create the new index with the correct mapping, for example products_v2 with name as text instead of keyword. (2) POST /_reindex {"source":{"index":"products_v1"}, "dest":{"index":"products_v2"}} with ?wait_for_completion=false for any non-trivial size, then poll GET /_tasks/<id>. Tune slices=auto for parallelism. (3) During reindex, keep writing to products_v1; your app reads / writes via the "products" alias. (4) After reindex completes, run a small catch-up reindex on docs modified after step 2 started (use the @timestamp / updated_at field). (5) Atomic alias swap: POST /_aliases with one remove + one add action in the same payload, so the app never sees a moment without a "products" alias. (6) Once you have monitored products_v2 for a day, DELETE /products_v1. The full recipe is one of the most useful skills in an ES operator's toolkit.
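The request bodies for steps 2, 4, and 5 can be sketched as plain dictionaries. This is a minimal illustration, not a client: the helper names, the updated_at field, and the cutoff timestamp are assumptions, and the code only prints what you would send.

```python
import json

def reindex_body(source, dest, changed_since=None, timestamp_field="updated_at"):
    """Body for POST /_reindex; with changed_since it becomes the catch-up pass."""
    body = {"source": {"index": source}, "dest": {"index": dest}}
    if changed_since is not None:
        body["source"]["query"] = {
            "range": {timestamp_field: {"gte": changed_since}}
        }
    return body

def alias_swap_body(alias, old_index, new_index):
    """Body for POST /_aliases: remove + add in one payload, applied atomically."""
    return {"actions": [
        {"remove": {"index": old_index, "alias": alias}},
        {"add": {"index": new_index, "alias": alias}},
    ]}

steps = [
    ("POST", "/_reindex?wait_for_completion=false&slices=auto",
     reindex_body("products_v1", "products_v2")),
    # Hypothetical cutoff: the moment the first reindex was started.
    ("POST", "/_reindex",
     reindex_body("products_v1", "products_v2", changed_since="2026-05-01T00:00:00Z")),
    ("POST", "/_aliases",
     alias_swap_body("products", "products_v1", "products_v2")),
]
for method, path, body in steps:
    print(method, path, json.dumps(body))
```

Keeping the remove and add actions in one list is the important design point: sending them as two separate requests would leave a window with no "products" alias.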

How do I size shards correctly?

Two principles. (1) Each shard ≈ 10-50GB for time-series / log workloads, 20-40GB for search workloads. Smaller shards = more overhead (cluster state, query coordination); bigger shards = slower recovery and worse parallelism. (2) Total shard count per node should stay at or below roughly 20 × heap-in-GB. A node with 30GB heap should hold around 600 shards at most, not 6000. With those two numbers you can back into the right number_of_shards for any index: take your projected index size, divide by 30GB, and round up; that is your shard count. For time-series / logs, use _rollover + ILM so each rolled index hits the target size automatically. The number-one mistake is oversharding: picking 50 shards for "what if we grow" and ending up with 50 tiny shards per index spread across every node and a cluster state measured in megabytes. Default to fewer, larger shards. Reindex or _split later if you actually do grow.
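The arithmetic behind both rules fits in a few lines. The sketch below uses the 30GB target and the 20-shards-per-GB-of-heap ceiling quoted above; the function names are hypothetical and the numbers are rules of thumb, not hard limits.

```python
import math

def primary_shards_for(index_size_gb, target_shard_gb=30):
    """Projected index size divided by the target shard size, rounded up."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))

def shard_budget(heap_gb, shards_per_heap_gb=20):
    """Rough upper bound on shards a node should hold for a given heap."""
    return heap_gb * shards_per_heap_gb

print(primary_shards_for(80))    # an 80GB index
print(primary_shards_for(5))     # a small index still needs one shard
print(shard_budget(30))          # a node with 30GB heap
```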

When should I reach for aggregations vs SQL on a copy of the data?

Aggregations are the right answer when (a) the data already lives in ES, (b) the query is bucket-style (group by X, optionally by Y, with one or more metric aggs), and (c) you want sub-second response on millions to billions of docs. Use terms agg for group-by, date_histogram for time-series, range / histogram for distributions, and stats / percentiles for the metrics. They are NOT the right answer when (a) you need exact distinct counts beyond ~100k: cardinality is approximate, so switch to a database; (b) you need joins across documents: ES has no general joins (nested is per-doc, and parent-child requires each parent and its children to live on the same shard); (c) you need windowed analytics that touch many docs per row (LAG, LEAD, rolling averages over big windows), at which point you should export to a columnar store. For "rank top 10 sellers per category with running totals" mixing terms + top_hits + bucket_script gets you most of the way; beyond that, push to a real OLAP engine.
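A bucket-style request of the kind described in (b) might look like the sketch below: group by one field, then by a second, with a sum metric that also orders the inner buckets. The field names (category, seller_id, amount) and the bucket sizes are hypothetical, and the group-by fields are assumed to be keyword-mapped.

```python
import json

def top_sellers_per_category(category_field, seller_field, revenue_field, top_n=10):
    """Search body: terms on category, then top sellers ordered by summed revenue."""
    return {
        "size": 0,  # only aggregation results, no hits
        "aggs": {
            "by_category": {
                "terms": {"field": category_field, "size": 50},
                "aggs": {
                    "top_sellers": {
                        "terms": {
                            "field": seller_field,
                            "size": top_n,
                            "order": {"revenue": "desc"},
                        },
                        "aggs": {"revenue": {"sum": {"field": revenue_field}}},
                    }
                },
            }
        },
    }

body = top_sellers_per_category("category", "seller_id", "amount")
print(json.dumps(body, indent=2))
```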

Does this site send my queries anywhere?

No. The whole cheat sheet is a single static page; search runs entirely in your browser against an in-memory array. There is no Elasticsearch connection, no upload, nothing leaves the tab. Open DevTools → Network while you type and you will see zero requests. Safe behind bastion-only ES clusters, corporate proxies, and air-gapped environments where installing Kibana is not an option.

A term filter returns zero hits on a 40M-doc product index in prod

A "status:active" filter that worked in staging returns nothing in prod because the field got mapped as text, not keyword, so it was lowercased and tokenized. You filter the term query entries plus the "term returns nothing" pitfall, confirm the recipe is term on status.keyword, and run POST /_analyze in Dev Tools to prove how the value tokenized. Fix shipped in ten minutes instead of an afternoon of guessing.

Cutting a 6-shard 80GB index over to a new mapping with no downtime

Marketing needs name searchable as full text, but it was mapped keyword and you cannot change type in place. You pull the zero-downtime reindex entry, create products_v2, run _reindex with slices=auto and wait_for_completion=false, poll _tasks, then do the atomic _aliases remove+add swap in one payload. The 80GB cutover happens while the app keeps reading and writing through the products alias.

A 12-node log cluster goes red after a node hits 96% disk

Every index on that node flips read-only and ingest stalls. You grab the disk watermark flood_stage entry, confirm 95% triggers the read-only lock, run the _cat/allocation and allocation/explain commands from the cluster ops section to find the hot node, free space, then clear the read_only_allow_delete block. The cheat sheet hands you the exact PUT _settings call so you are not editing YAML at 2am.

Building a top-10-sellers-per-category dashboard panel

You need group-by category, then top sellers each with their revenue, in sub-second time over 30M orders. You filter to aggregations, copy the terms agg nested with top_hits and a sum sub-agg, paste it into Dev Tools, and tune size and shard_size from the pitfall note about terms-agg accuracy. The panel ships against live ES instead of a nightly export to a separate analytics database.

Elasticsearch Cheatsheet — 80+ Query DSL, Mapping, Analyzer, Aggregation, Replication with Real Pitfalls

172 commands

Index management (19)

PUT /<index>

Create an index. With no body, you get default 1 primary / 1 replica and dynamic mapping — fine for prototyping, almost never right for production.

⚠ Common pitfall: Default 1 shard is fine for < 50GB indices; oversharding ("100 shards for 1GB of logs") wrecks cluster state and recovery time. Size on actual data volume, not "what if we grow".

Examples

PUT /products

PUT /logs-2026.05 {"settings":{"number_of_shards":3,"number_of_replicas":1}}

DELETE /<index>

Delete an index and free disk space immediately. Permanent — no soft delete, no recycle bin. Combine with action.destructive_requires_name in elasticsearch.yml to refuse wildcards.

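⚠ Common pitfall: DELETE /* on a cluster where action.destructive_requires_name is false (the default before 8.0) instantly wipes every index. Set action.destructive_requires_name: true so explicit names are required; wildcard deletes such as DELETE /logs-2025.* are then rejected as well.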

Examples

DELETE /products

DELETE /logs-2025.*

GET /<index>

Inspect an index — returns mapping, settings, aliases in one shot. Use ?include_defaults=true to see what every unset setting actually resolves to.

Examples

GET /products

GET /products?include_defaults=true

POST /<index>/_close · _open

Close an index to free heap (mappings stay, shards unload) and reopen it later. Settings that require a closed index (analyzers, similarity) only apply after close → update → open.

⚠ Common pitfall: A closed index occupies disk but cannot be searched or written. Closing the wrong production index is a full-blown outage. Name the index explicitly; do not use a pattern.

Examples

POST /products/_close

POST /products/_open

POST /_aliases

Atomically add/remove aliases across multiple indices. The canonical zero-downtime reindex pattern: write to old index, build new index, swap alias in one atomic action.

Examples

POST /_aliases {"actions":[{"remove":{"index":"products_v1","alias":"products"}},{"add":{"index":"products_v2","alias":"products"}}]}

POST /_aliases {"actions":[{"add":{"index":"logs-2026.05","alias":"logs-current","is_write_index":true}}]}

PUT /_index_template/<name>

Index template (composable, 7.8+). Auto-applies mappings + settings to any new index whose name matches the pattern — the right way to standardize daily log indices, time-based rollovers, and tenant indices.

⚠ Common pitfall: Templates only apply to indices created AFTER you save the template. Existing indices keep their old mapping — reindex if you need the new shape applied retroactively.

Examples

PUT /_index_template/logs-template {"index_patterns":["logs-*"],"template":{"settings":{"number_of_shards":1,"number_of_replicas":1},"mappings":{"properties":{"@timestamp":{"type":"date"},"message":{"type":"text"}}}}}

POST /<index>/_rollover

Roll an alias to a new index when size / age / doc-count thresholds are hit. Pair with an ILM policy so daily/weekly indices stay bounded and old ones move to warm/cold tier automatically.

Examples

POST /logs-current/_rollover {"conditions":{"max_age":"7d","max_size":"50gb","max_docs":100000000}}

POST /_reindex

Copy all documents from one index to another, optionally remapping fields or filtering by query. The only way to change a field type or add a new analyzer to existing data.

⚠ Common pitfall: Default reindex runs synchronously and blocks. For > 100k docs, use ?wait_for_completion=false to get a task id, then poll GET /_tasks/<id>. Tune slices=auto for parallelism.

Examples

POST /_reindex {"source":{"index":"products_v1"},"dest":{"index":"products_v2"}}

POST /_reindex?wait_for_completion=false&slices=auto {"source":{"index":"logs-2025.*"},"dest":{"index":"logs-archive"}}

POST /<index>/_forcemerge

Merge segments down to N (typically max_num_segments=1) — reduces shard segment count, frees deleted-doc space, speeds searches on read-only indices.

⚠ Common pitfall: NEVER force-merge an actively written index. It generates massive merge IO and the next refresh creates new segments anyway. Only run on indices that have stopped receiving writes.

Examples

POST /logs-2025.12/_forcemerge?max_num_segments=1

POST /<index>/_refresh

Manually refresh an index — makes recent writes visible to search immediately. Default auto-refresh is every 1s; for high-throughput write workloads, raise refresh_interval to 30s and call _refresh on demand.

Examples

POST /products/_refresh

PUT /products/_settings {"index":{"refresh_interval":"30s"}}

POST /<index>/_flush

Flush the translog and commit Lucene segments to disk. Routine flushes happen automatically; manual flush is mostly useful before snapshot or shutdown.

Examples

POST /products/_flush

POST /_flush

PUT /<index>/_settings

Update dynamic index settings on a live index — refresh_interval, number_of_replicas, max_result_window, blocks. Static settings (number_of_shards, the analysis section) cannot be changed here; they need a closed index or a reindex.

⚠ Common pitfall: number_of_shards is fixed at creation and cannot be updated through _settings. Plan it up front, or use _split / _shrink to build a new index with a different count. Of the two shard-count settings, only number_of_replicas can be changed on a live index.

Examples

PUT /products/_settings {"index":{"number_of_replicas":2,"refresh_interval":"30s"}}

PUT /products/_settings {"index":{"max_result_window":50000}}

POST /<index>/_clone

Clone an index into a new one with the same mapping and the same shard count, hard-linking segments so it is near-instant and uses almost no extra disk. The source must be read-only (index.blocks.write) first.

Examples

PUT /products/_settings {"settings":{"index.blocks.write":true}}

POST /products/_clone/products_copy

POST /<index>/_shrink

Shrink an index to fewer primary shards (target count must divide the source). Used to consolidate an over-sharded index after rollover so old read-only indices stop wasting cluster-state overhead.

⚠ Common pitfall: Before _shrink the index must be read-only and a copy of every shard (primary or replica) must sit on the SAME node; set index.routing.allocation.require._name first. Forgetting this leaves the shrink stuck.

Examples

PUT /logs-old/_settings {"settings":{"index.blocks.write":true,"index.routing.allocation.require._name":"node-1"}}

POST /logs-old/_shrink/logs-old-shrunk {"settings":{"index.number_of_shards":1,"index.blocks.write":null}}

POST /<index>/_split

Split an index into MORE primary shards without reindexing. Source must be read-only; target shard count must be a multiple of the source. The escape hatch when an index outgrew its original shard count.

Examples

PUT /events/_settings {"settings":{"index.blocks.write":true}}

POST /events/_split/events_split {"settings":{"index.number_of_shards":6}}
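The shard-count rules for _shrink and _split are simple divisibility checks, sketched below. This covers only the factor / multiple rules stated in these two entries; any further limit imposed by the index's routing-shard setting is deliberately ignored, and the function names are hypothetical.

```python
def shrink_targets(source_shards):
    """Valid _shrink targets: every smaller count that divides the source."""
    return [n for n in range(1, source_shards) if source_shards % n == 0]

def is_valid_split(source_shards, target_shards):
    """A _split target must be a larger multiple of the source count."""
    return target_shards > source_shards and target_shards % source_shards == 0

print(shrink_targets(6))        # options for a 6-shard index
print(shrink_targets(7))        # a prime count can only shrink to 1
print(is_valid_split(3, 6))     # 3 -> 6 is a multiple
print(is_valid_split(4, 6))     # 4 -> 6 is not
```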

GET /<index>/_stats

Per-index operational stats — doc count, store size, indexing/search/merge/refresh/flush rates, segment count, query cache hit ratio. The first stop when one index feels slow.

Examples

GET /products/_stats

GET /products/_stats/search,indexing,merge

GET /<index>/_segments

List the Lucene segments inside each shard — count, size, doc count, deleted-doc count. A high segment count or many deleted docs is the signal that a force-merge (on a read-only index) would help.

Examples

GET /products/_segments

GET /logs-2025.12/_segments?verbose=false

PUT /_component_template/<name>

Reusable building block for composable index templates — define mappings or settings once, then reference it from many index templates via composed_of. Keeps shared field definitions DRY across log/metric/trace templates.

Examples

PUT /_component_template/base-settings {"template":{"settings":{"number_of_shards":1,"number_of_replicas":1}}}

PUT /_index_template/logs {"index_patterns":["logs-*"],"composed_of":["base-settings"]}

PUT /_data_stream/<name> (data stream)

A data stream is an append-only abstraction over time-series backing indices, auto-rolling via its index template + ILM. The modern replacement for "manage your own daily index + alias" — you write to one name, ES handles rollover.

⚠ Common pitfall: Data streams only accept create (append) ops — you cannot update or delete a single doc by id through the stream name; you must target the concrete backing index (.ds-<name>-<date>-NNNN).

Examples

PUT /_index_template/metrics {"index_patterns":["metrics-*"],"data_stream":{},"template":{"mappings":{"properties":{"@timestamp":{"type":"date"}}}}}

PUT /_data_stream/metrics-app

Documents (17)

PUT /<index>/_doc/<id>

Index a document with a known id (creates or fully replaces). To insert only if absent, use op_type=create and you get a 409 if the id exists.

Examples

PUT /products/_doc/1 {"name":"Headphones","price":99,"in_stock":true}

PUT /products/_doc/1?op_type=create {"name":"Headphones"}

POST /<index>/_doc

Index a document without specifying id — Elasticsearch auto-generates a base64 id. Slightly faster than PUT/_doc/<id> because it skips the "is this an update?" check.

Examples

POST /products/_doc {"name":"Mouse","price":25}

GET /<index>/_doc/<id>

Fetch a single document by id. Returns _source plus metadata. Add _source_includes / _source_excludes to project only the fields you care about — much cheaper on big docs.

Examples

GET /products/_doc/1

GET /products/_doc/1?_source_includes=name,price

POST /<index>/_update/<id>

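Partial update: merges the supplied fields into the existing _source in one server-side get-modify-reindex cycle, so the client makes a single call. Concurrent writers can still collide and get a 409; add retry_on_conflict for contended documents. Use doc for simple merges, script for conditional logic.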

⚠ Common pitfall: ES has no true in-place update; every "update" rewrites the whole document and marks the old one as deleted. Heavy update workloads bloat segments — reindex or use rollover indices.

Examples

POST /products/_update/1 {"doc":{"price":89}}

POST /products/_update/1 {"script":{"source":"ctx._source.views += params.n","params":{"n":1}}}

POST /<index>/_update_by_query

Update every document that matches a query in one server-side pass — no round-trip per doc. Combine with a script to backfill or migrate fields without writing a client program.

Examples

POST /products/_update_by_query {"script":{"source":"ctx._source.active = true"},"query":{"term":{"in_stock":true}}}

DELETE /<index>/_doc/<id>

Delete a single document by id. Disk is only reclaimed after Lucene segment merge — so disk usage does not drop immediately after a delete spike.

Examples

DELETE /products/_doc/1

POST /<index>/_delete_by_query

Delete every document matching a query. Safer than DROP TABLE because the index stays, mapping stays, only the data goes — and you can preview with a search first.

⚠ Common pitfall: Marks docs as deleted; disk is only reclaimed after segment merge. For "wipe the index", DELETE /<index> + recreate is faster than _delete_by_query.

Examples

POST /logs/_delete_by_query {"query":{"range":{"@timestamp":{"lt":"now-30d"}}}}

POST /_bulk

Batch index / update / delete in one request — the only way to hit ES write throughput at scale. NDJSON body: one action line + one source line per op.

⚠ Common pitfall: Sweet spot: 5-15MB per bulk request. < 1MB = too many round-trips; > 100MB = timeouts and HTTP 413. Set request timeout > 30s for big bulks.

Examples

POST /_bulk
{"index":{"_index":"products","_id":"1"}}
{"name":"A"}
{"index":{"_index":"products","_id":"2"}}
{"name":"B"}
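Building the NDJSON body and keeping each request near the size sweet spot can be sketched as follows. The 10MB default sits inside the 5-15MB range from the pitfall note; the helper name and the (id, source) tuple shape are assumptions, and nothing here talks to a cluster.

```python
import json

def bulk_batches(docs, index, max_bytes=10 * 1024 * 1024):
    """Yield NDJSON bulk bodies of at most max_bytes each.

    docs is an iterable of (id, source_dict). A single oversized document
    still gets its own batch rather than being dropped.
    """
    lines, size = [], 0
    for doc_id, source in docs:
        action = json.dumps({"index": {"_index": index, "_id": doc_id}})
        pair = action + "\n" + json.dumps(source) + "\n"
        pair_bytes = len(pair.encode("utf-8"))
        if lines and size + pair_bytes > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(pair)
        size += pair_bytes
    if lines:
        yield "".join(lines)

docs = [("1", {"name": "A"}), ("2", {"name": "B"})]
for body in bulk_batches(docs, "products"):
    print(body, end="")
```

Each batch ends with a newline, which the bulk endpoint requires after the final line.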

GET /_mget

Multi-get — fetch many documents by id across one or more indices in a single round-trip. Much faster than N separate GET /_doc calls.

Examples

GET /_mget {"docs":[{"_index":"products","_id":"1"},{"_index":"products","_id":"2"}]}

GET /<index>/_count

Lightweight count of documents matching a query. Cheaper than a _search with size=0 that reads hits.total, because there is no scoring and no source loading.

Examples

GET /products/_count

GET /products/_count {"query":{"term":{"in_stock":true}}}

GET /<index>/_source/<id>

Fetch ONLY the _source of a document, skipping the metadata envelope (_index, _id, _version). Slightly lighter than GET /_doc/<id> when you just need the raw stored JSON.

Examples

GET /products/_source/1

GET /products/_source/1?_source_includes=name,price

HEAD /<index>/_doc/<id>

Existence check — returns 200 if the document exists, 404 if not, with no body transferred. The cheapest way to ask "is this id already indexed?" before deciding to create vs update.

Examples

HEAD /products/_doc/1

PUT /<index>/_doc/<id>?if_seq_no=&if_primary_term=

Optimistic concurrency control — only apply the write if the doc still has the seq_no + primary_term you last read. If another writer changed it first, you get a 409 instead of a silent lost update.

⚠ Common pitfall: ES has no row locks. Without if_seq_no / if_primary_term, two concurrent read-modify-write clients silently clobber each other. Always pass both for compare-and-set updates.

Examples

PUT /products/_doc/1?if_seq_no=12&if_primary_term=2 {"name":"Headphones","stock":5}
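The compare-and-set behaviour can be illustrated with a small in-memory model. FakeIndex and VersionConflict are hypothetical stand-ins: the class keeps one sequence counter and a fixed primary term, which is a simplification of how a real shard tracks them.

```python
class VersionConflict(Exception):
    """Stands in for the HTTP 409 that Elasticsearch returns."""

class FakeIndex:
    def __init__(self):
        self.docs = {}          # id -> (source, seq_no)
        self.next_seq_no = 0
        self.primary_term = 1

    def get(self, doc_id):
        source, seq_no = self.docs[doc_id]
        return {"_source": dict(source), "_seq_no": seq_no,
                "_primary_term": self.primary_term}

    def put(self, doc_id, source, if_seq_no=None, if_primary_term=None):
        # Only enforce the check when the caller supplied the condition.
        if if_seq_no is not None:
            current = self.docs.get(doc_id)
            if (current is None or current[1] != if_seq_no
                    or if_primary_term != self.primary_term):
                raise VersionConflict(doc_id)
        self.docs[doc_id] = (dict(source), self.next_seq_no)
        self.next_seq_no += 1

index = FakeIndex()
index.put("1", {"stock": 5})
a = index.get("1")   # client A reads
b = index.get("1")   # client B reads the same version
index.put("1", {"stock": 4}, if_seq_no=a["_seq_no"], if_primary_term=a["_primary_term"])
try:
    index.put("1", {"stock": 3}, if_seq_no=b["_seq_no"], if_primary_term=b["_primary_term"])
    outcome = "lost update"
except VersionConflict:
    outcome = "409 conflict"
print(outcome)
```

On a conflict the usual response is to re-read the document, reapply the change, and retry with the fresh values.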

POST /<index>/_update/<id> (upsert)

Update-or-insert in one call: run the script/doc if the id exists, otherwise index the upsert body as-is (the script does not run on that first insert). The idiomatic "increment a counter, creating it if absent" pattern.

Examples

POST /counters/_update/page-1 {"script":{"source":"ctx._source.hits += params.n","params":{"n":1}},"upsert":{"hits":1}}

POST /products/_update/1 {"doc":{"price":89},"doc_as_upsert":true}

POST /<index>/_doc?routing=<value>

Custom routing — pin a document to a shard chosen by your routing value (default is a hash of _id). Co-locate related docs (all of one tenant) on one shard so queries with the same routing hit a single shard.

⚠ Common pitfall: Custom routing can create hot shards if one routing value owns most of the data. And you MUST pass the same routing on GET/DELETE/update or ES looks on the wrong shard and reports 404.

Examples

POST /orders/_doc?routing=tenant-42 {"tenant":"tenant-42","total":99}

GET /orders/_search?routing=tenant-42 {"query":{"match_all":{}}}
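Why the same routing value must be passed on every call can be seen from a toy shard picker. This is only a sketch: Elasticsearch uses its own hash function and routing-shard arithmetic, and MD5 is swapped in here purely to get a stable, runnable example. The function name is hypothetical.

```python
import hashlib

def shard_for(routing_value, number_of_shards):
    """Toy model: a stable hash of the routing value, modulo the shard count."""
    digest = hashlib.md5(routing_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % number_of_shards

shards = 6
write_shard = shard_for("tenant-42", shards)   # where the doc was indexed
read_shard = shard_for("tenant-42", shards)    # same routing -> same shard
default_shard = shard_for("order-1001", shards)  # routing omitted: hash of the id

print(write_shard, read_shard, default_shard)
```

If default_shard differs from write_shard, a GET without ?routing= looks on a shard that never received the document, which is the 404 described in the pitfall.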

GET /<index>/_explain/<id>

Explain exactly why one specific document does or does not match a query, and how its BM25 score was computed term by term. The right tool for "why is this result ranked here?" relevance debugging.

Examples

GET /products/_explain/1 {"query":{"match":{"description":"wireless"}}}

POST /<index>/_termvectors/<id>

Return the term vectors of a document — per-field terms with frequency, position, offset, and payload. Used to debug analysis, build "more like this" features, or inspect what actually got indexed.

Examples

POST /products/_termvectors/1 {"fields":["description"],"term_statistics":true}

Mapping (18)

PUT /<index>/_mapping

Add new fields or sub-fields to an existing index mapping. Adding a new field is allowed; changing a field type is NOT — reindex into a new index for type changes.

⚠ Common pitfall: Once a field is mapped as keyword, you cannot turn it into text in place — you have to create a new index with the right mapping and reindex.

Examples

PUT /products/_mapping {"properties":{"tags":{"type":"keyword"},"description":{"type":"text","analyzer":"english"}}}

GET /<index>/_mapping

Inspect the current mapping of an index — every property, its type, its analyzer, its sub-fields. Use this before writing any query so you query the right field type.

Examples

GET /products/_mapping

GET /products/_mapping/field/name

type: text vs keyword

text = analyzed into tokens for full-text search (match queries); keyword = stored exactly as one token for filtering, sorting, aggregations. Most string fields want BOTH via a multi-field.

⚠ Common pitfall: You CANNOT aggregate or sort on a text field by default (fielddata is off). Always declare strings as text with a keyword sub-field: {"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}.

Examples

{"properties":{"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}

type: date

Date field. Accepts ISO 8601, millis-since-epoch, or any pattern declared in format. Internally stored as long millis — range queries are extremely fast.

Examples

{"properties":{"@timestamp":{"type":"date","format":"strict_date_optional_time||epoch_millis"}}}

type: nested

Nested object — preserves the relationship between sub-fields within an array of objects. Required when you need to query "any one element matches BOTH fields A and B at once".

⚠ Common pitfall: Plain object arrays flatten — {"tags":[{"k":"x","v":1},{"k":"y","v":2}]} becomes {tags.k:[x,y], tags.v:[1,2]} and a bool query "k=x AND v=2" matches falsely. Use nested when AND inside one object matters.

Examples

{"properties":{"variants":{"type":"nested","properties":{"sku":{"type":"keyword"},"price":{"type":"double"}}}}}
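The false match from the pitfall can be reproduced in a few lines. The sketch below models how a plain object array is flattened into per-key value lists and compares that with element-by-element (nested-style) matching; the function names are hypothetical and this is a model of the behaviour, not of the Lucene internals.

```python
def flatten_object_array(field, items):
    """Model of a plain object array: one value list per sub-key."""
    flat = {}
    for item in items:
        for key, value in item.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat

def object_match(field, items, conditions):
    """Each condition is checked against the whole flattened list."""
    flat = flatten_object_array(field, items)
    return all(value in flat.get(f"{field}.{key}", [])
               for key, value in conditions.items())

def nested_match(items, conditions):
    """All conditions must hold on the same array element."""
    return any(all(item.get(key) == value for key, value in conditions.items())
               for item in items)

tags = [{"k": "x", "v": 1}, {"k": "y", "v": 2}]
wanted = {"k": "x", "v": 2}   # no single element has both

print(flatten_object_array("tags", tags))
print(object_match("tags", tags, wanted))   # flattened: matches falsely
print(nested_match(tags, wanted))           # nested: correctly no match
```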

type: geo_point

Geo-point field — supports lat/lon, [lon,lat] array, "lat,lon" string, geohash. Enables geo_distance, geo_bounding_box, geo_polygon queries and geo_distance sorts.

Examples

{"properties":{"location":{"type":"geo_point"}}}

POST /stores/_doc {"location":{"lat":40.71,"lon":-74.0}}

dynamic: true · false · strict

Controls what happens when a doc with an unknown field arrives. true = auto-create (default, dangerous); false = ignore unknown fields; strict = reject the whole document with an error.

⚠ Common pitfall: Default dynamic:true on user-controlled input causes "mapping explosion" — millions of auto-created fields blow up cluster state. Always set dynamic:strict on user-supplied JSON.

Examples

{"mappings":{"dynamic":"strict","properties":{"name":{"type":"text"}}}}

multi-field (fields)

Index one source field as multiple sub-fields with different analyzers/types — full-text search on name, exact match on name.keyword, ngram autocomplete on name.autocomplete.

Examples

{"properties":{"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256},"english":{"type":"text","analyzer":"english"}}}}}

index: false

Store but do not index a field — saves disk and memory, but the field becomes unsearchable. Use for fields you only ever return in _source (display labels, raw HTML, blobs).

Examples

{"properties":{"raw_html":{"type":"text","index":false}}}

numeric types (long, integer, double, scaled_float)

Pick the smallest numeric type that fits — byte/short/integer/long for ints, float/double/half_float for reals. scaled_float stores a float as a scaled long (price × 100), which is smaller and faster than double for fixed-precision money.

⚠ Common pitfall: Do NOT map an identifier like an order_id or phone number as a numeric type just because it looks like digits — you will never do range math on it, and keyword indexes and aggregates it far more efficiently.

Examples

{"properties":{"price":{"type":"scaled_float","scaling_factor":100}}}

{"properties":{"order_id":{"type":"keyword"}}}
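What scaled_float does with a value can be sketched as a multiply-and-round. The helpers below are hypothetical and only model the idea of storing price × scaling_factor as a whole number; precision finer than the scaling factor is lost.

```python
def to_scaled_long(value, scaling_factor):
    """Multiply by the scaling factor and round to the nearest whole number."""
    return round(value * scaling_factor)

def from_scaled_long(stored, scaling_factor):
    """Recover the value that searches and aggregations will see."""
    return stored / scaling_factor

stored = to_scaled_long(19.99, 100)
print(stored, from_scaled_long(stored, 100))

# A third decimal does not survive a scaling_factor of 100.
print(from_scaled_long(to_scaled_long(19.994, 100), 100))
```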

type: ip

IP field — stores IPv4 and IPv6, and supports CIDR range queries directly ("everything in 10.0.0.0/8"). Far better than storing IPs as keyword strings, which cannot do subnet matching.

Examples

{"properties":{"client_ip":{"type":"ip"}}}

{"query":{"term":{"client_ip":"10.0.0.0/8"}}}

type: boolean

Boolean field — accepts true/false, plus the JSON strings "true"/"false". Indexed as a single term, so term/filter queries on it are extremely cheap and cacheable.

Examples

{"properties":{"in_stock":{"type":"boolean"}}}

{"query":{"term":{"in_stock":true}}}

type: object vs flattened

A plain object maps every sub-key as its own field (good for known shapes, bad for unbounded keys). flattened indexes an entire JSON object as ONE field of keyword-like leaves — perfect for arbitrary metadata/labels that would otherwise explode the mapping.

⚠ Common pitfall: flattened fields are exact-match only — no full-text analysis, no per-leaf type, no numeric range. Use it to TAME unbounded keys, not to search them like text.

Examples

{"properties":{"labels":{"type":"flattened"}}}

{"query":{"term":{"labels.env":"prod"}}}

type: completion (suggester)

Purpose-built type backing the completion suggester — an in-memory FST optimized for fast prefix autocomplete with weights. Use it for type-ahead search boxes instead of edge_ngram when you want ranked suggestions.

Examples

{"properties":{"suggest":{"type":"completion"}}}

POST /products/_search {"suggest":{"s":{"prefix":"head","completion":{"field":"suggest","size":5}}}}

type: dense_vector (kNN)

Store float vectors for semantic / kNN search. With index:true (8.0+) the vectors are indexed in an HNSW graph and compared using the similarity you choose (cosine, dot_product, l2_norm), which gives approximate nearest-neighbour search, the backbone of vector / embedding retrieval in ES.

⚠ Common pitfall: dot_product similarity requires unit-length (normalized) float vectors; ES rejects vectors that are not normalized at index time. Normalize embeddings before indexing or use cosine.

Examples

{"properties":{"embedding":{"type":"dense_vector","dims":768,"index":true,"similarity":"cosine"}}}

type: alias

Field alias — a virtual name that points at a real field for queries and aggregations, without copying data. Lets you rename a field in your query layer (e.g. "ts" → "@timestamp") without reindexing.

⚠ Common pitfall: Aliases work for read paths (query, agg, sort) only — you cannot index INTO an alias field, and they cannot point at an object or another alias.

Examples

{"properties":{"@timestamp":{"type":"date"},"ts":{"type":"alias","path":"@timestamp"}}}

copy_to

Copy several source fields into one combined searchable field at index time — search "John Smith" across first_name + last_name without a multi_match. The combined field is searchable but not returned in _source.

Examples

{"properties":{"first_name":{"type":"text","copy_to":"full_name"},"last_name":{"type":"text","copy_to":"full_name"},"full_name":{"type":"text"}}}

runtime field

Define a field computed by a Painless script AT QUERY TIME, not stored on disk (schema-on-read). Add or fix a field on existing data with zero reindex — pay a small per-query CPU cost instead of disk.

⚠ Common pitfall: Runtime fields are computed per matching doc on every query — cheap for filtering a few results, expensive when sorted/aggregated over millions. Promote a hot runtime field to an indexed field once the shape stabilizes.

Examples

PUT /logs/_mapping {"runtime":{"status_class":{"type":"keyword","script":{"source":"emit(doc['status'].value >= 500 ? '5xx' : 'ok')"}}}}

Query DSL (31)

match

Full-text query — analyzes the query string with the field analyzer, matches any token, scores by BM25. The default tool for searching text fields.

Examples

GET /products/_search {"query":{"match":{"description":"wireless headphones"}}}

GET /_search {"query":{"match":{"description":{"query":"wireless headphones","operator":"and","minimum_should_match":"75%"}}}}

match_phrase

Phrase query — tokens must appear in the same order, with no gaps (or up to slop). Use for "exact phrase" search; combine with slop=2 to allow a couple of words between.

Examples

{"query":{"match_phrase":{"description":"wireless noise cancelling"}}}

{"query":{"match_phrase":{"description":{"query":"wireless headphones","slop":2}}}}

multi_match

Run the same query string against multiple fields at once, optionally with per-field boosts. Type "best_fields" picks the best-scoring field; "cross_fields" treats them as one big field.

Examples

{"query":{"multi_match":{"query":"wireless","fields":["name^3","description"],"type":"best_fields"}}}

term

Exact value query — NOT analyzed. Use on keyword / numeric / boolean / date fields. Searching "Apple" with term against a text field will miss everything because text was lowercased on index.

⚠ Common pitfall: term on a text field almost always returns 0 hits — text was lowercased and tokenized at index time, so "Apple" became "apple". Either query against name.keyword or use a match query.

Examples

{"query":{"term":{"status":"active"}}}

{"query":{"term":{"name.keyword":"Apple"}}}
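
Sketch of why term misses on a text field. This is a toy Python model, not Elasticsearch code: `analyze` only approximates the standard analyzer (split on non-alphanumerics, lowercase), and every function name here is made up for illustration.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer: split on non-alphanumerics, lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def build_inverted_index(docs):
    """Map each indexed token to the set of doc ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for tok in analyze(text):
            index.setdefault(tok, set()).add(doc_id)
    return index

def term_lookup(index, value):
    """A term query looks the value up as-is, with no analysis."""
    return index.get(value, set())

def match_lookup(index, query):
    """A match query analyzes the query string first, then ORs the tokens."""
    hits = set()
    for tok in analyze(query):
        hits |= index.get(tok, set())
    return hits

docs = {1: "Apple iPhone 15", 2: "Green apple juice"}
index = build_inverted_index(docs)
print(term_lookup(index, "Apple"))   # empty: only "apple" exists in the index
print(match_lookup(index, "Apple"))  # both docs: the query was lowercased too
```

The same lookup succeeds on name.keyword because a keyword field stores the whole original string as one un-analyzed term.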

terms

Match if the field equals ANY of the supplied values — the ES equivalent of SQL IN (...). Default cap is 65,536 terms; raise index.max_terms_count if you really need more.

Examples

{"query":{"terms":{"status":["active","pending","review"]}}}

range

Range query for numeric, date, or ip fields. Date math is supported — now-1d, now/d (rounded to day), now+5m, etc. Inclusive (gte/lte) and exclusive (gt/lt) bounds.

Examples

{"query":{"range":{"price":{"gte":10,"lte":100}}}}

{"query":{"range":{"@timestamp":{"gte":"now-7d/d","lt":"now/d"}}}}
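
Sketch of what the date-math bounds in the second example resolve to. A minimal Python illustration, assuming UTC (the default when no time_zone is given); `floor_to_day` and `last_seven_full_days` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

def floor_to_day(moment):
    """Equivalent of the /d suffix: truncate to 00:00:00 of the same day."""
    return moment.replace(hour=0, minute=0, second=0, microsecond=0)

def last_seven_full_days(now):
    """Resolve gte=now-7d/d and lt=now/d to concrete datetimes."""
    gte = floor_to_day(now - timedelta(days=7))
    lt = floor_to_day(now)
    return gte, lt

now = datetime(2026, 5, 26, 10, 0, tzinfo=timezone.utc)
gte, lt = last_seven_full_days(now)
print(gte.isoformat(), lt.isoformat())
```

The window covers seven complete days and excludes the partial current day, which keeps daily totals comparable.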

bool (must / should / must_not / filter)

Boolean combinator. must = AND (scored), should = OR (scored, contributes to score), must_not = NOT (unscored), filter = AND (unscored, cacheable — fastest). Use filter wherever scoring does not matter.

⚠ Common pitfall: Putting equality / range clauses in must instead of filter wastes scoring CPU AND skips the filter cache. Move every "is this true?" clause to filter and only put "rank these" clauses in must.

Examples

{"query":{"bool":{"must":[{"match":{"description":"wireless"}}],"filter":[{"term":{"in_stock":true}},{"range":{"price":{"lte":200}}}],"must_not":[{"term":{"discontinued":true}}]}}}

wildcard

Wildcard match with * (zero or more chars) and ? (single char). Runs on keyword (or text with caveats). A leading wildcard ("*foo") cannot seek into the sorted term dictionary, so it has to check every term in the field, which is slow on large indices.

⚠ Common pitfall: A leading wildcard ("*foo*") is among the slowest queries you can run because every term in the field must be tested against the pattern. For "contains" search, use a wildcard field (7.9+) or an ngram analyzer instead.

Examples

{"query":{"wildcard":{"sku.keyword":{"value":"SKU-*-RED"}}}}

regexp

Regex match against a keyword field. Supports Lucene flavor regex (not full PCRE). Always anchored to the full term — no need for ^ or $.

⚠ Common pitfall: Regex queries can be exponentially slow on long terms; cap with max_determinized_states. For very high-cardinality fields, an ngram analyzer is faster than regex.

Examples

{"query":{"regexp":{"sku.keyword":{"value":"[A-Z]{3}-[0-9]{4}","max_determinized_states":10000}}}}

prefix

Prefix match — find terms starting with the given string. Cheap on keyword fields because the inverted index is sorted. Use this for type-ahead, not wildcard.

Examples

{"query":{"prefix":{"name.keyword":{"value":"Apple"}}}}

fuzzy

Levenshtein-distance fuzzy match — tolerates spelling mistakes. AUTO picks 0/1/2 edits based on term length. Slow on long terms; cap with prefix_length and max_expansions.

Examples

{"query":{"fuzzy":{"name":{"value":"helo","fuzziness":"AUTO","prefix_length":1}}}}
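
Sketch of how AUTO maps term length to allowed edits. Illustrative Python only: it uses plain Levenshtein distance, ignores prefix_length and max_expansions, and does not model transpositions; the function names are invented for this example.

```python
def auto_fuzziness(term):
    """AUTO with default thresholds: 0-2 chars exact, 3-5 chars one edit, longer two edits."""
    n = len(term)
    if n <= 2:
        return 0
    if n <= 5:
        return 1
    return 2

def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def fuzzy_matches(query_term, indexed_term):
    """True when the indexed term is within the AUTO edit budget of the query term."""
    return levenshtein(query_term, indexed_term) <= auto_fuzziness(query_term)

print(fuzzy_matches("helo", "hello"))  # one insertion, budget is 1
print(fuzzy_matches("hi", "ha"))       # two-char terms must match exactly
```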

exists

Match documents where the field has at least one indexed, non-null value. null and [] do not count as existing, but an empty string "" does. The closest thing to IS NOT NULL in ES. Combine in must_not for IS NULL.

Examples

{"query":{"exists":{"field":"email"}}}

{"query":{"bool":{"must_not":[{"exists":{"field":"email"}}]}}}

ids

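Fetch a small set of documents by id from one or more indices in a single query. Unlike _mget, it runs through the search path, so prefer it when you want the same query interface (combining with other clauses, scoring, highlighting); for a plain lookup by id, _mget is the lighter call.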

Examples

{"query":{"ids":{"values":["1","2","42"]}}}

nested query

Query inside a nested field — required to enforce "all conditions match the same nested object". Use inner_hits to return which nested element actually matched.

Examples

{"query":{"nested":{"path":"variants","query":{"bool":{"must":[{"term":{"variants.color":"red"}},{"range":{"variants.price":{"lte":50}}}]}},"inner_hits":{}}}}
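
Sketch of the cross-object false positive that nested prevents. A toy Python model under the assumption that a plain object array is indexed as one multi-valued list per leaf field; the helper names are hypothetical.

```python
def flatten(doc, path):
    """Index an object array the non-nested way: one value list per leaf field."""
    flat = {}
    for obj in doc[path]:
        for key, value in obj.items():
            flat.setdefault(f"{path}.{key}", []).append(value)
    return flat

def object_match(doc, path, color, max_price):
    """Each condition is checked against the whole array, not the same element."""
    flat = flatten(doc, path)
    has_color = color in flat[f"{path}.color"]
    has_price = any(p <= max_price for p in flat[f"{path}.price"])
    return has_color and has_price

def nested_match(doc, path, color, max_price):
    """Both conditions must hold on one element; returns matching offsets (like inner_hits)."""
    return [i for i, obj in enumerate(doc[path])
            if obj["color"] == color and obj["price"] <= max_price]

doc = {"variants": [{"color": "red", "price": 80}, {"color": "blue", "price": 30}]}
print(object_match(doc, "variants", "red", 50))  # True: red and 30 come from different variants
print(nested_match(doc, "variants", "red", 50))  # []: no single variant is red and <= 50
```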

geo_distance

Match documents within distance D of a point. Pair with sort: _geo_distance for "nearest first". Default distance unit is meters; supports km, mi.

Examples

{"query":{"geo_distance":{"distance":"5km","location":{"lat":40.71,"lon":-74.0}}}}

function_score

Custom scoring — modify BM25 by recency, popularity, geo distance, random, or a script. Use score_mode and boost_mode to blend the original score with the modifier.

Examples

{"query":{"function_score":{"query":{"match":{"description":"wireless"}},"functions":[{"gauss":{"@timestamp":{"origin":"now","scale":"30d"}}}],"boost_mode":"multiply"}}}
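
Sketch of the gauss decay used in the example, with ages in days. A minimal Python rendering of the documented Gaussian decay formula, assuming the default decay of 0.5 and offset of 0; `gauss_decay` and `final_score` are illustrative names.

```python
import math

def gauss_decay(distance, scale, offset=0.0, decay=0.5):
    """Multiplier in (0, 1]: 1.0 at the origin, exactly `decay` at distance == scale."""
    sigma_sq = -(scale ** 2) / (2.0 * math.log(decay))
    effective = max(0.0, abs(distance) - offset)
    return math.exp(-(effective ** 2) / (2.0 * sigma_sq))

def final_score(query_score, age_days, scale_days=30.0):
    """boost_mode=multiply: the query score is multiplied by the decay value."""
    return query_score * gauss_decay(age_days, scale_days)

for age in (0, 30, 60):
    print(age, round(gauss_decay(age, 30.0), 4))
```

With scale 30d, a 30-day-old document keeps half of its score and a 60-day-old one keeps about 6%, so the curve falls off quickly past the scale.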

highlight

Return per-hit snippets with matched terms wrapped in <em>...</em> (configurable). Three highlighters: unified (default, balanced), plain (slow but accurate), fvh (fast, needs term_vector=with_positions_offsets).

Examples

{"query":{"match":{"description":"wireless"}},"highlight":{"fields":{"description":{"fragment_size":150,"number_of_fragments":3}}}}

search_after (deep pagination)

Stable deep pagination — pass the sort values of the last hit as search_after on the next request. Replaces from + size for paging past 10,000.

⚠ Common pitfall: from + size deep paging fails once from + size exceeds index.max_result_window (10000 by default), and even before that limit every shard has to collect its top (from + size) docs and hand them to the coordinating node for merging. Use search_after, ideally together with a PIT.

Examples

{"size":20,"sort":[{"@timestamp":"desc"},{"_id":"asc"}],"search_after":["2026-05-26T10:00:00Z","abc123"]}
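
Sketch of the search_after loop. This Python simulation runs against an in-memory list instead of a cluster; `search_page` is a stand-in for one _search call, and the integer `ts` values and field names are assumptions made for the example.

```python
def search_page(docs, size, search_after=None):
    """Stand-in for one _search call sorted by (ts desc, id asc)."""
    ordered = sorted(docs, key=lambda d: d["id"])                   # tiebreaker: id asc
    ordered = sorted(ordered, key=lambda d: d["ts"], reverse=True)  # primary: ts desc (stable sort)
    if search_after is not None:
        last_ts, last_id = search_after
        # Keep only docs that sort strictly after the cursor.
        ordered = [d for d in ordered
                   if d["ts"] < last_ts or (d["ts"] == last_ts and d["id"] > last_id)]
    return [{"_source": d, "sort": [d["ts"], d["id"]]} for d in ordered[:size]]

def iterate_all(docs, size):
    """Page through everything by feeding the last hit's sort values back in."""
    cursor = None
    while True:
        hits = search_page(docs, size, cursor)
        if not hits:
            return
        yield from hits
        cursor = hits[-1]["sort"]

docs = [{"id": "a", "ts": 3}, {"id": "b", "ts": 3}, {"id": "c", "ts": 2},
        {"id": "d", "ts": 1}, {"id": "e", "ts": 1}]
print([h["_source"]["id"] for h in iterate_all(docs, size=2)])
```

The unique tiebreaker in the sort is what keeps documents with equal timestamps from being skipped or repeated across pages.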

scroll · point_in_time

Scroll = legacy snapshot for offline export. PIT (point-in-time, 7.10+) = the modern replacement, pairs with search_after for resumable, snapshot-consistent paging.

Examples

POST /products/_pit?keep_alive=5m

GET /_search {"pit":{"id":"<pit-id>","keep_alive":"5m"},"size":100,"sort":[{"_shard_doc":"asc"}]}

match_all · match_none

match_all returns every document with a constant score of 1 — the default query and the right way to "give me everything" (paginated). match_none returns nothing, useful as a placeholder in templated queries.

Examples

{"query":{"match_all":{}}}

{"query":{"match_none":{}}}

terms_set

Match if a minimum number of the supplied terms are present, where the threshold comes from a field or script — "match if at least N of these skills are listed". The DSL way to express "any 3 of 5".

Examples

{"query":{"terms_set":{"skills":{"terms":["es","sql","python"],"minimum_should_match_field":"required_matches"}}}}
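
Sketch of the terms_set rule with a per-document threshold. A toy Python check, assuming each document carries its own required_matches number as in the example; `terms_set_matches` is a hypothetical name.

```python
def terms_set_matches(doc, field, terms, min_match_field):
    """True when the doc contains at least doc[min_match_field] of the supplied terms."""
    present = set(doc.get(field, []))
    matched = sum(1 for term in set(terms) if term in present)
    return matched >= doc[min_match_field]

query_terms = ["es", "sql", "python"]
alice = {"skills": ["es", "python", "go"], "required_matches": 2}
bob = {"skills": ["es"], "required_matches": 2}
print(terms_set_matches(alice, "skills", query_terms, "required_matches"))  # 2 of 3 present
print(terms_set_matches(bob, "skills", query_terms, "required_matches"))    # only 1 present
```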

constant_score

Wrap a filter so every match gets the same fixed score (boost), skipping BM25 entirely. Use it when you want filter-cache speed but still need the clause inside a should to contribute a flat boost.

Examples

{"query":{"constant_score":{"filter":{"term":{"in_stock":true}},"boost":1.2}}}

dis_max

Disjunction-max — run several queries, take the single best-scoring one as the result score (plus tie_breaker × the rest). Stops a document from being double-counted when the same term matches multiple fields.

Examples

{"query":{"dis_max":{"queries":[{"match":{"title":"wireless"}},{"match":{"body":"wireless"}}],"tie_breaker":0.3}}}
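
Sketch of the dis_max arithmetic. Illustrative Python only: the per-clause scores are made-up inputs, and `dis_max_score` is not a real API.

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Best clause score plus tie_breaker times the sum of the other matching clauses."""
    matching = [s for s in clause_scores if s > 0]
    if not matching:
        return 0.0
    best = max(matching)
    return best + tie_breaker * (sum(matching) - best)

# title scored 2.0, body scored 1.0
print(dis_max_score([2.0, 1.0], tie_breaker=0.3))
print(dis_max_score([2.0, 1.0]))  # tie_breaker 0: only the best clause counts
```

Compare with a bool should, which would simply add the two clause scores together.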

query_string · simple_query_string

Parse a Lucene-syntax string ("wireless AND (headphones OR earbuds) -wired") into a query. query_string is powerful but throws on bad syntax; simple_query_string never errors — safe for raw end-user input.

⚠ Common pitfall: Never expose raw query_string to untrusted users — a malformed or hostile query (deep wildcards, huge boolean trees) can error or hammer the cluster. Use simple_query_string for public search boxes.

Examples

{"query":{"simple_query_string":{"query":"wireless +headphones -wired","fields":["name^2","description"]}}}

more_like_this (MLT)

Find documents similar to a given text or set of seed documents, based on shared significant terms. The classic "related articles" / "more like this product" feature without embeddings.

Examples

{"query":{"more_like_this":{"fields":["title","body"],"like":[{"_index":"articles","_id":"42"}],"min_term_freq":1,"max_query_terms":12}}}

knn search (vector)

Approximate k-nearest-neighbour search over a dense_vector field using HNSW (8.x top-level knn). Returns the k closest vectors to a query vector — the retrieval half of semantic / RAG search. Combine with a filter to restrict the candidate set.

Examples

POST /docs/_search {"knn":{"field":"embedding","query_vector":[0.12,0.83],"k":10,"num_candidates":100,"filter":{"term":{"lang":"zh"}}}}

distance_feature

Boost documents by how close a date or geo_point field is to an origin, decaying with distance — promote recent or nearby results cheaply inside a bool query. Faster than function_score for the common recency/proximity boost.

Examples

{"query":{"bool":{"must":{"match":{"description":"coffee"}},"should":{"distance_feature":{"field":"@timestamp","origin":"now","pivot":"7d"}}}}}

sort (multi-field, missing, mode)

Sort hits by one or more fields, each asc/desc, with missing:_first/_last for null handling and mode (min/max/avg/sum/median) to collapse array fields. Sorting by a field switches off score computation unless you ask for track_scores.

Examples

{"sort":[{"price":{"order":"asc","missing":"_last"}},{"_score":"desc"}]}

{"sort":[{"ratings":{"order":"desc","mode":"avg"}}]}

_source filtering · stored_fields

Control what each hit returns — _source:false drops the body entirely, _source:{includes,excludes} projects fields, fields:[...] returns formatted values (good for runtime fields and dates). Trims payload on wide documents.

Examples

{"_source":{"includes":["name","price"]},"query":{"match_all":{}}}

{"_source":false,"fields":["name","@timestamp"],"query":{"match_all":{}}}

GET /_search/template

Run a stored Mustache search template with runtime params — keep the query shape on the server, pass only values from the client. Cleaner and safer than building query JSON by string concatenation.

Examples

POST /_scripts/find_by_status {"script":{"lang":"mustache","source":{"query":{"term":{"status":"{{s}}"}}}}}

GET /products/_search/template {"id":"find_by_status","params":{"s":"active"}}

GET /_search?profile=true

Profile a query — get a per-shard, per-component breakdown of where time went (which subquery, rewrite, collector). The definitive tool for "why is this search slow?" once you have ruled out mapping mistakes.

Examples

GET /products/_search {"profile":true,"query":{"bool":{"must":[{"match":{"description":"wireless"}}]}}}

Aggregations (22)

terms agg

Group by field value, like SQL GROUP BY. Returns the top N buckets by doc count (default size=10). The single most-used aggregation — every facet, every "top sellers" report goes through this.

⚠ Common pitfall: terms agg on a high-cardinality field with default size=10 misses the long tail AND is approximate — see doc_count_error_upper_bound. Use composite agg when you need ALL buckets paginated.

Examples

{"size":0,"aggs":{"by_status":{"terms":{"field":"status","size":20}}}}

avg · sum · min · max · stats

Metric aggregations on numeric fields. avg, sum, min, and max each return a single value; stats is a multi-value metric that returns all five (count, min, max, avg, sum) in one pass, which is almost always preferable to running them separately.

Examples

{"size":0,"aggs":{"price_stats":{"stats":{"field":"price"}}}}

{"size":0,"aggs":{"avg_price":{"avg":{"field":"price"}}}}

cardinality

Approximate distinct-count using HyperLogLog++ — bounded memory regardless of cardinality. Tunable via precision_threshold (default 3000, max 40000) to trade accuracy for RAM.

⚠ Common pitfall: cardinality is APPROXIMATE — typical error 1-6%. If you need exact counts < 100k, use a terms agg + count buckets. Never use it for billing or compliance numbers.

Examples

{"size":0,"aggs":{"unique_users":{"cardinality":{"field":"user_id","precision_threshold":40000}}}}

date_histogram

Bucket docs by date interval — calendar_interval (1d, 1M, 1y, handles DST and month length) or fixed_interval (30s, 1h, 7d). The backbone of every time-series dashboard.

Examples

{"size":0,"aggs":{"daily":{"date_histogram":{"field":"@timestamp","calendar_interval":"1d","time_zone":"Asia/Shanghai"}}}}

histogram

Numeric histogram — bucket docs by fixed-width interval on a numeric field. Used for price ranges, latency distributions, etc.

Examples

{"size":0,"aggs":{"price_dist":{"histogram":{"field":"price","interval":50,"min_doc_count":1}}}}
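
Sketch of how a value is assigned to a histogram bucket. A small Python model of the floor-based bucket key; it only returns non-empty buckets (as with min_doc_count of 1), and the function names are illustrative.

```python
import math
from collections import Counter

def bucket_key(value, interval, offset=0.0):
    """Lower bound of the fixed-width bucket that contains the value."""
    return math.floor((value - offset) / interval) * interval + offset

def histogram(values, interval, min_doc_count=1):
    """Count values per bucket key, keeping buckets with at least min_doc_count entries."""
    counts = Counter(bucket_key(v, interval) for v in values)
    return {key: n for key, n in sorted(counts.items()) if n >= min_doc_count}

prices = [12, 49.99, 50, 130, 149]
print(histogram(prices, interval=50))
```

A value that sits exactly on a boundary (50 here) belongs to the bucket that starts at it, not the one that ends at it.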

range

Custom range buckets on numeric or date fields — define your own thresholds. Useful when histogram-fixed widths do not match business buckets (e.g. price tiers $0-49, $50-199, $200+).

Examples

{"size":0,"aggs":{"price_tier":{"range":{"field":"price","ranges":[{"to":50},{"from":50,"to":200},{"from":200}]}}}}
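
Sketch of range-bucket boundaries: from is inclusive, to is exclusive. A toy Python counter using the same three tiers as the example; `range_buckets` is a made-up helper.

```python
def range_buckets(values, ranges):
    """Count values per range; 'from' is inclusive and 'to' is exclusive."""
    buckets = []
    for spec in ranges:
        low, high = spec.get("from"), spec.get("to")
        count = sum(1 for v in values
                    if (low is None or v >= low) and (high is None or v < high))
        buckets.append({"from": low, "to": high, "doc_count": count})
    return buckets

tiers = [{"to": 50}, {"from": 50, "to": 200}, {"from": 200}]
for bucket in range_buckets([49.99, 50, 199, 200], tiers):
    print(bucket)
```

Because the upper bound is exclusive, adjacent tiers that share a boundary value never count the same document twice.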

percentiles · percentile_ranks

Approximate percentile aggregations using t-digest or HDR algorithms. percentiles gives p50/p95/p99; percentile_ranks gives "what % is below X". The backbone of latency SLO dashboards.

Examples

{"size":0,"aggs":{"latency_p":{"percentiles":{"field":"latency_ms","percents":[50,95,99]}}}}

composite agg (paginate all buckets)

Walk EVERY bucket of one or more fields in stable order via after_key — the only safe way to "list all unique combinations" without loading them into memory at once.

Examples

{"size":0,"aggs":{"all_combos":{"composite":{"size":1000,"sources":[{"status":{"terms":{"field":"status"}}},{"day":{"date_histogram":{"field":"@timestamp","calendar_interval":"1d"}}}]}}}}
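
Sketch of the after_key loop. This Python simulation pages over in-memory docs; `composite_page` stands in for one request, and it represents after_key as a tuple for brevity (the real response uses an object keyed by source name).

```python
from collections import Counter

def composite_page(docs, size, after=None):
    """Return up to `size` (status, day) buckets that sort strictly after `after`."""
    counts = Counter((d["status"], d["day"]) for d in docs)
    keys = sorted(k for k in counts if after is None or k > after)
    page = keys[:size]
    buckets = [{"key": {"status": s, "day": day}, "doc_count": counts[(s, day)]}
               for s, day in page]
    after_key = page[-1] if page else None
    return buckets, after_key

def all_buckets(docs, size):
    """Keep requesting pages, passing the previous after_key, until a page is empty."""
    after, collected = None, []
    while True:
        buckets, after_key = composite_page(docs, size, after)
        if not buckets:
            return collected
        collected.extend(buckets)
        after = after_key

docs = [{"status": "active", "day": "2026-05-25"}, {"status": "active", "day": "2026-05-25"},
        {"status": "active", "day": "2026-05-26"}, {"status": "pending", "day": "2026-05-25"}]
for bucket in all_buckets(docs, size=2):
    print(bucket)
```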

sub-aggregations (nested aggs)

Aggregations nest — every bucket can host metric or bucket sub-aggs. "Average price per status, broken down by day" is two levels deep and reads naturally.

Examples

{"size":0,"aggs":{"by_status":{"terms":{"field":"status"},"aggs":{"by_day":{"date_histogram":{"field":"@timestamp","calendar_interval":"1d"},"aggs":{"avg_price":{"avg":{"field":"price"}}}}}}}}

filter agg

Restrict a sub-aggregation to a subset of docs — like a one-off WHERE without affecting the outer query. Cheaper than re-running the query for each subset.

Examples

{"size":0,"aggs":{"in_stock_avg":{"filter":{"term":{"in_stock":true}},"aggs":{"avg_price":{"avg":{"field":"price"}}}}}}

top_hits agg

Inside each bucket, return the top N actual documents — by score or custom sort. Used for "show 3 sample docs per category" or "latest event per user" patterns.

Examples

{"size":0,"aggs":{"per_cat":{"terms":{"field":"category"},"aggs":{"sample":{"top_hits":{"size":3,"sort":[{"@timestamp":"desc"}]}}}}}}

significant_terms

Find terms that are statistically over-represented in a subset compared to the full corpus — surface "what makes this group different" without manual tuning.

Examples

{"size":0,"query":{"term":{"status":"fraud"}},"aggs":{"why_fraud":{"significant_terms":{"field":"ip_country"}}}}

value_count

Count how many values were extracted for a field, the metric counterpart of SQL COUNT(field). Docs where the field is missing contribute nothing and a multi-valued field contributes one per value, so the result can be lower or higher than the bucket doc_count.

Examples

{"size":0,"aggs":{"with_email":{"value_count":{"field":"email"}}}}

extended_stats

Like stats but adds variance, standard deviation, sum of squares, and standard-deviation bounds — the metrics you need for outlier detection and statistical anomaly thresholds.

Examples

{"size":0,"aggs":{"lat":{"extended_stats":{"field":"latency_ms","sigma":3}}}}

date_range

Bucket dates into named, explicit ranges (with date-math bounds like now-1M/M) — "this month vs last month vs older". Cleaner than date_histogram when you only need a handful of meaningful buckets.

Examples

{"size":0,"aggs":{"buckets":{"date_range":{"field":"@timestamp","ranges":[{"key":"last_7d","from":"now-7d/d"},{"key":"older","to":"now-7d/d"}]}}}}

nested agg

Step INTO a nested field so its sub-aggregations see one bucket per nested object, not per parent doc. Required to aggregate "average price across all variants" correctly when variants is a nested type. Pair with reverse_nested to climb back out.

Examples

{"size":0,"aggs":{"variants":{"nested":{"path":"variants"},"aggs":{"avg_price":{"avg":{"field":"variants.price"}}}}}}

filters agg (multi-bucket)

Define several named filters and get one bucket per filter in a single pass — "errors vs warnings vs info" counted together without three separate queries. The plural sibling of the single filter agg.

Examples

{"size":0,"aggs":{"levels":{"filters":{"filters":{"errors":{"term":{"level":"error"}},"warnings":{"term":{"level":"warn"}}}}}}}

global agg

Break a sub-aggregation out of the surrounding query so it sees ALL documents, not just the matched ones — compute "average price of everything" alongside "average price of the search results" in one request.

Examples

{"query":{"match":{"name":"wireless"}},"aggs":{"all":{"global":{},"aggs":{"avg_price":{"avg":{"field":"price"}}}}}}

bucket_script (pipeline)

A pipeline aggregation that computes a value FROM other sibling aggregations — e.g. conversion rate = sales ÷ visits per bucket. Lets you derive ratios and deltas in ES instead of post-processing in the client.

Examples

{"size":0,"aggs":{"by_day":{"date_histogram":{"field":"@timestamp","calendar_interval":"1d"},"aggs":{"sales":{"sum":{"field":"sale"}},"visits":{"sum":{"field":"visit"}},"cvr":{"bucket_script":{"buckets_path":{"s":"sales","v":"visits"},"script":"params.v > 0 ? params.s / params.v : 0"}}}}}}

derivative · cumulative_sum (pipeline)

Pipeline aggs over an ordered histogram — derivative gives the change between consecutive buckets ("daily growth"), cumulative_sum gives a running total ("total signups to date"). Both require a parent date_histogram or histogram.

Examples

{"size":0,"aggs":{"daily":{"date_histogram":{"field":"@timestamp","calendar_interval":"1d"},"aggs":{"signups":{"sum":{"field":"new_users"}},"growth":{"derivative":{"buckets_path":"signups"}},"total":{"cumulative_sum":{"buckets_path":"signups"}}}}}}
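
Sketch of what the two pipeline aggs compute over ordered bucket values. Plain Python on a list of per-day sums; the helper names and the sample numbers are invented for illustration.

```python
def derivative(values):
    """Change versus the previous bucket; the first bucket has nothing to compare with."""
    if not values:
        return []
    return [None] + [curr - prev for prev, curr in zip(values, values[1:])]

def cumulative_sum(values):
    """Running total up to and including each bucket."""
    total, out = 0, []
    for value in values:
        total += value
        out.append(total)
    return out

signups = [10, 15, 12, 20]
print(derivative(signups))      # daily growth
print(cumulative_sum(signups))  # total signups to date
```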

moving_fn (pipeline)

Run a window function (moving average, min, max, stdDev, linear-weighted) over a sliding window of histogram buckets — smooth a noisy time series or compute a 7-day trailing average right in the query.

Examples

{"size":0,"aggs":{"daily":{"date_histogram":{"field":"@timestamp","calendar_interval":"1d"},"aggs":{"v":{"sum":{"field":"sale"}},"ma7":{"moving_fn":{"buckets_path":"v","window":7,"script":"MovingFunctions.unweightedAvg(values)"}}}}}}
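
Sketch of the sliding window behind moving_fn. A minimal Python version of an unweighted moving average, assuming the default shift of 0, where the window holds the buckets before the current one and excludes the current bucket itself; it returns None for an empty window, which is a simplification.

```python
def moving_unweighted_avg(values, window):
    """For each bucket, average up to `window` preceding values (current bucket excluded)."""
    out = []
    for i in range(len(values)):
        prior = values[max(0, i - window):i]
        out.append(sum(prior) / len(prior) if prior else None)
    return out

daily_sales = [10, 20, 30, 40]
print(moving_unweighted_avg(daily_sales, window=2))
```

If the current bucket should be part of the average, the shift parameter moves the window forward.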

geohash_grid · geotile_grid

Bucket geo_point docs into a grid of cells at a chosen precision — the backend for heatmaps and clustered map markers. geotile_grid aligns to standard web-map tiles (z/x/y), so cells line up perfectly with map tiles.

Examples

{"size":0,"aggs":{"heat":{"geohash_grid":{"field":"location","precision":5}}}}

{"size":0,"aggs":{"tiles":{"geotile_grid":{"field":"location","precision":10}}}}

Analyzers (17)

standard analyzer

Default text analyzer — Unicode word boundaries (UAX #29), lowercase. Solid for most Western languages. NOT for CJK — Chinese / Japanese / Korean need ik / kuromoji / nori.

Examples

POST /_analyze {"analyzer":"standard","text":"Quick brown FOX 2026!"}

POST /_analyze

Test how a given analyzer tokenizes a string — the single most useful debugging tool when "my match query returns nothing". Always run this before opening a ticket.

Examples

POST /_analyze {"analyzer":"english","text":"Running quickly through the parks"}

POST /products/_analyze {"field":"name","text":"Wireless Headphones"}

language analyzers (english, french, ...)

Built-in language analyzers — stemming, stop words, lowercase per locale. english turns "running" into "run", "parks" into "park". Use these on user-facing text search for the right language.

Examples

{"properties":{"description":{"type":"text","analyzer":"english"}}}

custom analyzer

Compose char_filter + tokenizer + token_filter into your own analyzer. The recipe for autocomplete, search-as-you-type, dialect handling, and any non-default tokenization need.

Examples

PUT /demo {"settings":{"analysis":{"analyzer":{"my_a":{"type":"custom","tokenizer":"standard","filter":["lowercase","asciifolding","stop"]}}}}}

edge_ngram (autocomplete)

Generate prefixes of each token at index time — index "wireless" as ["wi","wir","wire",...,"wireless"]. Lets a simple match query power autocomplete with no leading wildcard.

⚠ Common pitfall: Use edge_ngram on the index analyzer ONLY. Set search_analyzer to standard / lowercase, or every search-time token also explodes into prefixes and you match way too much.

Examples

{"settings":{"analysis":{"filter":{"edge":{"type":"edge_ngram","min_gram":2,"max_gram":15}},"analyzer":{"ac_idx":{"tokenizer":"standard","filter":["lowercase","edge"]},"ac_search":{"tokenizer":"standard","filter":["lowercase"]}}}}}
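
Sketch of why the search side must not be edge-ngrammed. Toy Python that approximates the standard tokenizer with a whitespace split; the names mirror ac_idx / ac_search from the example but the functions themselves are hypothetical.

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """All prefixes of the token from min_gram up to max_gram characters."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]

def ac_idx_tokens(text):
    """Index side: lowercase, split, then expand every token into prefixes."""
    return {gram for tok in text.lower().split() for gram in edge_ngrams(tok)}

def ac_search_tokens(text):
    """Search side: lowercase and split only, no prefix expansion."""
    return set(text.lower().split())

def matches(query, indexed_text):
    """Correct setup: plain query tokens against indexed prefixes."""
    return bool(ac_search_tokens(query) & ac_idx_tokens(indexed_text))

def over_matches(query, indexed_text):
    """Pitfall: the query is expanded too, so any shared 2-char prefix matches."""
    return bool(ac_idx_tokens(query) & ac_idx_tokens(indexed_text))

print(edge_ngrams("wireless"))
print(matches("wir", "Wireless Headphones"))        # True: "wir" is an indexed prefix
print(matches("wine", "Wireless Headphones"))       # False: "wine" is not a prefix
print(over_matches("wine", "Wireless Headphones"))  # True: both sides contain "wi"
```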

ngram

Generate all substrings of length min_gram..max_gram from each token — enables "contains" search without wildcards. Costs disk and index time; reserve for short fields.

Examples

{"settings":{"analysis":{"filter":{"ng":{"type":"ngram","min_gram":3,"max_gram":4}},"analyzer":{"ng_a":{"tokenizer":"standard","filter":["lowercase","ng"]}}}}}

synonym filter

Map words to synonyms at index or search time — "tv" ⇄ "television", "laptop" ⇄ "notebook". Search-time synonyms cost no extra disk; index-time synonyms cost no extra query work but require reindex to update.

Examples

{"settings":{"analysis":{"filter":{"syn":{"type":"synonym","synonyms":["tv,television","laptop,notebook"]}},"analyzer":{"syn_a":{"tokenizer":"standard","filter":["lowercase","syn"]}}}}}

stop filter

Remove common stop words ("the", "a", "is") at index or search time. Saves index size and improves relevance — but breaks phrase queries that contain stop words ("to be or not to be").

Examples

{"settings":{"analysis":{"analyzer":{"my_a":{"tokenizer":"standard","filter":["lowercase","english_stop"]}},"filter":{"english_stop":{"type":"stop","stopwords":"_english_"}}}}}

asciifolding filter

Convert non-ASCII characters to their ASCII equivalents — "café" → "cafe", "naïve" → "naive". Lets users find accented terms by typing un-accented forms.

Examples

{"settings":{"analysis":{"analyzer":{"folding":{"tokenizer":"standard","filter":["lowercase","asciifolding"]}}}}}

CJK analyzers (ik, kuromoji, nori)

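The standard analyzer does not segment CJK text into words: Chinese and Japanese come out as single-character tokens and Korean gets no morphological analysis, so word-level matching and relevance suffer. Install the right plugin: ik for Chinese, kuromoji for Japanese, nori for Korean.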

⚠ Common pitfall: On managed services (AWS OpenSearch, Elastic Cloud), check whether the analyzer plugin is preinstalled before you ship — installing custom plugins may require a different tier.

Examples

PUT /zh_demo {"settings":{"analysis":{"analyzer":{"ik_a":{"type":"custom","tokenizer":"ik_smart"}}}}}

keyword analyzer

A no-op analyzer that emits the whole input as one token — no splitting, no lowercasing. Used on a text field when you actually want exact, un-tokenized matching but still want it to behave like text (e.g. for highlighting).

Examples

POST /_analyze {"analyzer":"keyword","text":"Wireless Headphones X1"}

whitespace · pattern · simple analyzers

Built-in lightweight analyzers: whitespace splits only on whitespace (keeps case and punctuation), pattern splits on a regex you supply (and lowercases by default), simple splits on non-letters and lowercases. Handy for log tokens, codes, and CSV-like fields.

Examples

POST /_analyze {"analyzer":"whitespace","text":"ERROR 500 /api/v1"}

PUT /demo {"settings":{"analysis":{"analyzer":{"by_comma":{"type":"pattern","pattern":","}}}}}

search_as_you_type field

A field type that builds the sub-fields for you: shingle sub-fields (._2gram, ._3gram) plus an edge-ngram sub-field (._index_prefix), so a single multi_match of type bool_prefix powers as-you-type search without hand-wiring an edge_ngram analyzer.

Examples

{"properties":{"q":{"type":"search_as_you_type"}}}

{"query":{"multi_match":{"query":"head","type":"bool_prefix","fields":["q","q._2gram","q._3gram"]}}}

char_filter (mapping, html_strip, pattern_replace)

Pre-process the raw text BEFORE tokenization — html_strip removes tags, mapping swaps characters ("&" → "and"), pattern_replace runs a regex substitution. Runs before the tokenizer, so it can fix input the tokenizer would mangle.

Examples

PUT /demo {"settings":{"analysis":{"analyzer":{"clean":{"tokenizer":"standard","char_filter":["html_strip"]}}}}}

normalizer (keyword case-insensitive)

A tokenizer-free pipeline for keyword fields: only char filters and character-level token filters such as lowercase / asciifolding, so "USA" and "usa" are indexed as the same single term. The right way to get case-insensitive exact match without switching to text.

Examples

PUT /demo {"settings":{"analysis":{"normalizer":{"lc":{"type":"custom","filter":["lowercase","asciifolding"]}}},"mappings":{"properties":{"country":{"type":"keyword","normalizer":"lc"}}}}}
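
Sketch of what the lc normalizer does to a keyword value. A rough Python approximation: lowercase, then strip combining accents via Unicode decomposition. Real asciifolding covers more characters than this; `normalize_keyword` is an illustrative name.

```python
import unicodedata

def normalize_keyword(value):
    """Approximate lowercase + asciifolding on the whole value (no tokenization)."""
    lowered = value.lower()
    decomposed = unicodedata.normalize("NFKD", lowered)
    # Drop combining marks such as the acute accent left over from decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(normalize_keyword("USA"))
print(normalize_keyword("Côte d'Ivoire"))
```

The value stays one term, spaces included, which is the difference from running the same filters inside a text analyzer.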

analyzer vs search_analyzer

A field can use one analyzer at index time and a different one at search time. The classic pairing: edge_ngram on index, standard on search — so "wir" is indexed as prefixes but the query "wir" stays a single token.

⚠ Common pitfall: If you only set "analyzer", it applies to BOTH index and search. Forgetting to set search_analyzer with an edge_ngram index analyzer is the #1 cause of "autocomplete matches way too much".

Examples

{"properties":{"name":{"type":"text","analyzer":"ac_idx","search_analyzer":"standard"}}}

stemmer · kstem · porter

Token filters that reduce words to a root form: "running"/"runs" → "run" (irregular forms such as "ran" are usually left untouched by rule-based stemmers). Choose algorithm by aggressiveness: kstem (light, keeps real words), porter/porter2 (classic), or a language-specific stemmer. Over-stemming hurts precision.

Examples

{"settings":{"analysis":{"filter":{"en_stem":{"type":"stemmer","language":"light_english"}},"analyzer":{"a":{"tokenizer":"standard","filter":["lowercase","en_stem"]}}}}}

Cluster & _cat (18)

GET /_cluster/health

Cluster health summary — status (green/yellow/red), node count, shard count, unassigned shards. green = all primaries + replicas assigned, yellow = primaries OK but some replicas unassigned, red = some primaries unassigned (data loss risk).

Examples

GET /_cluster/health

GET /_cluster/health?level=indices&wait_for_status=green&timeout=30s
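
Sketch of the green / yellow / red rule as stated above. Simplified Python that only looks at UNASSIGNED shards; the shard dictionaries are a made-up shape, not the API response format.

```python
def health_status(shards):
    """red: a primary is unassigned; yellow: only replicas are unassigned; green: none are."""
    unassigned = [s for s in shards if s["state"] == "UNASSIGNED"]
    if any(s["primary"] for s in unassigned):
        return "red"
    if unassigned:
        return "yellow"
    return "green"

single_node = [
    {"index": "products", "primary": True, "state": "STARTED"},
    {"index": "products", "primary": False, "state": "UNASSIGNED"},  # replica has no second node
]
print(health_status(single_node))
```

A one-node cluster with replicas=1 stays yellow for exactly this reason: a replica is never placed on the same node as its primary.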

GET /_cluster/stats

Cluster-wide stats — total docs, store size, JVM heap, OS load, indexing / search rates. The "top-level health snapshot" before drilling into nodes/indices.

Examples

GET /_cluster/stats

GET /_cluster/settings

Show cluster-wide persistent and transient settings (add include_defaults=true to see the defaults as well). persistent survives a full cluster restart; transient is cleared on restart. You almost always want persistent.

Examples

GET /_cluster/settings?include_defaults=true

PUT /_cluster/settings {"persistent":{"cluster.routing.allocation.disk.watermark.low":"85%"}}

GET /_cluster/allocation/explain

Explain why a shard is or is not assigned to a node — the single best tool for debugging "yellow cluster forever" or "shard stuck in INITIALIZING". Returns the actual reason per node.

Examples

GET /_cluster/allocation/explain {"index":"products","shard":0,"primary":true}

GET /_cat/health

_cat APIs — human-friendly text tables, perfect for terminals and scripts. /_cat/health is the at-a-glance "is the cluster OK" check.

Examples

GET /_cat/health?v

GET /_cat/health?h=status,node.total,shards

GET /_cat/nodes

List every node — IP, role (m=master, d=data, i=ingest), heap %, CPU, load. Add ?v for a header row and ?h=name,role,heap.percent,ram.percent for tighter output.

Examples

GET /_cat/nodes?v

GET /_cat/nodes?h=name,role,heap.percent,ram.percent,cpu&v

GET /_cat/indices

Per-index summary — health, status, primary/replica counts, doc count, store size. Add ?s=store.size:desc to find your biggest indices fast.

Examples

GET /_cat/indices?v

GET /_cat/indices?v&s=store.size:desc&bytes=gb

GET /_cat/shards

List every shard with state (STARTED, INITIALIZING, RELOCATING, UNASSIGNED), node, size. Use ?h=index,shard,prirep,state,unassigned.reason to debug unassigned shards.

Examples

GET /_cat/shards?v

GET /_cat/shards?h=index,shard,prirep,state,unassigned.reason&v

GET /_cat/aliases · /_cat/templates · /_cat/recovery

More _cat helpers: aliases lists alias→index mappings, templates lists index templates by pattern, recovery shows shard recovery progress (useful during rolling restart).

Examples

GET /_cat/aliases?v

GET /_cat/templates?v

GET /_cat/recovery?v&active_only=true

GET /_nodes/stats

Detailed per-node stats — JVM, OS, process, fs, indices, thread_pool. Drill in with /_nodes/stats/thread_pool to find a saturated search / write thread pool.

Examples

GET /_nodes/stats/jvm,thread_pool

GET /_nodes/<node>/stats/indices/search

GET /_tasks

List long-running tasks (reindex, update_by_query, force-merge, snapshot) with progress. Cancel via POST /_tasks/<id>/_cancel — your kill-switch for runaway operations.

Examples

GET /_tasks?actions=*reindex&detailed=true

POST /_tasks/<task-id>/_cancel

GET /_nodes/hot_threads

Sample the hottest (most CPU-busy) threads across nodes — shows the actual stack traces eating CPU right now. The go-to when a node is pegged at 100% and you need to know which operation is to blame.

Examples

GET /_nodes/hot_threads?threads=3&interval=500ms

GET /_cat/thread_pool

Per-node thread-pool stats — active, queue, rejected counts for write, search, and other pools. A growing queue or non-zero rejected on the write/search pool is the clearest signal the cluster is overloaded.

Examples

GET /_cat/thread_pool/write,search?v&h=node_name,name,active,queue,rejected

POST /_cluster/reroute?retry_failed=true

Ask the cluster to re-attempt assigning shards that hit the max allocation-retry limit (5 by default) — the manual nudge after you have fixed the underlying cause (freed disk, restarted a node) of an UNASSIGNED shard.

⚠ Common pitfall: reroute?retry_failed only helps once the ROOT cause is gone — running it while disk is still full or a node is still down just burns the retry budget again. Check allocation/explain first.

Examples

POST /_cluster/reroute?retry_failed=true

GET /_cat/segments · /_cat/fielddata

More _cat diagnostics — segments shows per-shard Lucene segment counts/sizes (high count → consider force-merge on read-only indices), fielddata shows heap consumed by fielddata per field (find the field bloating heap).

Examples

GET /_cat/segments/products?v

GET /_cat/fielddata?v&s=size:desc

GET /_cat/count · /_cat/master · /_cat/allocation

Quick one-liners — count gives total docs across an index pattern, master shows the elected master node, allocation shows disk used/available and shard count per node (spot a lopsided node fast).

Examples

GET /_cat/count/logs-*?v

GET /_cat/master?v

GET /_cat/allocation?v

cluster.routing.allocation.enable

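Temporarily control shard allocation: restrict it before a rolling restart so the cluster does not waste effort rebalancing shards of a node you are about to bring right back. "none" stops all allocation; "primaries" (the value the official rolling-restart procedure uses) still lets primaries allocate. Set back to "all" afterwards.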

⚠ Common pitfall: Forgetting to set it back to "all" after maintenance leaves new/replica shards permanently UNASSIGNED — a yellow/red cluster that no amount of waiting fixes. Always pair the disable with a re-enable.

Examples

PUT /_cluster/settings {"persistent":{"cluster.routing.allocation.enable":"none"}}

PUT /_cluster/settings {"persistent":{"cluster.routing.allocation.enable":"all"}}

GET /_cluster/pending_tasks

List cluster-state update tasks still waiting on the master — mapping updates, index creation, settings changes. A long pending-tasks queue means the master is a bottleneck (often from too many shards or rapid mapping churn).

Examples

GET /_cluster/pending_tasks

Replication & snapshots (14)

number_of_replicas

Replicas per primary shard — 1 is the production minimum (data survives one node loss), 2+ for higher availability or read throughput. Set 0 ONLY for throw-away test indices.

⚠ Common pitfall: replicas=0 means a single node loss = data loss. Lots of "test" indices accidentally become "kinda production" — set replicas=1 by default in every index template.

Examples

PUT /products/_settings {"index":{"number_of_replicas":2}}

PUT /_index_template/default {"index_patterns":["*"],"template":{"settings":{"number_of_replicas":1}}}

PUT /_snapshot/<repo>

Register a snapshot repository: S3, GCS, Azure, HDFS, or a shared filesystem such as NFS. Required before you can take or restore any snapshot. The only built-in disaster-recovery path.

Examples

PUT /_snapshot/s3_backups {"type":"s3","settings":{"bucket":"my-es-backups","region":"us-east-1"}}

PUT /_snapshot/<repo>/<snap>

Take a snapshot. Incremental — only new segments since the last snapshot in the same repo are uploaded. Safe to run on a live cluster.

Examples

PUT /_snapshot/s3_backups/2026-05-26?wait_for_completion=false {"indices":"products,logs-*","include_global_state":false}

POST /_snapshot/<repo>/<snap>/_restore

Restore a snapshot — optionally rename indices on restore so you can compare with current data before swapping aliases. The standard "oh shit" rollback path.

⚠ Common pitfall: Restore refuses to overwrite an existing OPEN index. Either close the index first or use rename_pattern + rename_replacement to restore to a side-by-side name.

Examples

POST /_snapshot/s3_backups/2026-05-26/_restore {"indices":"products","rename_pattern":"(.+)","rename_replacement":"restored_$1"}
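To preview what a rename_pattern / rename_replacement pair will produce before running a restore, the substitution can be approximated locally. A rough sketch: Elasticsearch applies Java regex rules, so the helper below (a hypothetical name) only translates the `$1` group syntax into Python's and may differ on more exotic patterns.

```python
import re


def restored_name(index, rename_pattern, rename_replacement):
    """Approximate the index name a restore would create."""
    # Java writes capture-group references as "$1"; Python's re module
    # writes the same reference as "\g<1>".
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", rename_replacement)
    return re.sub(rename_pattern, py_replacement, index)


print(restored_name("products", "(.+)", "restored_$1"))
print(restored_name("logs-2026.05", "logs-(.+)", "old-logs-$1"))
```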

GET /_snapshot/<repo>/_all

List all snapshots in a repo with state, start/end time, shard counts, failures. Required for any backup audit or scripted "find latest good snapshot" logic.

Examples

GET /_snapshot/s3_backups/_all

SLM (snapshot lifecycle management)

Built-in scheduler for periodic snapshots + retention. Beats hand-rolled cron because retention is policy-driven (keep 30 daily, 12 monthly, etc.) and ES enforces it.

Examples

PUT /_slm/policy/daily {"schedule":"0 30 1 * * ?","name":"<daily-{now/d}>","repository":"s3_backups","config":{"indices":["*"]},"retention":{"expire_after":"30d","min_count":7,"max_count":50}}

cross-cluster replication (CCR)

Continuously replicate indices from a leader cluster to one or more follower clusters — DR across regions, read-scaling, or geo-locality. Platinum/Enterprise license.

Examples

PUT /products-replica/_ccr/follow?wait_for_active_shards=1 {"remote_cluster":"leader_cluster","leader_index":"products"}

ILM (index lifecycle management)

Move indices through hot → warm → cold → frozen → delete phases automatically by age, size, or doc count. Pair with rollover for time-based logging at any scale.

Examples

PUT /_ilm/policy/logs_policy {"policy":{"phases":{"hot":{"actions":{"rollover":{"max_age":"7d","max_size":"50gb"}}},"warm":{"min_age":"30d","actions":{"forcemerge":{"max_num_segments":1}}},"delete":{"min_age":"90d","actions":{"delete":{}}}}}}

DELETE /_snapshot/<repo>/<snap>

Delete a snapshot. Because snapshots are incremental, ES only frees the segments not referenced by any other snapshot in the repo — deleting an old snapshot may free far less than its nominal size.

Examples

DELETE /_snapshot/s3_backups/2026-04-01
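Why a delete frees less than expected can be shown with plain set arithmetic. This is an illustrative model only: the snapshot names and segment IDs are made up, and the real repository layout is more involved.

```python
def segments_freed(snapshots, to_delete):
    """Return the segment files left unreferenced after deleting a snapshot.

    snapshots maps a snapshot name to the set of segment files it
    references (hypothetical IDs, for illustration).
    """
    still_referenced = set()
    for name, files in snapshots.items():
        if name != to_delete:
            still_referenced |= files
    return snapshots[to_delete] - still_referenced


snapshots = {
    "2026-04-01": {"seg_a", "seg_b", "seg_c"},
    "2026-04-02": {"seg_b", "seg_c", "seg_d"},
    "2026-04-03": {"seg_c", "seg_d", "seg_e"},
}
freed = segments_freed(snapshots, "2026-04-01")
print(freed)
```

The oldest snapshot references three segments, but only the one that no later snapshot shares can actually be removed.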

GET /_snapshot/<repo>/<snap>/_status

Detailed, real-time progress of a running or finished snapshot — per-shard bytes done vs total, file counts, state. Use this to watch a big snapshot, not _all (which only shows coarse state).

Examples

GET /_snapshot/s3_backups/2026-05-26/_status

POST /_snapshot/<repo>/_verify · _cleanup

verify checks that every node can read/write the repository (catches credential or network problems before a backup silently fails). cleanup removes orphaned data left in the repo by interrupted snapshot deletes.

Examples

POST /_snapshot/s3_backups/_verify

POST /_snapshot/s3_backups/_cleanup

searchable snapshots

Mount a snapshot directly as a searchable index without a full restore — the data stays in object storage (S3) and is fetched on demand. Powers the ILM cold/frozen tiers, cutting storage cost for rarely queried old data. Enterprise license.

Examples

POST /_snapshot/s3_backups/2026-01/_mount?wait_for_completion=true {"index":"logs-2026.01","renamed_index":"logs-2026.01-cold"}

wait_for_active_shards (write consistency)

On a write, require N copies of the target shard to be active before ES accepts it. Default 1 (just the primary); set "all" or a quorum when a single-copy ack is not durable enough for your data.

⚠ Common pitfall: Setting it higher than the number of available copies makes writes BLOCK until timeout, then fail — do not set "all" on an index with replicas=2 unless you can guarantee all replicas are up.

Examples

PUT /orders/_doc/1?wait_for_active_shards=2 {"total":99}
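A quick way to pick a quorum value and to see when a write would block. This is a sketch of the arithmetic only; the function names are hypothetical and nothing here talks to a cluster.

```python
def quorum_active_shards(number_of_replicas):
    """Majority of all copies of a shard (1 primary + N replicas)."""
    total_copies = 1 + number_of_replicas
    return total_copies // 2 + 1


def write_would_block(wait_for_active_shards, active_copies):
    """True when fewer copies are active than the write requires."""
    return active_copies < wait_for_active_shards


# replicas=2 gives 3 copies, so a quorum is 2.
print(quorum_active_shards(2))
# "all" on replicas=2 means 3; with one replica down only 2 are active.
print(write_would_block(3, active_copies=2))
```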

allocation awareness (rack / zone)

Tell ES which rack or availability-zone each node is in, and it will spread primary + replica across zones so one zone failure never takes out both copies of a shard. Essential for multi-AZ production clusters.

Examples

# elasticsearch.yml
node.attr.zone: zone-a
cluster.routing.allocation.awareness.attributes: zone

PUT /_cluster/settings {"persistent":{"cluster.routing.allocation.awareness.attributes":"zone"}}

Common pitfalls (16)

Heap > 32GB hits compressed-oops cliff

JVM compressed object pointers stop working above ~32GB heap: pointer size doubles, GC pressure spikes, throughput drops. Keep heap at or below 30-31GB even on a 256GB box; run multiple nodes per host instead.

Examples

# jvm.options (good)
-Xms30g
-Xmx30g

# jvm.options (bad)
-Xms64g
-Xmx64g  # crossed the cliff
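A small helper for choosing the value to put in jvm.options. It is a sketch under two assumptions: the 30GB ceiling comes from the entry above, and the "no more than half of RAM" cap is the usual Elastic sizing guidance rather than something stated on this page.

```python
def recommended_heap_gb(ram_gb, ceiling_gb=30):
    """Heap size in GB: at most half of RAM, never above the ceiling."""
    # Half of RAM leaves the rest for the OS filesystem cache; the ceiling
    # keeps the JVM below the compressed-oops threshold.
    return min(ram_gb // 2, ceiling_gb)


def jvm_options(ram_gb):
    """Render the two heap lines for jvm.options."""
    heap = recommended_heap_gb(ram_gb)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]


print(jvm_options(16))
print(jvm_options(256))
```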

Mapping explosion from dynamic:true

Default dynamic:true on user JSON auto-creates a field per unique key. Bad client = millions of fields, gigabytes of cluster state, full GC, cluster unresponsive. The #1 reason production ES dies.

⚠ Common pitfall: Always set dynamic:strict on user input. Cap with index.mapping.total_fields.limit (default 1000) so that ES rejects any write that would push the index past the limit. Audit GET /_cluster/state/metadata weekly for field count.

Examples

PUT /events_strict {"mappings":{"dynamic":"strict","properties":{"@timestamp":{"type":"date"},"event":{"type":"keyword"}}}}

Deep paging with from + size

Every from + size request asks every shard for (from + size) docs, then merges. from=10000 + size=10 on 5 shards = each shard ships 10010 docs = blown heap. Default cap: index.max_result_window=10000.

⚠ Common pitfall: Never raise max_result_window to "fix" deep paging. Use search_after, or point_in_time + search_after for a stable view. Their cost scales with the page size, not with how deep you are.

Examples

# wrong — will blow up
GET /products/_search {"from":100000,"size":10}

# right — sort + search_after
GET /products/_search {"size":10,"sort":[{"_id":"asc"}],"search_after":["abc123"]}
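The cost difference is easy to see in numbers, and the search_after loop is short. A sketch under assumptions: `search` is a hypothetical callable that takes a request body and returns the hits list, `sku` is a made-up unique keyword field used as the sort key, and the demo runs against an in-memory fake instead of a cluster.

```python
def docs_collected(num_shards, from_, size):
    """Sorted entries the coordinating node gathers for one from + size page."""
    return num_shards * (from_ + size)


def scan(search, page_size):
    """Yield every hit by following search_after cursors.

    search is a hypothetical callable: request body in, list of hits out,
    each hit carrying the "sort" array Elasticsearch returns for sorted
    requests.
    """
    body = {"size": page_size, "sort": [{"sku": "asc"}]}
    while True:
        hits = search(body)
        if not hits:
            return
        yield from hits
        # Resume right after the last hit of this page.
        body["search_after"] = hits[-1]["sort"]


SKUS = ["a1", "a2", "b1", "b2", "c1"]


def fake_search(body):
    """In-memory stand-in for GET /products/_search."""
    after = body.get("search_after", [""])[0]
    matching = [s for s in SKUS if s > after]
    return [{"_id": s, "sort": [s]} for s in matching[: body["size"]]]


print(docs_collected(5, 10000, 10))
print([hit["_id"] for hit in scan(fake_search, 2)])
```

Each search_after page asks every shard for only `size` entries, however far into the result set the cursor is.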

Refresh interval at default 1s

Default 1s refresh is fine for low-write apps but eats CPU on heavy write/log workloads — each refresh creates new segments, which later have to be merged. For log indices, raise refresh_interval to 30s.

Examples

PUT /logs-2026.05/_settings {"index":{"refresh_interval":"30s"}}

PUT /products/_settings {"index":{"refresh_interval":"-1"}}  # disable; manual _refresh only

Disk watermark blocks writes

At 85% disk full, ES stops allocating new shards to a node. At 90% it actively relocates. At 95% (flood_stage) it locks every index on that node into read-only. Free disk fast or remove the read-only block manually.

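⚠ Common pitfall: Unlock with PUT /<index>/_settings {"index.blocks.read_only_allow_delete":null}, but ONLY after freeing disk, otherwise it locks again on the next watermark check. On 7.4+ the block is also released automatically once disk usage drops back below the high watermark.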

Examples

PUT /_cluster/settings {"persistent":{"cluster.routing.allocation.disk.watermark.low":"85%","cluster.routing.allocation.disk.watermark.high":"90%","cluster.routing.allocation.disk.watermark.flood_stage":"95%"}}

PUT /*/_settings {"index.blocks.read_only_allow_delete":null}
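For alerting, the three thresholds can be turned into a simple classifier. A minimal sketch using the default percentages listed above; the function name and return labels are hypothetical, and the input would typically come from the disk.percent column of GET /_cat/allocation.

```python
def watermark_stage(disk_used_pct, low=85.0, high=90.0, flood=95.0):
    """Map a node's disk usage to the watermark it has crossed."""
    if disk_used_pct >= flood:
        return "flood_stage"  # indices on the node go read-only
    if disk_used_pct >= high:
        return "high"         # shards are relocated away
    if disk_used_pct >= low:
        return "low"          # no new shards allocated here
    return "ok"


for pct in (50, 86, 91, 96):
    print(pct, watermark_stage(pct))
```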

Searching wrong field type

A term query against a text field usually misses, because the value was lowercased and tokenized at index time while the term is compared as-is. A match query against a keyword field only hits when the whole value matches exactly, because keyword is not analyzed. Always check GET /<index>/_mapping before writing a query.

Examples

# wrong — term on text
{"query":{"term":{"name":"Apple"}}}  # 0 hits

# right — term on keyword sub-field
{"query":{"term":{"name.keyword":"Apple"}}}
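The mismatch can be reproduced without a cluster. The sketch below uses a deliberately crude stand-in for the standard analyzer (split on non-word characters, lowercase); the real analyzer does more, so treat it as an illustration of the mechanism only.

```python
import re


def rough_standard_analyze(text):
    """Crude stand-in for the standard analyzer: split and lowercase."""
    return [token.lower() for token in re.findall(r"\w+", text)]


def term_matches(indexed_terms, term):
    """A term query compares the given value to indexed terms as-is."""
    return term in indexed_terms


value = "Apple iPhone"
text_terms = rough_standard_analyze(value)  # what a text field indexes
keyword_terms = [value]                     # what a keyword field indexes

print(text_terms)
print(term_matches(text_terms, "Apple"))            # term on text
print(term_matches(text_terms, "apple"))            # only the lowercased token exists
print(term_matches(keyword_terms, "Apple iPhone"))  # term on keyword
```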

Bulk size too small / too big

Bulk requests have a sweet spot around 5-15MB. Smaller = network round-trips dominate. Bigger = HTTP 413, OOM, or coordinating node falls over. Measure your payload — pick a doc count that lands in the band.

Examples

# python elasticsearch helpers
from elasticsearch.helpers import bulk
bulk(client, actions, chunk_size=2000, max_chunk_bytes=10*1024*1024)
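To measure payloads yourself, or to batch without the helpers, documents can be grouped by serialized size. A stdlib-only sketch: `chunk_actions` is a hypothetical helper, the index name is made up, and the tiny 200-byte budget is only there to make the demo visible (a real budget would sit in the 5-15MB band).

```python
import json


def chunk_actions(docs, index, max_bytes):
    """Group docs into _bulk NDJSON payloads of at most max_bytes each.

    A single document larger than max_bytes still gets its own payload.
    """
    batch, batch_bytes = [], 0
    for doc in docs:
        # One action line plus one source line per document.
        lines = json.dumps({"index": {"_index": index}}) + "\n" + json.dumps(doc) + "\n"
        size = len(lines.encode("utf-8"))
        if batch and batch_bytes + size > max_bytes:
            yield "".join(batch)
            batch, batch_bytes = [], 0
        batch.append(lines)
        batch_bytes += size
    if batch:
        yield "".join(batch)


docs = [{"n": i} for i in range(10)]
payloads = list(chunk_actions(docs, "logs", max_bytes=200))
for payload in payloads:
    print(len(payload.encode("utf-8")), "bytes,", payload.count("\n") // 2, "docs")
```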

Split brain (pre-7.x) / quorum loss (7.x+)

Pre-7.x: misconfigured discovery.zen.minimum_master_nodes could split a cluster into two "masters". 7.x+ uses voting config (auto-managed) — safer, but you still need ≥ 3 master-eligible nodes for quorum.

⚠ Common pitfall: Two master-eligible nodes is the WORST setup — any one going down loses quorum. Use 1 (test only), 3, 5, or 7 — odd numbers ≥ 3 for production.

Examples

# 3 master-eligible nodes, no voting-only nodes
node.roles: [ master, data, ingest ]
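The odd-number advice follows from plain majority arithmetic. This sketch shows only that arithmetic; the 7.x+ voting configuration adjusts itself automatically, so real behavior with an even node count can differ in detail.

```python
def quorum(master_eligible):
    """Votes needed for a majority of master-eligible nodes."""
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many master-eligible nodes can fail while a majority remains."""
    return master_eligible - quorum(master_eligible)


for nodes in (1, 2, 3, 4, 5):
    print(nodes, "nodes: quorum", quorum(nodes), "tolerates", tolerated_failures(nodes))
```

Two nodes tolerate no more failures than one, and four tolerate no more than three, which is why the even counts buy nothing.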

fielddata: true on text bloats heap

Enabling fielddata:true on a text field to "make it sortable" loads every term + every doc-id into heap. Easy way to OOM a node. Instead, add a keyword sub-field and aggregate/sort on that.

Examples

# wrong
{"properties":{"name":{"type":"text","fielddata":true}}}

# right
{"properties":{"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}}

Too many shards per node

Every shard carries fixed heap and cluster-state overhead regardless of how small it is. Thousands of tiny shards bloat the master, slow recovery, and exhaust heap. Rule of thumb: aim for shards of 10-50GB and keep well under ~20 shards per GB of heap.

⚠ Common pitfall: Oversharding usually comes from "1 index per customer per day" with default shard counts. Consolidate with data streams + rollover by size, and use _shrink on old read-only indices.

Examples

GET /_cat/allocation?v&h=node,shards,disk.indices

# consolidate: rollover by 50gb instead of 1 index/day
POST /logs/_rollover {"conditions":{"max_size":"50gb"}}
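The rule of thumb translates into two quick calculations. A sketch only: the 20-shards-per-GB ceiling and the 10-50GB band come from the entry above, while the 30GB default target is an arbitrary midpoint chosen for the example.

```python
import math


def max_shards_for_heap(heap_gb, shards_per_gb=20):
    """Upper bound on shards for a node; aim to stay well under it."""
    return heap_gb * shards_per_gb


def primaries_for(index_size_gb, target_shard_gb=30):
    """Primary shard count that keeps each shard near the target size."""
    return max(1, math.ceil(index_size_gb / target_shard_gb))


print(max_shards_for_heap(30))
print(primaries_for(80))
print(primaries_for(5))
```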

Aggregating on a high-cardinality keyword

A terms agg over millions of distinct keyword values builds a huge bucket map in heap and is still only approximate at default size. It is the second most common OOM source after fielddata-on-text.

⚠ Common pitfall: For "list ALL unique values" use a composite agg (paginated, bounded memory), not a terms agg with size:1000000. For distinct COUNT only, use cardinality (HLL) instead.

Examples

# wrong — builds a giant bucket map
{"aggs":{"u":{"terms":{"field":"user_id","size":1000000}}}}

# right — paginate with composite
{"aggs":{"u":{"composite":{"size":1000,"sources":[{"uid":{"terms":{"field":"user_id"}}}]}}}}
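Paging through a composite agg means feeding each response's after_key back in as `after`. A sketch under assumptions: `search` is a hypothetical callable returning the parsed response, and the demo uses an in-memory fake that mimics the response shape rather than a real cluster.

```python
def all_buckets(search, field, page_size):
    """Yield every composite bucket, one bounded page at a time."""
    composite = {"size": page_size, "sources": [{"uid": {"terms": {"field": field}}}]}
    body = {"size": 0, "aggs": {"u": {"composite": composite}}}
    while True:
        agg = search(body)["aggregations"]["u"]
        yield from agg["buckets"]
        after = agg.get("after_key")
        if after is None:
            return  # an empty page carries no after_key
        composite["after"] = after


USERS = ["u1", "u2", "u3", "u4", "u5"]


def fake_search(body):
    """In-memory stand-in that mimics a composite agg response."""
    comp = body["aggs"]["u"]["composite"]
    after = comp.get("after", {}).get("uid", "")
    page = [u for u in USERS if u > after][: comp["size"]]
    agg = {"buckets": [{"key": {"uid": u}, "doc_count": 1} for u in page]}
    if page:
        agg["after_key"] = {"uid": page[-1]}
    return {"aggregations": {"u": agg}}


keys = [b["key"]["uid"] for b in all_buckets(fake_search, "user_id", 2)]
print(keys)
```

Memory stays bounded by page_size on every request, at the price of one round-trip per page.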

Wildcard / regexp / leading-wildcard on big fields

wildcard:"*foo*" and unanchored regexp cannot use the sorted inverted index — they scan every term in the field. On a high-cardinality field this turns a 5ms query into a multi-second cluster-stressing scan.

⚠ Common pitfall: For "contains" search use a wildcard field type (7.9+) or an ngram analyzer at index time; for "starts with" use prefix or a completion suggester. Reserve real wildcard queries for small, low-cardinality fields.

Examples

# slow on large fields
{"query":{"wildcard":{"msg.keyword":"*timeout*"}}}

# better — index with ngram and use match
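A sketch of what the ngram route could look like. Everything named here is an assumption for illustration: the `tri` tokenizer, the `tri_analyzer` analyzer, the `msg` field, and the choice of trigrams. The `ngrams` function imitates the tokenizer locally so the idea can be checked without a cluster.

```python
def ngrams(text, n=3):
    """Lowercased character n-grams, similar to an ngram tokenizer."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


# Hypothetical index body: trigram analyzer on a text field.
index_body = {
    "settings": {"analysis": {
        "tokenizer": {"tri": {"type": "ngram", "min_gram": 3, "max_gram": 3}},
        "analyzer": {"tri_analyzer": {
            "type": "custom", "tokenizer": "tri", "filter": ["lowercase"]}},
    }},
    "mappings": {"properties": {"msg": {"type": "text", "analyzer": "tri_analyzer"}}},
}

# operator "and" requires every trigram of the query to be present.
query = {"query": {"match": {"msg": {"query": "timeout", "operator": "and"}}}}

needle = set(ngrams("timeout"))
haystack = set(ngrams("Connection timeout after 30s"))
print(sorted(needle))
print(needle <= haystack)
```

Requiring all trigrams approximates "contains" using ordinary term lookups; it can still produce false positives when the trigrams occur but not contiguously, and the index grows with every gram stored.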

Scripts in hot query paths

Painless in a script_score, script query, or runtime field runs per matching doc, every query — handy but CPU-heavy at scale. A script that touches doc-values across millions of hits will dominate your latency.

⚠ Common pitfall: Pre-compute the value at index time into a real field whenever the inputs are known at write time. Reserve scripts for genuinely dynamic, per-request logic — and always filter down the candidate set first.

Examples

# expensive
{"query":{"script_score":{"query":{"match_all":{}},"script":{"source":"Math.log(2 + doc['votes'].value)"}}}}

# cheaper — precompute log_votes at index time, sort on it
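Precomputing can be as small as enriching the document before it is indexed. A minimal sketch: `log_votes` is a hypothetical field name, and the formula mirrors the Painless expression above (Math.log is the natural logarithm, as is Python's math.log).

```python
import math


def with_log_votes(doc):
    """Return a copy of doc with log_votes computed at write time."""
    enriched = dict(doc)
    enriched["log_votes"] = math.log(2 + doc["votes"])
    return enriched


doc = with_log_votes({"title": "Wireless mouse", "votes": 98})
print(doc)

# At query time, sort on the stored value instead of running a script.
search_body = {"query": {"match_all": {}}, "sort": [{"log_votes": "desc"}]}
```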

now in a filter kills the request cache

A range filter using bare now changes value every millisecond, so the shard request cache never gets a hit. Rounding to now/d (or now/h) makes the value stable within the day/hour and lets the cache work.

⚠ Common pitfall: Always round date-math in cacheable filters: now-7d/d instead of now-7d. The difference is the entire shard request cache being usable or dead for time-window dashboards.

Examples

# cache-busting
{"filter":{"range":{"@timestamp":{"gte":"now-7d"}}}}

# cache-friendly
{"filter":{"range":{"@timestamp":{"gte":"now-7d/d"}}}}
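The effect of rounding is easy to see by resolving both expressions at two moments on the same day. The sketch imitates the date math in Python for a gte bound, where /d rounds down to the start of the UTC day; it does not call Elasticsearch.

```python
from datetime import datetime, timedelta, timezone


def now_minus_7d(now):
    """What "now-7d" resolves to: a different value on every request."""
    return now - timedelta(days=7)


def now_minus_7d_rounded(now):
    """What "now-7d/d" resolves to for gte: start of that UTC day."""
    return (now - timedelta(days=7)).replace(hour=0, minute=0, second=0, microsecond=0)


morning = datetime(2026, 5, 26, 9, 15, 0, tzinfo=timezone.utc)
evening = datetime(2026, 5, 26, 21, 40, 30, tzinfo=timezone.utc)

print(now_minus_7d(morning) == now_minus_7d(evening))
print(now_minus_7d_rounded(morning) == now_minus_7d_rounded(evening))
```

With the rounded form, every request issued that day resolves to the same bound, so repeated dashboard queries can share a cache entry.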

Index-time vs search-time analyzer mismatch

If a field was indexed with one analyzer and you query it expecting another, tokens never line up and match returns nothing. Common after changing an analyzer in the mapping WITHOUT reindexing the existing data.

⚠ Common pitfall: Changing an analyzer only affects docs indexed AFTER the change. Run POST /<index>/_analyze with both analyzers on the same text to confirm the tokens match, and reindex old data if they do not.

Examples

POST /products/_analyze {"field":"name","text":"Wireless"}

Returning huge _source per hit

Pulling a 50KB _source for every one of 1000 hits ships 50MB you may not need. If you only render a title and price, project them — the network and JSON-parse cost on big hit sets is real.

⚠ Common pitfall: Use _source includes/excludes, or fields + _source:false, to return only what the UI shows. For aggregation-only requests set size:0 so no hits are returned at all.

Examples

{"size":20,"_source":["title","price"],"query":{"match_all":{}}}

{"size":0,"aggs":{"by_cat":{"terms":{"field":"category"}}}}

What this tool does

Searchable Elasticsearch cheat sheet with 80+ entries that SREs type into Kibana Dev Tools, in nine sections:

Index management: PUT/DELETE, aliases for zero-downtime reindex, composable templates, _rollover, _reindex with slices=auto, _forcemerge, refresh/flush.
Documents: _doc op_type=create, _update / _update_by_query, _bulk sweet-spot, _mget, _count.
Mapping: text vs keyword multi-fields, date, nested vs flattened object, geo_point, dynamic:strict to stop mapping explosion.
Query DSL: match, match_phrase with slop, multi_match best_fields/cross_fields, term/terms, range with date math, bool must/should/must_not/filter, wildcard/regexp/prefix/fuzzy, exists, nested with inner_hits, geo_distance, function_score with gauss decay, highlight, search_after, scroll vs point_in_time.
Aggregations: terms + composite, avg/sum/stats, cardinality HyperLogLog++, date_histogram with time_zone, histogram, range, percentiles, sub-aggs, filter agg, top_hits, significant_terms.
Analyzers: standard, language, custom char_filter + tokenizer + token_filter, edge_ngram for autocomplete, ngram, synonym, stop, asciifolding, CJK plugins ik/kuromoji/nori.
Cluster ops: _cluster/health, allocation explain, every _cat API with the right columns, _tasks _cancel.
Replication and DR: number_of_replicas, snapshot repo, _snapshot create/restore with rename, SLM, CCR, ILM hot/warm/cold/frozen/delete.
Production pitfalls: heap > 32GB cliff, mapping explosion from dynamic:true, deep paging with from + size, default 1s refresh on heavy writes, disk watermark flood_stage read-only lock, wrong field type, bulk size sweet spot, split-brain, fielddata:true heap OOM.

Every entry: command + EN/ZH description + 1-3 pasteable JSON examples + common pitfall. Pure client-side: no connection, no upload. Pair with Redis, PostgreSQL, MongoDB and Docker cheat sheets.

Tool details

Input: Text; The page exposes text boxes, numeric controls, file pickers, or structured inputs depending on the tool.
Output: Live result + Copy + Preview; The result area focuses on usable output, with copy, download, or preview actions when supported.
Privacy: Browser-side processing; The main tool logic does not call an external API, so inputs normally stay in the current tab.
Save / share: No account required; Open the page and use it; whether results survive refresh depends on the tool.
Performance budget: Initial JS <= 32 KB; No WASM budget is declared, keeping the tool quick to open on mobile.
Best fit: Developer & DevOps · Developer; Category and role tags drive related tools, internal links, and quick fit checks.

How to use

1. Input

Paste or drop your content into the tool panel.

2. Process

Click the button. All processing is local in your browser.

3. Copy / Download

Copy the result or download to disk in one click.

How Elasticsearch Cheatsheet fits into your work

Use it in the small gaps between coding, reviewing, debugging, and shipping.

Real-world use cases

A term filter returns zero hits on a 40M-doc product index in prod

A "status:active" filter that worked in staging returns nothing in prod because the field got mapped as text, not keyword, so it was lowercased and tokenized. You filter the term query entries plus the "term returns nothing" pitfall, confirm the recipe is term on status.keyword, and run POST /_analyze in Dev Tools to prove how the value tokenized. Fix shipped in ten minutes instead of an afternoon of guessing.

Cutting a 6-shard 80GB index over to a new mapping with no downtime

Marketing needs name searchable as full text, but it was mapped keyword and you cannot change type in place. You pull the zero-downtime reindex entry, create products_v2, run _reindex with slices=auto and wait_for_completion=false, poll _tasks, then do the atomic _aliases remove+add swap in one payload. The 80GB cutover happens while the app keeps reading and writing through the products alias.

A 12-node log cluster goes red after a node hits 96% disk

Every index on that node flips read-only and ingest stalls. You grab the disk watermark flood_stage entry, confirm 95% triggers the read-only lock, run the _cat/allocation and allocation/explain commands from the cluster ops section to find the hot node, free space, then clear the read_only_allow_delete block. The cheat sheet hands you the exact PUT _settings call so you are not editing YAML at 2am.

Building a top-10-sellers-per-category dashboard panel

You need group-by category, then top sellers each with their revenue, in sub-second time over 30M orders. You filter to aggregations, copy the terms agg nested with top_hits and a sum sub-agg, paste it into Dev Tools, and tune size and shard_size from the pitfall note about terms-agg accuracy. The panel ships against live ES instead of a nightly export to a separate analytics database.

Common pitfalls

Running term on a text field and getting zero hits. Map strings as text plus a keyword sub-field and query term on field.keyword, or use match for full text.
Deep paging with from + size past 10000, which makes every shard return from+size docs and blows heap. Use search_after with a sort, or point_in_time for stable cursors.
Leaving refresh_interval at the 1s default on a heavy log ingest, burning most CPU on tiny segments. Raise it to 30s and force a _refresh only when you must read-your-write.

Privacy

This cheat sheet is a single static page. Search runs entirely in your browser against an in-memory array of entries, so nothing you type is sent anywhere, no Elasticsearch is contacted, and no index names, queries, or field values leave the tab. Nothing is written to the URL either, so a shared link carries no query text. Safe inside bastion-only, air-gapped, or proxied networks where installing Kibana is not an option.
