MongoDB cheat sheet: 80+ shell commands & query operators for CRUD, aggregation, indexing, replication.

use <db>
Switch the current database context to <db>. The database is created lazily: it only appears in show dbs after the first document is written.
use myapp
// switch to the admin database
use admin

show dbs / show collections
List every existing database, or every collection in the current database. show dbs hides empty databases, so a database you have only selected with use, without writing to it, is not listed.
show dbs
show collections
// users defined in the current database
show users

db.help() / db.<coll>.help()
Print the available methods for the db object or for a collection. A cheap way to discover the API surface without leaving the shell.
db.help()
db.users.help()
db.users.find().help() // methods available on a cursor
db.<coll>.stats()
Return storage stats for one collection: document count, size, average document size, total index size, and size per index. Use it as the first triage step on a hot collection.
db.orders.stats()
db.orders.stats({scale: 1024 * 1024}) // report sizes in MB

db.serverStatus()
Return server-wide stats: connections, opcounters, memory, network, replication state, WiredTiger cache. The single most useful diagnostic command.
⚠ Common pitfall: On a busy cluster this returns hundreds of KB of JSON. Read only the section you care about: db.serverStatus().connections, .wiredTiger.cache, .repl.
db.serverStatus().connections
db.serverStatus().wiredTiger.cache
db.serverStatus().repl

db.version() / db.hostInfo()
Return the MongoDB server version, or detailed host info (OS, kernel, CPU, memory). Log these at app startup so they are on hand for support tickets.
db.version() // e.g. "7.0.5"
db.hostInfo().os
db.hostInfo().system

db.runCommand({...})
Execute any database command via its raw document form. This is the lowest-level call: helpers such as createIndex and collMod are sugar over runCommand.
db.runCommand({ping: 1})
db.runCommand({collStats: "orders"})
db.runCommand({listIndexes: "orders"})

db.<coll>.drop() / db.dropDatabase()
Drop a single collection (including its indexes), or drop the entire current database. Both are irreversible without a recent backup or snapshot.
⚠ Common pitfall: There is no confirmation prompt. db.dropDatabase() in the wrong shell tab silently destroys production data. Always re-check which database is selected (run db) before typing drop.
db.tmp_import.drop()
db.dropDatabase() // drops the entire current database
db.getCollectionNames() / db.getCollectionInfos()
getCollectionNames() returns a plain string array of collection names; getCollectionInfos() returns full metadata (type, options, view definitions, UUID). Use the latter to tell views from real collections.
db.getCollectionNames()
db.getCollectionInfos({type: "view"}) // list views only

db.adminCommand({...})
Run a command against the admin database regardless of the current db context. Required for cluster-wide commands like listDatabases, getParameter, setParameter, and currentOp.
db.adminCommand({listDatabases: 1})
db.adminCommand({getParameter: 1, featureCompatibilityVersion: 1})
db.adminCommand({setParameter: 1, logLevel: 2})

db.currentOp() / db.killOp(opid)
currentOp() lists in-progress operations on the server; killOp(opid) terminates one by its opid. The first-line tool for catching and killing a runaway query in production.
⚠ Common pitfall: killOp on a write that holds locks does not roll back instantly; the operation is interrupted at its next safe point. A long-running $out or index build may take seconds to actually stop after killOp.
db.currentOp({"secs_running": {$gte: 5}, "op": "query"})
db.killOp(12345)

db.<coll>.renameCollection("newName")
Rename a collection within the same database (an atomic metadata operation, no data copy). Pass true as the second argument (dropTarget) to overwrite an existing collection of the new name.
⚠ Common pitfall: The renameCollection() helper cannot move a collection across databases, and it does not work on sharded collections. Cross-db moves need the renameCollection admin command (which copies the data) or a dump and restore.
db.orders_2026.renameCollection("orders")
db.staging.renameCollection("orders", true) // second argument is dropTarget

db.createCollection(name, options)
Explicitly create a collection when it needs options that a plain insert cannot set: capped size, schema validator, time-series config, or default collation.
db.createCollection("logs", {capped: true, size: 104857600, max: 100000})
db.createCollection("metrics", {timeseries: {timeField: "ts", metaField: "tags", granularity: "seconds"}})

db.runCommand({collMod: "coll", ...})
Modify collection options in place: change a schema validator, set validationLevel/validationAction, adjust a TTL expireAfterSeconds, or hide/unhide an index without dropping it.
db.runCommand({collMod: "sessions", index: {keyPattern: {created_at: 1}, expireAfterSeconds: 7200}})
db.runCommand({collMod: "users", validationAction: "warn"})

db.fsyncLock() / db.fsyncUnlock()
fsyncLock() flushes pending writes to disk and blocks all further writes; it is used to take a consistent filesystem snapshot for backup. fsyncUnlock() releases the lock and resumes writes.
⚠ Common pitfall: A forgotten fsyncLock silently blocks every write to the node. Always pair it with fsyncUnlock in a finally block, and never run it on the primary of a busy production set.
db.fsyncLock()
db.fsyncUnlock()
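The "pair it with fsyncUnlock in a finally block" advice can be sketched as a small wrapper. This is a minimal illustration, not a built-in: withFsyncLock and its work callback are hypothetical names, and db is assumed to be the mongosh database object (or anything exposing fsyncLock and fsyncUnlock).

```javascript
// Hypothetical helper (not a built-in): run `work` while the node is write-locked.
// `db` is assumed to expose fsyncLock() and fsyncUnlock(), as the mongosh db object does.
function withFsyncLock(db, work) {
  db.fsyncLock();       // flush pending writes and block new ones
  try {
    return work();      // e.g. trigger the filesystem snapshot here
  } finally {
    db.fsyncUnlock();   // always release the lock, even if work() throws
  }
}
```

In the shell this might be called as withFsyncLock(db, () => takeSnapshot()), where takeSnapshot stands for whatever backup step you run; the finally clause is what guarantees writes resume if that step fails.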
db.<coll>.insertOne(doc)
Insert one document. An _id (ObjectId) is added automatically if missing. Returns {acknowledged, insertedId}. Subject to the write concern in effect (the implicit default is w: "majority" on 5.0+).
db.users.insertOne({name: "Alice", age: 30, tags: ["admin"]})
db.users.insertOne({_id: "u-1001", name: "Bob"})

db.<coll>.insertMany([doc, ...])
Insert an array of documents in one round-trip. The default is ordered: the insert stops at the first error. Pass {ordered: false} to keep going and report errors at the end.
⚠ Common pitfall: {ordered: false} massively speeds up bulk loads but does NOT make duplicate _id errors disappear; failed docs are collected and reported in the BulkWriteError at the end.
db.users.insertMany([{name: "Alice"}, {name: "Bob"}])
db.users.insertMany(docs, {ordered: false}) // recommended for bulk imports

db.<coll>.find(filter, projection)
Return a cursor over matching documents. The filter is a match document; the projection limits the fields returned. An empty filter {} matches everything.
⚠ Common pitfall: find() without a limit on a 10M-doc collection streams everything to the client. Always pair it with .limit(N) in app code; the shell prints only the first 20 results at a time, but drivers do not.
db.users.find({age: {$gte: 18}})
db.users.find({}, {name: 1, email: 1, _id: 0}) // projection
db.users.find({status: "active"}).sort({created_at: -1}).limit(20)

db.<coll>.findOne(filter, projection)
Return the FIRST matching document (or null) instead of a cursor. Roughly equivalent to find().limit(1).next(), with cleaner ergonomics for single-record lookups.
db.users.findOne({_id: ObjectId("65...")})
db.users.findOne({email: "alice@x.io"}, {password_hash: 0})

db.<coll>.updateOne(filter, update, options)
Update the FIRST matching document. The update must use operators ($set, $inc, $push); a bare {field: value} document is a replacement, not an update. Add {upsert: true} for "create if missing".
⚠ Common pitfall: Forgetting $set is the #1 mistake. With the legacy update() helper, db.users.update({_id: x}, {name: "Alice"}) REPLACES the entire document with {_id: x, name: "Alice"}, dropping every other field; updateOne() in current shells and drivers rejects an operator-less update with an error instead. Always use $set.
db.users.updateOne({_id: "u-1001"}, {$set: {age: 31}})
db.counters.updateOne({name: "page_views"}, {$inc: {n: 1}}, {upsert: true})
db.users.updateOne({_id: "u-1001"}, {$push: {tags: "vip"}})

db.<coll>.updateMany(filter, update, options)
Apply the update to EVERY matching document. Same operator rules as updateOne. Atomic per document, but not across the batch: a server crash mid-batch leaves some docs updated and others not.
db.users.updateMany({status: "trial"}, {$set: {status: "expired"}})
db.orders.updateMany({region: "us-east"}, {$inc: {tax_cents: 100}})

db.<coll>.replaceOne(filter, replacement)
Replace the FIRST matching document with a new one (preserving _id). The replacement is a plain doc with no $ operators. For partial updates use updateOne instead.
db.users.replaceOne({_id: "u-1001"}, {name: "Alice", age: 31, tags: []})

db.<coll>.findOneAndUpdate(filter, update, options)
Atomically find a document, update it, and return either the BEFORE or AFTER snapshot. Set {returnDocument: "after"} for the post-update version. The foundation of safe counter-and-read patterns.
db.counters.findOneAndUpdate({_id: "order_id"}, {$inc: {seq: 1}}, {returnDocument: "after", upsert: true})
db.jobs.findOneAndUpdate({status: "pending"}, {$set: {status: "running", worker: "w-1"}}, {sort: {priority: -1}, returnDocument: "after"})

db.<coll>.findOneAndDelete(filter, options)
Atomically find a document, delete it, and return the deleted document. Pairs naturally with sort + filter to implement an at-most-once job queue.
db.jobs.findOneAndDelete({status: "pending"}, {sort: {priority: -1}})

db.<coll>.deleteOne(filter) / deleteMany(filter)
Delete the FIRST matching document, or every matching document. Use a precise filter: deleteMany({}) wipes the entire collection.
⚠ Common pitfall: deleteMany({}) silently succeeds and removes everything. There is no confirmation. For "delete all", drop() is faster (no per-doc oplog entries), but it also removes the collection's indexes, and neither is reversible.
db.users.deleteOne({_id: "u-1001"})
db.users.deleteMany({status: "expired", expires_at: {$lt: new Date()}})

db.<coll>.bulkWrite([ops...], options)
Execute a heterogeneous batch of insertOne / updateOne / updateMany / replaceOne / deleteOne / deleteMany operations in a single round-trip. The right primitive for ETL and migrations.
⚠ Common pitfall: The default is ordered: true, so a single failure stops the rest of the batch. For bulk loads, pass {ordered: false} to keep going; the BulkWriteError at the end lists every failed op.
db.users.bulkWrite([
  {insertOne: {document: {name: "Alice"}}},
  {updateOne: {filter: {_id: "u-1"}, update: {$set: {age: 30}}}},
  {deleteOne: {filter: {status: "expired"}}}
], {ordered: false})

db.<coll>.countDocuments(filter) / estimatedDocumentCount()
Count documents matching a filter (accurate, runs an aggregation under the hood) or get the fast collection-level count from metadata (estimated, takes no filter).
⚠ Common pitfall: count() (no Documents suffix) is deprecated since 4.0. countDocuments() is accurate but O(matching); on huge collections without an index it scans everything. estimatedDocumentCount() is O(1) but can be stale during writes.
db.users.countDocuments({status: "active"})
db.users.estimatedDocumentCount() // fast whole-collection count

db.<coll>.distinct(field, filter)
Return an array of distinct values for a field across matching documents. The response is capped at 16 MB in total; switch to an aggregation $group for high-cardinality fields.
db.users.distinct("country")
db.users.distinct("country", {status: "active"})

updateOne(f, {$unset: {field: ""}})
Remove a field entirely from matching documents. The value next to the field name is ignored; the convention is an empty string. This differs from setting the field to null, which keeps the key.
db.users.updateOne({_id: "u-1"}, {$unset: {temp_token: ""}})
db.users.updateMany({}, {$unset: {legacy_field: ""}}) // drop a field after a migration

updateOne(f, {$rename: {"old": "new"}})
Rename a field on matching documents. Works on nested paths with dot notation. If a document already has a field with the new name, that field is overwritten.
⚠ Common pitfall: $rename does not work across array elements: you cannot $rename "items.$.sku". For array element rewrites, read-modify-write or an aggregation pipeline update is required.
db.users.updateMany({}, {$rename: {"fullname": "name"}})
db.users.updateMany({}, {$rename: {"address.zip": "address.postal_code"}})

updateOne(f, {$addToSet: {tags: "x"}})
Add a value to an array only if it is not already present, which keeps the array set-like with no duplicates. Use $each to add multiple values at once.
db.users.updateOne({_id: "u-1"}, {$addToSet: {tags: "vip"}})
db.users.updateOne({_id: "u-1"}, {$addToSet: {tags: {$each: ["vip", "beta"]}}})

$push with $each / $slice / $sort / $position
A rich $push appends multiple values ($each), caps the array length ($slice), keeps it ordered ($sort), or inserts at an index ($position). Together they maintain a bounded, sorted leaderboard in one update.
⚠ Common pitfall: $slice with a negative number keeps the LAST N elements; a positive number keeps the FIRST N. To cap a leaderboard at the top 10, sort descending then $slice: 10.
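To make the sign convention of $slice concrete, here is a plain-JavaScript model of $push with $each, $sort, and $slice. It is only a sketch of the semantics described above, not server code; pushSortSlice is a hypothetical name and the model sorts on a single numeric key only.

```javascript
// Plain-JavaScript model of $push with $each + $sort + $slice (not server code).
// direction: 1 for ascending, -1 for descending, as in $sort.
function pushSortSlice(arr, each, sortKey, direction, slice) {
  const merged = arr.concat(each);
  merged.sort((a, b) => direction * (a[sortKey] - b[sortKey]));
  if (slice >= 0) return merged.slice(0, slice); // positive: keep the FIRST N
  return merged.slice(slice);                    // negative: keep the LAST N
}

const top = [{u: "a", s: 90}, {u: "b", s: 70}];
const added = [{u: "c", s: 80}];
const firstTwo = pushSortSlice(top, added, "s", -1, 2);  // highest two scores: a, c
const lastTwo = pushSortSlice(top, added, "s", -1, -2);  // lowest two scores: c, b
```

With a descending sort, the positive slice is the one that keeps the leaders, which matches the leaderboard advice in the pitfall.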
db.scores.updateOne({game: "g1"}, {$push: {top: {$each: [{u: "a", s: 90}], $sort: {s: -1}, $slice: 10}}})
db.feed.updateOne({_id: "f1"}, {$push: {items: {$each: [x], $position: 0, $slice: 100}}})

$pull / $pullAll / $pop
$pull removes every array element matching a condition; $pullAll removes all listed exact values; $pop removes the first (-1) or last (1) element. The standard array-removal trio.
db.users.updateOne({_id: "u-1"}, {$pull: {tags: "expired"}})
db.users.updateOne({_id: "u-1"}, {$pull: {scores: {$lt: 60}}}) // remove by condition
db.feed.updateOne({_id: "f1"}, {$pop: {items: 1}}) // remove the last element

updateOne(f, {$set: {"arr.$[el].done": true}}, {arrayFilters: [...]})
The filtered positional operator $[<id>] updates every array element matching an arrayFilters condition. The modern way to update specific nested array elements without read-modify-write.
⚠ Common pitfall: The $[] all-positional operator (no identifier) updates EVERY element of the array, which makes it easy to confuse with the filtered $[<id>] form. Use $[<id>] + arrayFilters when you want only some elements.
db.orders.updateOne({_id: "o-1"}, {$set: {"items.$[it].shipped": true}}, {arrayFilters: [{"it.sku": "abc"}]})
db.users.updateMany({}, {$inc: {"scores.$[].value": 1}}) // add 1 to every element

updateOne(f, [{$set: {...}}]) // pipeline update
An aggregation-pipeline update (4.2+) lets the update reference other fields of the same document using expressions, e.g. set total = price * qty. This is impossible with classic operators.
db.orders.updateMany({}, [{$set: {total: {$multiply: ["$price", "$qty"]}}}])
db.users.updateMany({}, [{$set: {name: {$toLower: "$name"}}}]) // references the document's own field

db.<coll>.watch(pipeline, options) // change stream
Open a change stream: a resumable cursor that emits insert/update/replace/delete events as they happen on a replica set or sharded cluster. The backbone of real-time sync without polling.
⚠ Common pitfall: Change streams require a replica set (even a single-node one); they do not work on a standalone mongod. Persist the resume token so a restarted consumer picks up exactly where it left off.
const cs = db.orders.watch([{$match: {operationType: "insert"}}])
db.orders.watch([], {resumeAfter: token, fullDocument: "updateLookup"})

findAndModify (legacy helper)
The older single-method form behind findOneAndUpdate / findOneAndDelete. Takes {query, update, remove, new, sort, upsert} in one document. Modern code should prefer the explicit findOneAnd* helpers.
db.counters.findAndModify({query: {_id: "seq"}, update: {$inc: {n: 1}}, new: true, upsert: true})

{field: {$eq: value}}
Match documents where field equals value. Equivalent to the bare {field: value}; the $eq form is mainly useful inside $elemMatch and other compound operators.
db.users.find({age: {$eq: 30}})
db.users.find({age: 30}) // equivalent

{field: {$gt: v}} / $gte / $lt / $lte
Comparison operators. They work on numbers, dates, strings (lexicographic order), and ObjectIds (which embed a creation timestamp). Combine with sort + index for fast time-range queries.
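The remark that ObjectIds embed a creation timestamp is what makes range queries on _id work as time ranges. The sketch below builds a boundary id by hand, assuming the standard layout of a 4-byte big-endian seconds timestamp followed by 8 further bytes; the function names are illustrative, not part of any driver.

```javascript
// Build the smallest ObjectId hex string for a given Unix time in seconds.
// Assumed layout: 4-byte big-endian timestamp, then 8 more bytes (zeroed here).
function objectIdHexFromTime(unixSeconds) {
  const timestampHex = Math.floor(unixSeconds).toString(16).padStart(8, "0");
  return timestampHex + "0".repeat(16);
}

// Recover the embedded timestamp from any 24-character ObjectId hex string.
function timeFromObjectIdHex(hex) {
  return parseInt(hex.slice(0, 8), 16);
}

const boundary = objectIdHexFromTime(1716700000);
// In mongosh: db.users.find({_id: {$gt: ObjectId(boundary)}})
```

In the shell, ObjectId.createFromTime does the same job; the manual version is shown only to expose where the timestamp lives.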
db.orders.find({total_cents: {$gte: 10000}})
db.events.find({created_at: {$gte: ISODate("2026-05-01"), $lt: ISODate("2026-06-01")}})
db.users.find({_id: {$gt: ObjectId.createFromTime(1716700000)}})

{field: {$in: [v1, v2, ...]}}
Match documents where field equals ANY of the listed values. Uses an index when one exists on the field. The most common multi-value lookup.
⚠ Common pitfall: Huge $in arrays (10k+ values) are expensive to evaluate and lose index efficiency as the list grows. Page large ID lists, or use $lookup with a staging collection.
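One way to page a large ID list is to split it into fixed-size batches and issue one $in query per batch. The sketch below only builds the filters; inFilters is a hypothetical helper, and the default batch size of 1000 is an arbitrary illustrative value, not a server limit.

```javascript
// Split a large id list into fixed-size $in filters, one per query.
// batchSize is an illustrative default, not a documented threshold.
function inFilters(field, ids, batchSize = 1000) {
  const filters = [];
  for (let i = 0; i < ids.length; i += batchSize) {
    filters.push({[field]: {$in: ids.slice(i, i + batchSize)}});
  }
  return filters;
}

const filters = inFilters("_id", [1, 2, 3, 4, 5], 2);
// In mongosh: filters.flatMap(f => db.products.find(f).toArray())
```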
db.users.find({status: {$in: ["active", "trial"]}})
db.products.find({_id: {$in: [ObjectId("..."), ObjectId("...")]}})

{field: {$nin: [...]}} / {field: {$ne: v}}
$nin matches docs where field is NOT in the list; $ne matches docs where field does not equal value. Both can be slow: negations are rarely selective, so even with an index they may examine most of the collection.
⚠ Common pitfall: $ne and $nin also match documents where the field is MISSING. To exclude missing-field docs, combine with $exists: {field: {$exists: true, $ne: v}}.
db.users.find({country: {$ne: "US"}})
db.users.find({role: {$nin: ["admin", "owner"]}})

{field: {$regex: /pattern/, $options: "i"}}
Match using a regular expression. Anchored, case-sensitive prefix patterns (/^foo/) can use an index on the field efficiently; unanchored or case-insensitive patterns cannot.
⚠ Common pitfall: db.users.find({email: /alice/}) on a 10M-doc collection tests every value, even with an index on email. Add the ^ anchor, or use a text index for full-text search.
db.users.find({email: /^alice@/}) // anchored, can use the index
db.users.find({name: {$regex: "^al", $options: "i"}})

{field: {$exists: true|false}}
Match documents where the field is present (or absent). $exists: true also matches fields explicitly set to null. Useful for sparse schemas during migration.
db.users.find({deleted_at: {$exists: false}})
db.users.find({phone: {$exists: true, $ne: null}})

{field: {$type: "string"|"int"|"date"|...}}
Match by BSON type. Useful for finding documents where a field has the wrong type after a sloppy migration, or for distinguishing int from double.
db.users.find({age: {$type: "string"}}) // find ages mistakenly stored as strings
db.events.find({created_at: {$type: "date"}})

{array_field: {$elemMatch: {a: 1, b: {$gt: 2}}}}
Match documents where ONE array element satisfies ALL the listed conditions. Without $elemMatch, separate predicates can match across different array elements.
⚠ Common pitfall: db.users.find({"scores.subject": "math", "scores.value": {$gt: 90}}) matches a user with math=70 AND english=95. Use $elemMatch to bind both predicates to the same array element.
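The difference between the two query shapes can be modeled in plain JavaScript on the math=70 / english=95 user from the pitfall. This is an illustration of the matching semantics only, not server code.

```javascript
// Plain-JavaScript model of the two query shapes (illustrative, not server code).
const user = {scores: [{subject: "math", value: 70}, {subject: "english", value: 95}]};

// {"scores.subject": "math", "scores.value": {$gt: 90}}:
// each predicate may be satisfied by a DIFFERENT array element.
const dottedMatch =
  user.scores.some(s => s.subject === "math") &&
  user.scores.some(s => s.value > 90);

// {scores: {$elemMatch: {subject: "math", value: {$gt: 90}}}}:
// a single element must satisfy BOTH predicates.
const elemMatch = user.scores.some(s => s.subject === "math" && s.value > 90);
```

Here dottedMatch is true (math exists, and some score exceeds 90) while elemMatch is false, which is exactly the false positive the pitfall warns about.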
db.users.find({scores: {$elemMatch: {subject: "math", value: {$gt: 90}}}})

{$or: [{...}, {...}]} / {$and: [...]}
Logical combinators. $or returns the union of sub-queries; $and is implicit when you list multiple fields, but an explicit $and is needed when the same key would otherwise appear twice, such as two separate conditions on the SAME field.
⚠ Common pitfall: {age: {$gt: 18}, age: {$lt: 65}} silently discards the first clause, because JS object keys are unique and the last one wins. Use {$and: [{age: {$gt: 18}}, {age: {$lt: 65}}]} or the shorthand {age: {$gt: 18, $lt: 65}}.
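The duplicate-key behaviour is a property of JavaScript object literals, so it can be checked without a server. A short demonstration, using the same filters as the pitfall:

```javascript
// A duplicate key in an object literal: the later value overwrites the earlier one.
const broken = {age: {$gt: 18}, age: {$lt: 65}};

// Two safe spellings of the same range.
const explicitAnd = {$and: [{age: {$gt: 18}}, {age: {$lt: 65}}]};
const shorthand = {age: {$gt: 18, $lt: 65}};

console.log(JSON.stringify(broken)); // {"age":{"$lt":65}}
```

The server never sees the $gt clause in the broken form, so it cannot warn about it.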
db.users.find({$or: [{status: "active"}, {created_at: {$gt: lastWeek}}]})
db.users.find({age: {$gt: 18, $lt: 65}}) // shorthand for several conditions on one field

{field: {$not: {$gt: 100}}}
Logical negation of a single operator expression. Like $ne and $nin, $not also matches documents where the field is missing; guard with $exists when needed.
db.users.find({age: {$not: {$gt: 100}}}) // age 100 or under, or field missing

{array_field: {$all: [v1, v2]}} / {$size: N}
$all matches arrays containing ALL listed values (in any order); $size matches arrays with exactly N elements. $size does not accept a range; for "at least N elements", test that index N-1 exists, e.g. {"array_field.2": {$exists: true}} for N = 3.
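The "index N-1 exists" trick is easy to get off by one, so a small filter builder may help. atLeastSize is a hypothetical helper name; it only constructs the filter document.

```javascript
// Build a filter matching arrays with at least `n` elements (n >= 1)
// by requiring that the zero-based index n - 1 exists.
function atLeastSize(field, n) {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("n must be a positive integer");
  }
  return {[`${field}.${n - 1}`]: {$exists: true}};
}

const filter = atLeastSize("roles", 3); // {"roles.2": {$exists: true}}
// In mongosh: db.users.find(filter)
```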
db.posts.find({tags: {$all: ["mongo", "tutorial"]}})
db.users.find({roles: {$size: 0}}) // users with no roles

{field: {$mod: [divisor, remainder]}}
Match numeric fields where value mod divisor equals remainder. A niche operator, useful for sampling (every 100th id) without an extra modulo column.
db.events.find({user_id: {$mod: [100, 0]}}) // 1% sample

{$expr: {$gt: ["$a", "$b"]}}
Compare two fields of the SAME document using aggregation expressions. The standard way to write "where col_a > col_b" in MongoDB query syntax.
db.orders.find({$expr: {$gt: ["$total_cents", "$paid_cents"]}})
db.users.find({$expr: {$lt: ["$last_login", "$created_at"]}}) // hunt for inconsistent data

{field: null} // null vs missing
Matching {field: null} returns documents where the field is explicitly null AND documents where the field is absent. To match only explicit null, combine it with $exists: true.
⚠ Common pitfall: This conflation surprises people migrating from SQL, where NULL and "no column" are unrelated. {field: {$type: "null"}} matches ONLY explicit null and never the missing case.
db.users.find({deleted_at: null}) // null or missing
db.users.find({deleted_at: {$type: "null"}}) // explicit null only

{"a.b.c": value} // dot notation on nested fields
Dot notation queries fields inside embedded documents. The path is a single string key, so quote it. Indexes can be built on the same dotted path for fast nested lookups.
db.users.find({"address.city": "Tokyo"})
db.users.createIndex({"address.city": 1})

{field: {$bitsAllSet / $bitsAnySet: mask}}
Bitwise query operators match integer fields by bit position. $bitsAllSet requires all mask bits set; $bitsAnySet requires at least one. The mask can be a number or an array of bit positions. Useful for permission flags packed into a single int.
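The two operators correspond to familiar bit tests, modeled here in plain JavaScript for a numeric mask. This is a sketch of the semantics, not server code, and the helper names are illustrative.

```javascript
// Plain-JavaScript model of the two operators for a numeric mask (not server code).
const bitsAllSet = (value, mask) => (value & mask) === mask;
const bitsAnySet = (value, mask) => (value & mask) !== 0;

// Convert an array of bit positions, e.g. [0, 3], into a numeric mask (here 9).
const maskFromPositions = positions => positions.reduce((m, p) => m | (1 << p), 0);

const perms = 0b0110;                                         // bits 1 and 2 set
const hasBoth = bitsAllSet(perms, 6);                         // 6 = bits 1 and 2
const hasEither = bitsAnySet(perms, maskFromPositions([0, 3]));
```

For perms = 6, hasBoth is true and hasEither is false, since neither bit 0 nor bit 3 is set.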
db.users.find({perms: {$bitsAllSet: 6}}) // bits 1 and 2 are both set
db.users.find({perms: {$bitsAnySet: [0, 3]}}) // bit 0 or bit 3 is set

{loc: {$geoWithin: {$centerSphere: [[lng, lat], radiusRad]}}}
Match points within a spherical circle. $centerSphere takes a center and a radius in radians (distance ÷ 6378.1 km for Earth). It works without a geo index but is far faster with a 2dsphere index.
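The radians conversion is the step people tend to forget, so it may be worth wrapping. The sketch below uses the 6378.1 km figure from the text; kmToRadians and withinKm are hypothetical helper names, and the field name passed in is whatever your documents use.

```javascript
const EARTH_RADIUS_KM = 6378.1; // equatorial radius, the figure used in the text

// Convert a distance in kilometres to the radians $centerSphere expects.
function kmToRadians(km) {
  return km / EARTH_RADIUS_KM;
}

// Build the filter for "within `km` of [lng, lat]" on the given field.
function withinKm(field, lng, lat, km) {
  return {[field]: {$geoWithin: {$centerSphere: [[lng, lat], kmToRadians(km)]}}};
}

const nearby = withinKm("loc", -73.97, 40.77, 5);
// In mongosh: db.places.find(nearby)
```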
db.places.find({loc: {$geoWithin: {$centerSphere: [[-73.97, 40.77], 5 / 6378.1]}}}) // 5 km radius

{$jsonSchema: {...}}
$jsonSchema validates a document against a JSON Schema inside a query: match docs that conform (or, negated, those that violate). It is a top-level operator, not a per-field one. Reuse the same schema attached to the collection validator.
db.users.find({$nor: [{$jsonSchema: {required: ["email"], properties: {email: {bsonType: "string"}}}}]}) // find violating documents

{$text: {$search: "a -b \"exact phrase\""}}
Text search syntax: space-separated terms are OR-ed, a leading minus excludes a term, and a double-quoted run requires that exact phrase. Requires a text index on the collection.
db.posts.find({$text: {$search: "mongodb tutorial -beginner"}})
db.posts.find({$text: {$search: "\"aggregation pipeline\""}}) // exact phrase

{$comment: "trace-id-123"}
Attach an arbitrary comment to a query; it shows up in the profiler, slow-query log, and currentOp output. The fastest way to trace which app code path issued a slow query in prod.
db.orders.find({status: "paid"}).comment("checkout-summary-v2")
db.orders.find({status: "paid", $comment: "report-job"})

cursor.hint({index: 1})
Force the query planner to use a specific index instead of letting it choose. Use sparingly, usually to work around a planner that picks a worse plan, or to benchmark a candidate index.
⚠ Common pitfall: A hint that names a non-existent or ineligible index errors out rather than falling back. Hints also freeze the plan: a future better index will be ignored until you remove the hint.
db.orders.find({region: "us", status: "paid"}).hint({region: 1, status: 1})
db.orders.find({}).hint({$natural: 1}) // force a scan in natural (on-disk) order

find({}, {field1: 1, field2: 1, _id: 0})
Inclusion projection: return only the listed fields. _id is included by default unless explicitly set to 0. You cannot mix include and exclude (except for _id).
⚠ Common pitfall: db.users.find({}, {name: 1, age: 0}) fails at the server with "Cannot do exclusion on field age in inclusion projection". Pick one mode (the _id field is the only exception).
db.users.find({}, {name: 1, email: 1, _id: 0})
db.users.find({}, {password_hash: 0, ssn: 0}) // exclude sensitive fields

find({}, {nested: {$elemMatch: {...}}})
$elemMatch projection returns only the FIRST array element that matches the sub-filter. Use it to extract one matching review out of an array of reviews per product.
db.products.find({_id: pid}, {reviews: {$elemMatch: {user_id: uid}}})

find({}, {array: {$slice: N}}) / $slice: [skip, limit]
Limit the array elements returned. $slice: N returns the first (or last, if negative) N elements; $slice: [skip, limit] is array pagination. It runs server-side, with no extra round-trip.
db.posts.find({}, {comments: {$slice: 10}}) // first 10 comments
db.posts.find({}, {comments: {$slice: [20, 10]}}) // skip 20, take 10

find({}, {"array.$": 1})
Positional projection returns the FIRST array element that matched the query filter. It requires a filter condition on the same array; without one there is no matched position to project.
db.users.find({"scores.subject": "math"}, {"scores.$": 1})

find({$text: {$search: "q"}}, {score: {$meta: "textScore"}})
Project the text-search relevance score onto each result using {$meta: "textScore"}. Before 4.4 this projection was required for sort({score: {$meta: "textScore"}}) to rank results; since 4.4 the sort works without it.
db.posts.find({$text: {$search: "mongodb tutorial"}}, {score: {$meta: "textScore"}}).sort({score: {$meta: "textScore"}})

find({}, {computed: {$cond: [...]}}) // expressive projection
Since 4.4, find() projection accepts aggregation expressions, not just 0/1. Compute a derived field ($concat, $cond, $dateToString) without dropping into an aggregate() call.
db.users.find({}, {full: {$concat: ["$first", " ", "$last"]}, _id: 0})
db.orders.find({}, {label: {$cond: [{$gte: ["$total", 100]}, "big", "small"]}})

find({}, {field: {$meta: "indexKey"}})
Project the raw index key used to satisfy a query via {$meta: "indexKey"}. Purely a debugging aid to confirm which index served the read and what key values it matched on.
db.orders.find({region: "us"}, {k: {$meta: "indexKey"}}).hint({region: 1})

projection excludes _id only: {_id: 0}
A projection of just {_id: 0} is an ordinary exclusion projection: it returns every field EXCEPT _id. Handy for $out / export, where _id collides on reinsert.
db.users.find({}, {_id: 0}) // all fields except _id

aggregation $project: {_id: 0, keep: 1}
An aggregation $project follows the same rule as find projection: _id is the only field you may exclude alongside inclusions. Computed fields count as inclusions, so you can exclude _id and add a computed field in the same stage.
db.orders.aggregate([{$project: {_id: 0, customer: 1, dollars: {$divide: ["$total_cents", 100]}}}])

$match: {<filter>}
Filter documents flowing through a pipeline. Equivalent to find()'s filter: same operators, same index usage. Put $match as early as possible to shrink the data set.
db.orders.aggregate([
  {$match: {status: "paid", created_at: {$gte: lastMonth}}}
])

$group: {_id: <expr>, <field>: {$sum/$avg/...: ...}}
Group documents by a key and apply accumulators ($sum, $avg, $min, $max, $push, $addToSet, $first, $last). The single most powerful stage, and the counterpart of SQL GROUP BY.
⚠ Common pitfall: $group holds every group in memory, and a stage that exceeds its memory limit (100 MB by default) fails unless it is allowed to spill to disk. Pass {allowDiskUse: true} to the aggregate() call on huge groupings; newer servers (6.0+) allow spilling by default.
db.orders.aggregate([
  {$group: {_id: "$user_id", total: {$sum: "$total_cents"}, n: {$sum: 1}}}
])
db.events.aggregate([
  {$group: {_id: {y: {$year: "$at"}, m: {$month: "$at"}}, count: {$sum: 1}}}
], {allowDiskUse: true})

$project: {<field>: <expr>, ...}
Reshape documents: include, exclude, rename, or compute new fields. $project supports full aggregation expressions ($concat, $cond, $divide, …), as find() projection also does since 4.4.
db.orders.aggregate([
  {$project: {customer: 1, dollars: {$divide: ["$total_cents", 100]}, _id: 0}}
])

$lookup: {from, localField, foreignField, as}
Left outer join to another collection. Appends matching foreign docs as an array on each input doc under the as field. The MongoDB equivalent of SQL LEFT JOIN.
⚠ Common pitfall: $lookup without an index on foreignField does an O(N*M) scan, because every input doc triggers a full scan of the foreign collection. Always index foreignField.
db.orders.aggregate([
  {$lookup: {from: "users", localField: "user_id", foreignField: "_id", as: "user"}},
  {$unwind: "$user"}
])

$unwind: "$arrayField"
Expand each array element into its own document: one output document per array element. Pass {preserveNullAndEmptyArrays: true} to keep docs whose array is missing or empty.
db.orders.aggregate([{$unwind: "$items"}, {$group: {_id: "$items.sku", units: {$sum: "$items.qty"}}}])
db.orders.aggregate([{$unwind: {path: "$items", preserveNullAndEmptyArrays: true}}])

$sort: {<field>: 1|-1} / $limit / $skip
Standard sort, limit, and skip stages. $sort can use an index ONLY if it appears before any stage that reshapes or regroups documents (like $group or $project). Combine $sort + $limit for a fast top-N.
⚠ Common pitfall: $skip on deep pagination (skip: 100000) makes the server read and discard every prior doc. Use a range filter on a unique index instead: {_id: {$gt: lastSeenId}}.
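The range-filter alternative to $skip can be sketched as a function that builds the stages for one page. pageStages is a hypothetical name, and the sketch assumes pages are ordered by _id ascending with the caller remembering the last _id it saw.

```javascript
// Build the aggregation stages for one page of keyset ("seek") pagination on _id.
// lastSeenId is the _id of the final document on the previous page,
// or null for the first page.
function pageStages(lastSeenId, pageSize) {
  const match = lastSeenId === null ? {} : {_id: {$gt: lastSeenId}};
  return [{$match: match}, {$sort: {_id: 1}}, {$limit: pageSize}];
}

const firstPage = pageStages(null, 20);
const nextPage = pageStages("o-0020", 20);
// In mongosh: db.orders.aggregate(nextPage)
```

Each page then costs an index seek plus pageSize documents, regardless of how deep into the collection the reader has scrolled.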
db.orders.aggregate([{$sort: {total_cents: -1}}, {$limit: 10}]) // top 10

$count: "fieldName"
Replace the pipeline output with a single document {fieldName: N} containing the count of documents that reached this stage. Shorthand for a $group with $sum: 1 followed by a $project.
db.users.aggregate([{$match: {status: "active"}}, {$count: "activeUsers"}])

$facet: {<name>: [<sub-pipeline>], ...}
Run multiple sub-pipelines on the SAME input set within one stage and return all results as one document. Powers "search results with per-category counts" UIs in one round-trip.
db.products.aggregate([
  {$match: {q}},
  {$facet: {
    page: [{$skip: 0}, {$limit: 20}],
    by_category: [{$group: {_id: "$category", n: {$sum: 1}}}],
    total: [{$count: "n"}]
  }}
])

$addFields / $set: {<field>: <expr>}
Add or overwrite fields without dropping any existing fields. Like a $project that keeps everything: you do not have to list every field you want to keep. $set is the 4.2+ alias.
db.orders.aggregate([{$addFields: {dollars: {$divide: ["$total_cents", 100]}}}])
db.orders.aggregate([{$set: {pricedAt: "$$NOW"}}])

$bucket / $bucketAuto
Group docs into explicit numeric buckets ($bucket with boundaries: [...]) or evenly populated automatic buckets ($bucketAuto with buckets: N). The right primitive for histograms.
db.orders.aggregate([
  {$bucket: {groupBy: "$total_cents", boundaries: [0, 1000, 10000, 100000, 1000000], default: "huge", output: {n: {$sum: 1}}}}
])

$sum / $avg / $min / $max / $push / $addToSet
Standard $group accumulators. $sum over the constant 1 gives a count; $push collects every value into an array; $addToSet collects distinct values.
db.orders.aggregate([{$group: {_id: "$user_id", spent: {$sum: "$total_cents"}, items: {$push: "$items"}, paid_methods: {$addToSet: "$payment_method"}}}])

$cond: [<if>, <then>, <else>] / $switch
Inline conditional expressions. $cond is the ternary; $switch handles N branches. Both work anywhere an aggregation expression is valid ($project, $group accumulator arguments, $expr inside $match).
db.orders.aggregate([{$project: {tier: {$cond: [{$gte: ["$total_cents", 100000]}, "vip", "regular"]}}}])
db.orders.aggregate([{$project: {tier: {$switch: {branches: [
  {case: {$gte: ["$total_cents", 1000000]}, then: "diamond"},
  {case: {$gte: ["$total_cents", 100000]}, then: "gold"}
], default: "regular"}}}}])

$dateToString / $year / $month / $dayOfWeek
Date-formatting and extraction expressions. $dateToString uses strftime-style format strings; the part extractors return integers for fast grouping by year, month, or day.
db.events.aggregate([{$project: {day: {$dateToString: {format: "%Y-%m-%d", date: "$at"}}}}])
db.events.aggregate([{$group: {_id: {y: {$year: "$at"}, m: {$month: "$at"}}, n: {$sum: 1}}}])

$merge / $out
Persist pipeline output. $out replaces an entire collection atomically (the output is written to a temporary collection, then renamed over the target); $merge upserts into a target with configurable on-match behavior. Use for materialized views.
db.orders.aggregate([{$group: {_id: "$user_id", spent: {$sum: "$total_cents"}}}, {$out: "user_summary"}])db.orders.aggregate([..., {$merge: {into: "daily_revenue", on: "_id", whenMatched: "merge", whenNotMatched: "insert"}}])$setWindowFields: {partitionBy, sortBy, output}Window functions (5.0+): compute running totals, moving averages, rank, and lag/lead over an ordered partition — the equivalent of SQL OVER(PARTITION BY ... ORDER BY ...).
db.sales.aggregate([{$setWindowFields: {partitionBy: "$region", sortBy: {date: 1}, output: {running: {$sum: "$amount", window: {documents: ["unbounded", "current"]}}}}}])$replaceRoot / $replaceWith: {newRoot}Promote an embedded sub-document to be the new top-level document. $replaceWith is the terser 4.2+ alias. Common after $lookup + $unwind to flatten the joined doc to the root.
db.orders.aggregate([{$lookup: {from: "users", localField: "uid", foreignField: "_id", as: "u"}}, {$unwind: "$u"}, {$replaceWith: "$u"}])$unionWith: {coll, pipeline}Append the documents of another collection (optionally run through a sub-pipeline) to the current stream. The aggregation equivalent of SQL UNION ALL across two collections.
db.orders_2025.aggregate([{$unionWith: {coll: "orders_2026"}}, {$group: {_id: "$region", total: {$sum: "$amount"}}}])
$lookup with let + pipeline (correlated join)
The pipeline form of $lookup binds outer-doc fields into let variables, then runs a full sub-pipeline against the foreign collection referencing them with $$. Enables join-on-multiple-keys and range joins.
db.orders.aggregate([{$lookup: {from: "prices", let: {sku: "$sku", d: "$date"}, pipeline: [{$match: {$expr: {$and: [{$eq: ["$sku", "$$sku"]}, {$lte: ["$start", "$$d"]}]}}}], as: "price"}}])
$graphLookup: {from, startWith, connectFromField, connectToField, as}
Recursive graph traversal within a collection: follow parent/child or friend-of-friend edges to a configurable depth. Builds the ancestor chain or reachable set in a single stage.
db.employees.aggregate([{$graphLookup: {from: "employees", startWith: "$reports_to", connectFromField: "reports_to", connectToField: "_id", as: "chain", maxDepth: 5}}])
$sortByCount: "$field"
Shorthand for $group by a field then $sort by descending count. Returns {_id, count} per distinct value, most frequent first. The shortest way to write a "top values" query.
db.events.aggregate([{$sortByCount: "$event_type"}])
db.orders.aggregate([{$match: {status: "paid"}}, {$sortByCount: "$region"}])
$sample: {size: N}
Return N pseudo-random documents. Uses an efficient pseudo-random cursor when $sample is the first pipeline stage, N is less than 5% of the collection, and the collection holds more than 100 documents; otherwise it scans the collection and sorts on a random key.
⚠ Common pitfall: $sample can return duplicates and may scan the whole collection when N is large or it follows other stages. For reproducible sampling, seed with a $match on a hashed modulo of _id instead.
db.products.aggregate([{$sample: {size: 10}}]) // 10 random documents
$redact: {$cond / $$DESCEND / $$PRUNE / $$KEEP}
Field-level access control inside a pipeline: at each document level, decide to $$KEEP it, $$PRUNE (drop it), or $$DESCEND into sub-documents. Implements per-field row-level security.
db.docs.aggregate([{$redact: {$cond: [{$in: [userLevel, "$allowed"]}, "$$DESCEND", "$$PRUNE"]}}])
$dateTrunc / $dateAdd / $dateDiff
Date arithmetic expressions (5.0+). $dateTrunc rounds a date down to a unit (hour/day/week) for bucketing; $dateAdd/$dateDiff add intervals or compute the gap between two dates in a chosen unit.
db.events.aggregate([{$group: {_id: {$dateTrunc: {date: "$at", unit: "day"}}, n: {$sum: 1}}}])
db.subs.aggregate([{$project: {days: {$dateDiff: {startDate: "$start", endDate: "$end", unit: "day"}}}}])
$ifNull / $coalesce-style fallback
$ifNull returns the first argument if it is non-null and present, otherwise the fallback. The idiomatic way to supply a default for a possibly-missing field inside a pipeline.
db.users.aggregate([{$project: {nickname: {$ifNull: ["$nickname", "$name"]}}}])
db.orders.aggregate([{$project: {tax: {$ifNull: ["$tax_cents", 0]}}}])
$map / $filter / $reduce // array expressions
Transform arrays element-wise inside a pipeline: $map applies an expression to each element, $filter keeps elements matching a condition, $reduce folds the array to a single value.
db.orders.aggregate([{$project: {prices: {$map: {input: "$items", as: "i", in: {$multiply: ["$$i.qty", "$$i.unit"]}}}}}])
db.orders.aggregate([{$project: {total: {$reduce: {input: "$amounts", initialValue: 0, in: {$add: ["$$value", "$$this"]}}}}}])
$densify / $fill // gap filling
$densify (5.1+) inserts missing documents to make a numeric/date sequence complete; $fill (5.3+) fills null or missing fields by carrying the last value forward or by linear interpolation. Together they prep time-series for charting.
db.metrics.aggregate([{$densify: {field: "ts", range: {step: 1, unit: "hour", bounds: "full"}}}, {$fill: {sortBy: {ts: 1}, output: {value: {method: "locf"}}}}])
db.<coll>.createIndex({field: 1})
Create a single-field ascending index. The number is the sort direction (1 asc, -1 desc); for single-field indexes both directions behave identically.
db.users.createIndex({email: 1})
db.orders.createIndex({created_at: -1})
db.<coll>.createIndex({a: 1, b: -1, c: 1})
Create a COMPOUND index. Order matters: the index supports queries on a, a+b, and a+b+c, but NOT b alone or c alone. Match the order to your most common query.
⚠ Common pitfall: A compound index on {region: 1, status: 1, created_at: -1} CAN serve a query on region alone or region+status, but cannot help a query on status alone — you would scan the index for every region.
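The prefix rule can be sketched in plain JavaScript. This is a simplified model, not MongoDB's query planner: the helper `usablePrefixLength` is hypothetical, it treats every query field as an equality predicate, and it ignores sorts, range bounds, and index intersection.

```javascript
// Count how many LEADING index fields the query constrains.
// 0 means the index cannot bound the scan for this query.
function usablePrefixLength(indexFields, queryFields) {
  const wanted = new Set(queryFields);
  let n = 0;
  for (const field of indexFields) {
    if (!wanted.has(field)) break; // the prefix ends at the first field the query omits
    n += 1;
  }
  return n;
}

// Mirrors the key order of {region: 1, status: 1, created_at: -1}.
const index = ["region", "status", "created_at"];

console.log(usablePrefixLength(index, ["region"]));               // 1
console.log(usablePrefixLength(index, ["region", "status"]));     // 2
console.log(usablePrefixLength(index, ["status"]));               // 0
console.log(usablePrefixLength(index, ["region", "created_at"])); // 1
```

The last call shows why field order matters: a query on region + created_at can only use the region prefix, because status sits between them in the key.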
db.orders.createIndex({region: 1, status: 1, created_at: -1})
createIndex({field: 1}, {unique: true})
Enforce uniqueness across the indexed field. Combine with partial filters to enforce uniqueness only on a subset (e.g. active users) without rejecting null-on-delete patterns.
⚠ Common pitfall: A unique index on a field that some docs lack treats missing as a single value — at most one doc can be missing the field. Use sparse or partial to allow many missing-field docs.
db.users.createIndex({email: 1}, {unique: true})
db.users.createIndex({email: 1}, {unique: true, partialFilterExpression: {deleted_at: null}})
createIndex({field: 1}, {sparse: true})
Sparse index only indexes documents where the field is PRESENT, which saves space when most docs lack the field. Combine with unique for "unique if present".
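⚠ Common pitfall: A sparse index omits docs that lack the field, so the planner will not use it for a query or sort that would return an incomplete result unless you force it with hint(), and a hinted sort silently misses those docs. Partial indexes are the preferred modern replacement, though the planner applies the same completeness rule to them.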
db.users.createIndex({phone: 1}, {sparse: true})
createIndex({field: 1}, {partialFilterExpression: {<filter>}})
Partial index only indexes docs matching the filter expression. Strictly more powerful than sparse: you can index "where status = active" or "where created_at > X".
db.users.createIndex({email: 1}, {partialFilterExpression: {status: "active"}})
db.orders.createIndex({user_id: 1}, {partialFilterExpression: {total_cents: {$gt: 0}}})
createIndex({createdAt: 1}, {expireAfterSeconds: N})
TTL index: MongoDB deletes documents older than N seconds (counted from the indexed date field) via a background reaper that runs every 60 seconds.
⚠ Common pitfall: The reaper runs every 60s, so TTL is best-effort: docs can outlive their expiry by up to a minute. Also: the field MUST be a BSON Date — strings or numbers are silently ignored.
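The expiry rule can be modeled in a few lines of plain JavaScript. This is an illustrative sketch only: `ttlExpiry` and `isEligibleForDeletion` are hypothetical helpers, not MongoDB APIs, and the sketch ignores the array-of-dates case and the reaper's own scheduling delay.

```javascript
// Expiry instant = indexed Date value + expireAfterSeconds.
// Returns null when the field is not a Date, mirroring "never expires".
function ttlExpiry(doc, field, expireAfterSeconds) {
  const value = doc[field];
  if (!(value instanceof Date)) return null;
  return new Date(value.getTime() + expireAfterSeconds * 1000);
}

// A document becomes eligible once the expiry instant has passed;
// the real deletion happens on a later reaper pass.
function isEligibleForDeletion(doc, field, expireAfterSeconds, now) {
  const expiry = ttlExpiry(doc, field, expireAfterSeconds);
  return expiry !== null && expiry.getTime() <= now.getTime();
}

const now = new Date("2024-01-01T02:00:00Z");
const session = { created_at: new Date("2024-01-01T00:30:00Z") };
const badSession = { created_at: "2024-01-01T00:30:00Z" }; // string, not a BSON Date
const event = { expires_at: new Date("2024-01-01T03:00:00Z") };

console.log(isEligibleForDeletion(session, "created_at", 3600, now));    // true
console.log(isEligibleForDeletion(badSession, "created_at", 3600, now)); // false
console.log(isEligibleForDeletion(event, "expires_at", 0, now));         // false
```

With expireAfterSeconds: 0 the field value itself is the expiry instant, which is why the third document is still live an hour before its expires_at.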
db.sessions.createIndex({created_at: 1}, {expireAfterSeconds: 3600}) // auto-delete after 1 hour
db.events.createIndex({expires_at: 1}, {expireAfterSeconds: 0}) // use the field value as the absolute expiry time
createIndex({field: "text"}) / $text $search
Text index enables full-text search via $text: $search queries with stemming, stop words, and per-language analyzers. A collection can have AT MOST ONE text index.
db.posts.createIndex({title: "text", body: "text"})
db.posts.find({$text: {$search: "mongodb tutorial"}}, {score: {$meta: "textScore"}}).sort({score: {$meta: "textScore"}})
createIndex({loc: "2dsphere"}) / $near
Geospatial index for GeoJSON points / polygons / lines. Powers $near (nearest first), $geoWithin (inside a region), $geoIntersects (overlapping geometries).
db.places.createIndex({loc: "2dsphere"})
db.places.find({loc: {$near: {$geometry: {type: "Point", coordinates: [-73.97, 40.77]}, $maxDistance: 1000}}})
createIndex({field: "hashed"})
Hashed index used primarily as a shard key: distributes writes uniformly across shards regardless of value distribution. Cannot serve range queries on the field.
db.events.createIndex({user_id: "hashed"})
sh.shardCollection("app.events", {user_id: "hashed"})
getIndexes() / dropIndex(name)
List every index on a collection (name, key, options) or drop one by name. Always inspect getIndexes() before adding: duplicate-but-not-identical indexes are a common waste of disk and RAM.
db.users.getIndexes()
db.users.dropIndex("email_1")
db.users.dropIndexes() // drops every index except _id_
explain("executionStats")
Show how MongoDB executes a query, with runtime statistics. Look for stage: "IXSCAN" (good) vs "COLLSCAN" (bad), and the totalKeysExamined / totalDocsExamined / nReturned ratio.
db.orders.find({user_id: x, status: "paid"}).explain("executionStats")
db.orders.aggregate([...]).explain("executionStats")
createIndex({"$**": 1}) // wildcard index
A wildcard index ({"$**": 1} or {"path.$**": 1}) indexes every field (or every field under a path). Useful for collections with unpredictable, user-defined attribute keys. One index covers arbitrary single-field queries.
⚠ Common pitfall: A wildcard index does not support multi-field compound queries, sorts that span multiple wildcard fields, or shard keys. It also bloats write amplification — every field write touches the index.
db.products.createIndex({"attributes.$**": 1})
db.products.find({"attributes.color": "red"}) // served by the wildcard index
createIndex({a: 1, b: 1}, {collation: {locale, strength}})
A collation index sorts and compares strings by language rules (case/accent insensitivity) rather than raw bytes. A query must use the SAME collation as the index to use it.
db.users.createIndex({name: 1}, {collation: {locale: "en", strength: 2}}) // case-insensitive
db.users.find({name: "alice"}).collation({locale: "en", strength: 2})
createIndex({a: 1}, {background: true}) // legacy flag
Before 4.2, {background: true} built an index without holding a collection-wide write lock. Since 4.2 ALL index builds use an optimized hybrid build, so the flag is accepted but ignored.
db.orders.createIndex({user_id: 1}, {background: true}) // no effect on 4.2+
createIndex({a: 1}, {name: "custom_name"})
Override the auto-generated index name (field_direction concatenation). Before 4.2 this was required when the default name would exceed the 127-byte limit on a long compound or deeply-nested-path index; 4.2 removed that limit, but a short explicit name is still easier to reference.
db.events.createIndex({region: 1, type: 1, created_at: -1}, {name: "evt_main"})
db.events.dropIndex("evt_main")
createIndex({a: 1}, {hidden: true})
A hidden index (4.4+) is maintained on writes but invisible to the query planner. Hide a suspect index to measure the impact of dropping it WITHOUT the cost of a rebuild if you were wrong.
db.orders.hideIndex("region_1_status_1")
db.orders.unhideIndex("region_1_status_1") // restores it within seconds if the guess was wrong
reIndex() / collection.reIndex
Drop and rebuild every index on a collection. Almost never needed on WiredTiger; it was a fix for fragmented MMAPv1 indexes. Holds an exclusive lock, so the collection is offline for the duration.
⚠ Common pitfall: reIndex blocks all operations on the collection until it finishes and offers no benefit on modern WiredTiger storage. Drop + createIndex one at a time if you genuinely need to rebuild.
db.orders.reIndex() // rarely needed
createIndexes (batch) / index build progress
createIndexes builds several indexes in one command, sharing a single collection scan. Track a long build via db.currentOp(): look for the "createIndexes" op and its progress fields.
db.runCommand({createIndexes: "orders", indexes: [{key: {a: 1}, name: "a_1"}, {key: {b: -1}, name: "b_-1"}]})
db.currentOp({"command.createIndexes": {$exists: true}})
createIndex({"$**": 1}, {wildcardProjection: {...}}) // wildcard subset
Limit a wildcard index to (or exclude) specific paths via wildcardProjection: index only the user-controlled attribute sub-tree, not the whole document, to keep the index small.
db.products.createIndex({"$**": 1}, {wildcardProjection: {attributes: 1}})
rs.status()
Return the replica-set health: each member's state (PRIMARY, SECONDARY, ARBITER, RECOVERING), optime, lag, last heartbeat. The single most important command on a replica set.
rs.status()
const s = rs.status(), p = s.members.find(m => m.stateStr === "PRIMARY") // assumes a PRIMARY is currently elected
s.members.map(m => ({name: m.name, state: m.stateStr, lagSecs: (p.optimeDate - m.optimeDate) / 1000}))
rs.initiate({...}) / rs.conf()
rs.initiate seeds a new replica set on a freshly started mongod; rs.conf() returns the current configuration document. Re-configure with rs.reconfig(newCfg).
rs.initiate({_id: "rs0", members: [
{_id: 0, host: "n1:27017"},
{_id: 1, host: "n2:27017"},
{_id: 2, host: "n3:27017"}
]})
rs.add("host:port") / rs.remove("host:port")
Add or remove a member from a running replica set. Adding a node triggers an initial sync; for big data sets, use snapshot+restore rather than streaming sync.
⚠ Common pitfall: rs.add() with no priority/votes sets defaults. A 4-node set has an even number of voting members — add an arbiter or set votes: 0 on one node to avoid election deadlocks.
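The arithmetic behind the odd-voting-member advice is easy to check. A minimal sketch in plain JavaScript, assuming the usual strict-majority rule for elections; `electionMath` is a hypothetical helper, not a shell command.

```javascript
// For n voting members, an election needs floor(n / 2) + 1 votes.
// faultTolerance is how many voters can be lost while a primary can still be elected.
function electionMath(votingMembers) {
  const majority = Math.floor(votingMembers / 2) + 1;
  return { votingMembers, majority, faultTolerance: votingMembers - majority };
}

for (const n of [3, 4, 5]) {
  console.log(electionMath(n));
}
// { votingMembers: 3, majority: 2, faultTolerance: 1 }
// { votingMembers: 4, majority: 3, faultTolerance: 1 }
// { votingMembers: 5, majority: 3, faultTolerance: 2 }
```

A fourth voting member raises the required majority without raising fault tolerance, which is why an even voter count buys nothing over the next lower odd count.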
rs.add("n4:27017")
rs.add({host: "n4:27017", priority: 0, votes: 0, hidden: true}) // hidden read-only replica
rs.remove("n4:27017")
rs.stepDown(N) / rs.freeze(N)
stepDown asks the current PRIMARY to step down for N seconds, triggering a re-election. freeze prevents a SECONDARY from running for election for N seconds. Both are key tools for planned failovers.
rs.stepDown(60) // step down for 60 seconds
rs.freeze(120) // this node will not seek election for 120 seconds
rs.printReplicationInfo() / rs.printSecondaryReplicationInfo()
rs.printReplicationInfo() shows oplog window (start, end, size); rs.printSecondaryReplicationInfo() shows lag per secondary in seconds. Use to size your oplog and detect stale members.
rs.printReplicationInfo()
rs.printSecondaryReplicationInfo()
db.getMongo().setReadPref("secondary"|"secondaryPreferred")
Route read queries to secondaries, which is useful for offloading analytics from the primary. Trade-off: secondaries may lag, so reads can return stale data.
⚠ Common pitfall: "secondary" mode hard-fails if every secondary is unhealthy; "secondaryPreferred" falls back to the primary. For mission-critical reads, never use secondary mode — eventual consistency surprises ship to prod.
db.getMongo().setReadPref("secondaryPreferred")
db.orders.find().readPref("secondary") // per-query read preference
writeConcern: {w: "majority", j: true, wtimeout: 5000}
Per-write durability. w: "majority" waits for acknowledgment from a majority of voting nodes; j: true waits for the on-disk journal flush; wtimeout caps the wait in milliseconds.
db.orders.insertOne(doc, {writeConcern: {w: "majority", j: true, wtimeout: 5000}})
db.orders.updateOne(f, u, {writeConcern: {w: 1}}) // primary-only ack: fast but weak
rs.reconfig(cfg, {force: true})
Apply a new replica-set configuration. {force: true} pushes a config through even without a majority of voting members reachable. It is the disaster-recovery escape hatch when you have lost quorum.
⚠ Common pitfall: force reconfig can create two primaries if the partitioned-away members come back — only use it when you are certain the other side is permanently gone, and bump the config version carefully.
cfg = rs.conf(); cfg.members = cfg.members.filter(m => m.host !== "dead:27017"); rs.reconfig(cfg, {force: true})
rs.isMaster() / db.hello()
hello() (the renamed, non-stigmatizing replacement for isMaster) returns the node's role, the primary's address, the member list, and max wire version: exactly what a driver reads to route operations.
db.hello().isWritablePrimary
db.hello().primary // address of the current primary
readConcern: {level: "majority"|"linearizable"|"snapshot"}
Read concern controls the consistency/recency of returned data. "majority" returns data acknowledged by a majority (no rollback risk); "linearizable" guarantees the latest; "snapshot" reads a consistent point-in-time inside a transaction.
⚠ Common pitfall: "linearizable" reads must target the primary and can be slow — they wait for a no-op write to confirm leadership. Use "majority" as the safe default; reserve "linearizable" for read-your-own-write correctness needs.
db.orders.find({_id: x}).readConcern("majority")
db.runCommand({find: "orders", filter: {_id: x}, readConcern: {level: "linearizable"}, maxTimeMS: 1000})
tag sets + writeConcern: {w: "<customTag>"}
Replica-set member tags let you define custom write concerns by data-center or rack: w: "twoDC" waits until the write reaches members tagged in two DCs, for cross-region durability.
cfg = rs.conf(); cfg.settings.getLastErrorModes = {twoDC: {dc: 2}}; rs.reconfig(cfg) // members must already carry a dc tag
db.orders.insertOne(doc, {writeConcern: {w: "twoDC"}})
rs.syncFrom("host:port")
Override which member a secondary replicates from, instead of the automatically-chosen sync source. Used to pull initial sync from a closer/idler node, or to break a bad sync chain.
⚠ Common pitfall: The override is not permanent — the replica set may revert to its own choice on the next sync-source re-evaluation. It is a hint, not a hard pin.
rs.syncFrom("n2:27017")
local.oplog.rs // tailing the oplog
The oplog is a capped collection in the local database recording every write as an idempotent operation. Tailing it (or, preferably, change streams) powers replication and CDC pipelines.
use local
db.oplog.rs.find().sort({$natural: -1}).limit(5)
db.oplog.rs.stats().maxSize // oplog size cap
sh.status() / sh.status(true)
Print the sharded-cluster summary: shards, databases sharded, collections sharded with their shard keys, chunk distribution per shard. Passing true switches to verbose output that prints every chunk.
sh.status()
sh.status(true) // verbose: lists every chunk
sh.enableSharding("dbname")
Enable sharding on a database. Required before any collection in that database can be sharded. Does not move data on its own.
sh.enableSharding("myapp")
sh.shardCollection("db.coll", {key: 1|"hashed"})
Shard a collection on a chosen key. Range keys (1) preserve locality and serve range queries; hashed keys distribute writes uniformly. The shard key could not be changed after sharding until 4.4 added refineCollectionShardKey and 5.0 added reshardCollection.
⚠ Common pitfall: Pick the shard key BEFORE production data exists. A bad shard key (low cardinality, monotonically increasing) causes hot shards and is painful to fix — you may need to dump, drop, re-shard, and re-load.
sh.shardCollection("myapp.events", {user_id: "hashed"})
sh.shardCollection("myapp.orders", {region: 1, created_at: 1})
sh.addShard("rs0/host:port,host:port")
Register a replica set as a new shard in the cluster. The argument is the replica-set connection string. The balancer will start migrating chunks to the new shard automatically.
sh.addShard("rs1/n1:27018,n2:27018,n3:27018")
sh.moveChunk("db.coll", {<shardKey>: v}, "shardName")
Manually move a chunk to a specific shard. Almost never needed, because the balancer does this automatically. Use only when balancer is off and you are hand-rebalancing.
sh.moveChunk("myapp.orders", {region: "us-east"}, "shard0002")
mongos --configdb <rs>/host:port,... / connect via mongos
mongos is the query router: clients connect to mongos, NOT to individual shards. mongos uses the config-server replica set to know which chunks live on which shards.
⚠ Common pitfall: Direct connections to individual shards bypass the cluster's routing: writes can land on a shard that does not own that shard-key range, and reads see only one shard's subset of data. Always go through mongos.
mongos --configdb cfgrs/cfg1:27019,cfg2:27019,cfg3:27019
mongosh --host mongos1:27017 # clients connect to mongos
sh.startBalancer() / sh.stopBalancer() / sh.getBalancerState()
Control the chunk balancer. Stop it during heavy bulk loads or maintenance windows to avoid migration overhead, then start it again. getBalancerState() reports whether it is currently enabled.
sh.stopBalancer()
sh.startBalancer()
sh.getBalancerState() // true / false
sh.disableBalancing("db.coll") / sh.enableBalancing
Disable balancing for ONE collection while leaving the cluster balancer running for others. Useful to pin a hot, manually-tuned collection's chunk layout in place.
sh.disableBalancing("myapp.sessions")
sh.enableBalancing("myapp.sessions")
sh.addShardToZone / sh.updateZoneKeyRange // zone sharding
Zone (tag-aware) sharding pins a shard-key range to a named zone of shards, e.g. keep EU customers' chunks on EU-region shards for data residency. The balancer respects zone boundaries.
sh.addShardToZone("shard0001", "EU")
sh.updateZoneKeyRange("myapp.users", {region: "eu", _id: MinKey}, {region: "eu", _id: MaxKey}, "EU")
refineCollectionShardKey("db.coll", {a: 1, b: 1})
Append fields to an existing shard key (4.4+) to increase its cardinality without re-sharding from scratch. Useful when a once-fine key has become coarse as data grew.
⚠ Common pitfall: You can only ADD fields to the end of the existing key — you cannot remove, reorder, or change the prefix. The new fields should be indexed and present on documents going forward.
sh.shardCollection("myapp.events", {user_id: 1})
db.adminCommand({refineCollectionShardKey: "myapp.events", key: {user_id: 1, created_at: 1}})
reshardCollection("db.coll", {newKey})
Fully change a collection's shard key online (5.0+). MongoDB clones the data under the new key, keeps writes flowing, then atomically cuts over. The escape from a bad shard-key decision.
⚠ Common pitfall: Resharding temporarily needs disk space for a second copy of the collection and adds write load during the clone. Plan capacity and run it in a low-traffic window.
db.adminCommand({reshardCollection: "myapp.orders", key: {customer_id: "hashed"}})
targeted vs scatter-gather query
A query that includes the shard key (or its prefix) is TARGETED: mongos routes it only to the shard or shards holding the matching chunks. A query without the shard key is SCATTER-GATHER: mongos fans it out to every shard.
⚠ Common pitfall: Scatter-gather queries scale poorly — latency tracks the slowest shard and load multiplies by shard count. Design the common query path to always carry the shard key.
db.orders.find({region: "us-east", _id: x}) // targeted (includes the shard key)
db.orders.find({status: "paid"}) // scatter-gather (no shard key)
$where: "this.a > this.b" - avoid
$where runs a JavaScript function for EVERY document in the server's JS engine. It is orders of magnitude slower than native operators and a security risk. $expr with aggregation expressions replaces it.
⚠ Common pitfall: $where also disables index usage on the predicate — every query becomes a collection scan with per-doc JS eval. Banned in many managed Mongo services for security.
// BAD
db.orders.find({$where: "this.total > this.paid"})
// GOOD
db.orders.find({$expr: {$gt: ["$total", "$paid"]}})
Huge $in arrays (10k+ values)
A $in array of 10k+ values defeats the per-key index lookup optimization and consumes large amounts of memory on every match. Page the IDs or stage them into a collection and $lookup.
// BAD
db.users.find({_id: {$in: ids}}) // ids holds 10,000+ values
// GOOD: page
for (let i = 0; i < ids.length; i += 1000) db.users.find({_id: {$in: ids.slice(i, i+1000)}}).forEach(...)
Missing index: COLLSCAN in production
Run explain("executionStats") on every new query before shipping. A COLLSCAN stage with totalDocsExamined ≈ collection size means MongoDB scanned every document to answer it.
⚠ Common pitfall: Use db.currentOp({secs_running: {$gt: 1}}) to catch in-flight slow queries on prod; mongotop and mongostat reveal hot collections. Slow query log fires on any query > slowOpThresholdMs (default 100ms).
db.orders.find({user_id: x}).explain("executionStats") // check the stage
db.currentOp({secs_running: {$gt: 1}})
Unbounded $lookup: N*M scan
$lookup without an index on foreignField is O(N * M): every input doc triggers a full scan of the foreign collection. Sub-second queries become minutes-long on real data.
db.users.createIndex({_id: 1}) // exists by default, but always index a custom foreignField explicitly
db.orders.createIndex({user_id: 1}) // foreignField index
ObjectId vs string ID mismatch
find({_id: "65f1a2..."}) does NOT match a document whose _id is ObjectId("65f1a2...") because they are different BSON types. Wrap string IDs with ObjectId() at the boundary.
⚠ Common pitfall: The driver typically does NOT convert for you. In Node: new ObjectId(stringId). In Python (pymongo): from bson import ObjectId; ObjectId(stringId). Forgetting this returns "no results" silently.
// Node driver
db.collection("users").findOne({_id: new ObjectId(req.params.id)})
Unanchored $regex: full collection scan
{email: /alice/} matches every doc whose email contains "alice", but it scans EVERY document because the regex is not anchored to the start. Anchor it as /^alice/ so a case-sensitive prefix match can use an index on email.
// BAD: scans everything
db.users.find({email: /alice/})
// GOOD: uses index
db.users.find({email: /^alice@/})
writeConcern: {w: 0} - fire and forget
{w: 0} writes return immediately, before the server even confirms it received the command. Network errors, duplicate-key errors, and crashed primaries are SILENTLY lost.
⚠ Common pitfall: Use {w: 0} only for fully-disposable telemetry where loss is acceptable. For anything user-facing, {w: "majority", j: true} is the safe default — slower per write, but no silent data loss on primary failover.
db.metrics.insertOne(doc, {writeConcern: {w: 0}}) // disposable telemetry only
db.orders.insertOne(doc, {writeConcern: {w: "majority", j: true}}) // user data
Multi-doc transactions - last resort
Transactions (4.0+ replica sets, 4.2+ sharded clusters) work but cost: writes inside a txn hold document locks until commit, so concurrent writers hit write conflicts, contention climbs, and any abort means retrying the whole block. Re-shape data to avoid them.
⚠ Common pitfall: A single document is atomic by default. Many "transaction" needs reshape into a single nested document — order + items + payments in one doc rather than three collections — and the transaction disappears entirely.
const session = db.getMongo().startSession()
session.startTransaction()
try { ... session.commitTransaction() } catch (e) { session.abortTransaction(); throw e } finally { session.endSession() }
Array equality {tags: ["a", "b"]} is order-sensitive
Querying {tags: ["a", "b"]} matches ONLY documents whose tags array is exactly ["a", "b"] in that order: not ["b", "a"], not a superset. For membership use $all (order-free) or $in (any one).
⚠ Common pitfall: This trips up everyone at least once — exact-array equality is rarely what you want. {tags: {$all: ["a", "b"]}} matches any array CONTAINING both, regardless of order or extra elements.
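The three matching modes can be compared side by side in plain JavaScript. This is a simplified model for flat arrays of scalars only; `matchExact`, `matchAll`, and `matchIn` are hypothetical stand-ins for {tags: [...]}, $all, and $in, not MongoDB functions.

```javascript
// {tags: ["a", "b"]}: same elements, same order, same length.
const matchExact = (arr, q) => arr.length === q.length && arr.every((v, i) => v === q[i]);
// {tags: {$all: ["a", "b"]}}: contains every queried value, any order, extras allowed.
const matchAll = (arr, q) => q.every(v => arr.includes(v));
// {tags: {$in: ["a", "b"]}}: contains at least one queried value.
const matchIn = (arr, q) => arr.some(v => q.includes(v));

const query = ["a", "b"];
const samples = [["a", "b"], ["b", "a"], ["a", "b", "c"], ["a"]];

for (const tags of samples) {
  console.log(JSON.stringify(tags), matchExact(tags, query), matchAll(tags, query), matchIn(tags, query));
}
// ["a","b"] true true true
// ["b","a"] false true true
// ["a","b","c"] false true true
// ["a"] false false true
```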
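// exact equality (rarely needed)
db.posts.find({tags: ["a", "b"]})
// contains both (the common case)
db.posts.find({tags: {$all: ["a", "b"]}})
Number type coercion: int vs long vs double
The legacy mongo shell stored bare numbers as doubles; mongosh stores an integer literal as a 32-bit int when it fits and as a double otherwise. Queries compare numbers by value across BSON numeric types, so {n: 5} still matches an int, long, or double 5, but $type checks, schema validators, and strictly typed driver decoders see the difference. Wrap with NumberInt() / NumberLong() to control the BSON type explicitly.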
⚠ Common pitfall: Mixed numeric types in one field still sort and compare by value, but doubles lose integer precision above 2^53 and strictly typed clients can fail to decode the field. Pick one numeric type per field at write time and enforce it with a $jsonSchema validator.
db.counters.insertOne({_id: "c", n: NumberLong(0)})
db.counters.find({n: NumberInt(5)})
Case-insensitive search without a collation index
Using {$regex: "alice", $options: "i"} for case-insensitive match forces a full scan: the i flag stops the regex from using index bounds, so at best every index key is examined. Build a collation index with strength: 2 and query with the matching collation instead.
// BAD: full scan
db.users.find({name: {$regex: "^alice$", $options: "i"}})
// GOOD: collation index
db.users.createIndex({name: 1}, {collation: {locale: "en", strength: 2}})
db.users.find({name: "Alice"}).collation({locale: "en", strength: 2})
Unbounded array growth in a document
Pushing into an array forever (comments, events, log lines on one doc) eventually hits the 16 MB BSON document limit and degrades every read/write of that doc. Cap arrays with $slice, or move the data to its own collection.
⚠ Common pitfall: Even before 16 MB, a multi-MB document is loaded and rewritten in full on every update — the "bucket pattern" (N items per bucket doc) or a separate collection keeps writes cheap.
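The capping behavior of $push with $each and $slice can be modeled in plain JavaScript. This is an illustrative sketch: `pushWithSlice` is a hypothetical helper that mimics the array result only, not the server's update path.

```javascript
// Append `each` to `arr`, then trim:
// a negative slice keeps the LAST |slice| elements, a non-negative slice keeps the FIRST slice elements.
function pushWithSlice(arr, each, slice) {
  const merged = arr.concat(each);
  return slice < 0 ? merged.slice(slice) : merged.slice(0, slice);
}

console.log(pushWithSlice([1, 2, 3], [4, 5], -3)); // [ 3, 4, 5 ]
console.log(pushWithSlice([1], [2], -1000));       // [ 1, 2 ]
console.log(pushWithSlice([1, 2, 3], [4], 2));     // [ 1, 2 ]
```

With a negative slice the array behaves like a fixed-size ring of the newest items, so the document stops growing once the cap is reached.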
// cap the length to prevent unbounded growth
db.feeds.updateOne({_id: "f"}, {$push: {items: {$each: [x], $slice: -1000}}})
Negation operators match missing fields
{field: {$ne: v}}, {$nin: [...]}, and {$not: {...}} all MATCH documents where the field is absent, because "missing" is never equal to v. To require the field, combine with {$exists: true}.
⚠ Common pitfall: A "find users not in country US" query with {country: {$ne: "US"}} silently includes every user with no country field at all. Add {country: {$exists: true, $ne: "US"}}.
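The same behavior shows up in a plain JavaScript model of the two filters. This is a simplified sketch for a scalar top-level field; `ne` and `existsAndNe` are hypothetical predicates, not MongoDB APIs.

```javascript
// {field: {$ne: v}}: a missing field reads as undefined, which is never equal to v.
const ne = (doc, field, v) => doc[field] !== v;
// {field: {$exists: true, $ne: v}}: the field must be present AND differ from v.
const existsAndNe = (doc, field, v) => field in doc && doc[field] !== v;

const users = [
  { name: "ann", country: "US" },
  { name: "bo", country: "DE" },
  { name: "cy" }, // no country field at all
];

console.log(users.filter(u => ne(u, "country", "US")).map(u => u.name));          // [ 'bo', 'cy' ]
console.log(users.filter(u => existsAndNe(u, "country", "US")).map(u => u.name)); // [ 'bo' ]
```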
// may unintentionally include docs that lack the field
db.users.find({country: {$ne: "US"}})
// require the field to exist
db.users.find({country: {$exists: true, $ne: "US"}})
skip/limit pagination drifts under concurrent writes
find().skip(n).limit(m) re-counts from the start each page, so inserts/deletes between page loads shift the window and users see duplicates or gaps. Use range pagination on a stable unique key instead.
⚠ Common pitfall: Range (a.k.a. seek/keyset) pagination — {_id: {$gt: lastSeenId}} sorted by _id — is both stable under writes and O(1) per page instead of O(skip).
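The drift is easy to reproduce with an in-memory array standing in for the collection. A minimal sketch in plain JavaScript; `skipPage` and `keysetPage` are hypothetical helpers that mimic the two query shapes, and the concurrent delete is simulated between page loads.

```javascript
const byId = (a, b) => a._id - b._id;
// Mimics find().sort({_id: 1}).skip(skip).limit(limit)
const skipPage = (docs, skip, limit) => [...docs].sort(byId).slice(skip, skip + limit);
// Mimics find({_id: {$gt: lastId}}).sort({_id: 1}).limit(limit)
const keysetPage = (docs, lastId, limit) => docs.filter(d => d._id > lastId).sort(byId).slice(0, limit);

let posts = Array.from({ length: 9 }, (_, i) => ({ _id: i + 1 }));

const page1 = skipPage(posts, 0, 3).map(d => d._id);
console.log(page1); // [ 1, 2, 3 ]

// Another client deletes _id 2 before the reader asks for page 2.
posts = posts.filter(d => d._id !== 2);

console.log(skipPage(posts, 3, 3).map(d => d._id));   // [ 5, 6, 7 ]  (_id 4 is never shown)
console.log(keysetPage(posts, 3, 3).map(d => d._id)); // [ 4, 5, 6 ]
```

The keyset version resumes from the last _id the reader actually saw, so a delete earlier in the collection cannot shift the window.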
// drifts, and gets slower with every page
db.posts.find().sort({_id: 1}).skip(10000).limit(20)
// stable and fast
db.posts.find({_id: {$gt: lastId}}).sort({_id: 1}).limit(20)
upsert race creates duplicates without a unique index
Two concurrent updateOne(..., {upsert: true}) on the same key can BOTH insert if no unique index enforces the key, because the upsert check and insert are not atomic across connections. A unique index makes one fail cleanly.
⚠ Common pitfall: Always back an upsert key with a unique index. On the duplicate-key error, retry the operation — it will now find the row the other writer inserted and update it instead.
db.counters.createIndex({name: 1}, {unique: true})
db.counters.updateOne({name: "x"}, {$inc: {n: 1}}, {upsert: true})
Forgotten allowDiskUse on large $sort / $group
A pipeline $sort or $group that exceeds the 100 MB in-memory limit errors with "Sort exceeded memory limit" unless you pass {allowDiskUse: true}. From 6.0 spilling to disk is automatic, but older servers fail hard.
db.events.aggregate([{$sort: {at: 1}}], {allowDiskUse: true})
Searchable MongoDB cheat sheet, 80+ entries backend engineers and SREs actually type into mongosh. Nine sections: shell basics (use, show dbs, db.serverStatus, db.runCommand, drop), CRUD (insertOne, insertMany ordered:false, find + sort + limit, updateOne with the #1 bug of forgetting $set which silently REPLACES the whole document, findOneAndUpdate, bulkWrite, countDocuments vs estimatedDocumentCount), query operators ($eq / $gt / $in with 10k-array warning, $ne also matching missing fields, $regex anchored vs unanchored index difference, $elemMatch to bind multiple predicates to the SAME array element, $or / $and with the JS-object-key-collision bug, $expr to compare two fields), projection (inclusion / exclusion with _id exception, $elemMatch, $slice pagination, $meta textScore), aggregation ($match early, $group with accumulators + allowDiskUse, $lookup with MANDATORY foreignField index, $unwind, $sort + $limit + $skip deep-skip warning, $count, $facet, $bucket histograms, $cond / $switch, $dateToString, $merge / $out for materialized views), indexes (createIndex single + compound prefix-order rules, unique + partialFilterExpression, sparse vs partial, TTL with the 60-second reaper caveat, text one-per-collection, 2dsphere $near, hashed for shard keys, explain executionStats IXSCAN vs COLLSCAN), replication (rs.status, rs.add / rs.remove odd-voting-member rules, rs.stepDown, read preferences, writeConcern w:majority), sharding (sh.status, sh.shardCollection shard-key warnings, sh.addShard, sh.moveChunk, mongos as the only correct entry), and pitfalls (avoid $where, huge $in, missing indexes, unbounded $lookup, ObjectId vs string mismatch, unanchored $regex, w:0 silent data loss, reshape data to avoid multi-doc transactions). Every entry: command + EN/ZH description + 1-3 mongosh-pasteable examples + a real production pitfall. Search across all fields. Pure client-side, no MongoDB connection.
Pair with our PostgreSQL, Redis, SQL and Docker cheat sheets.
On-call SRE, no time to read docs. You filter to "replication", confirm rs.status() shows the new PRIMARY, then check why writes hung: the app used {w: 1} and lost acks during the 12-second election. The writeConcern entry reminds you w:majority is the safe default and explains the throughput trade-off you accepted. You ship the config fix before the next page.
Backend dev with a slow API. You paste find().explain("executionStats") from the explain entry, see COLLSCAN and totalDocsExamined of 1.2M against nReturned of 20. The compound-index prefix-order entry shows your index on {status, createdAt} can't serve a sort on createdAt alone. You build the right index and the endpoint drops back under 50ms.
Tech lead doing a postmortem. The updateOne entry spells out the exact bug: {name: "x"} without $set is a full document REPLACE, dropping email, roles and timestamps with no error. You add the entry to your onboarding doc and a driver lint, and you write the restore-from-oplog runbook so the next person never hits it.
Staff engineer in a design review. The transactions entry argues most "transaction" needs are data-model problems: order, items and payments often want to be one nested document where the write is atomic for free. You reshape the schema, drop the 60-second-limited multi-doc transaction, and keep cross-account transfers as the only real transaction in the system.
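Passing a bare document such as {name: 'x'} as the update without $set: replaceOne() and the legacy update() treat it as a full REPLACE that drops every other field, while current updateOne() implementations reject it with an error. Always wrap changes in {$set: {...}}.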
Trusting {field: {$ne: v}} to skip a value — it also matches docs where the field is MISSING. Pair it with {$exists: true} when absence should not count.
Adding a unique index on a sometimes-missing field — only ONE doc may omit it. Use partialFilterExpression (e.g. {email: {$exists: true}}) so multiple docs can lack the field.
This cheat sheet is a single static page. Your search text is matched in-memory against a built-in array of commands and never leaves the tab — no MongoDB connection, no upload, no network request, and nothing is written to the URL. Open DevTools Network while typing and you will see zero traffic, so it is safe behind bastion-only clusters and air-gapped networks.