MongoDB FAQ & Answers
Expert MongoDB answers researched from official documentation. Every answer cites authoritative sources you can verify.
MongoDB is a document-oriented NoSQL database that stores data in flexible, JSON-like documents (BSON format). Key differences from SQL databases: schema-less design with dynamic schemas, documents instead of rows, embedded data models instead of joins, built-in horizontal scaling with sharding, and single-document atomic operations. MongoDB 8.0 (2024) delivers 32% better performance for 95/5 read/write workloads, 54% faster bulk inserts, and 50x faster resharding. Use MongoDB for flexible schemas, rapid development, horizontal scaling, and hierarchical data. Use SQL for complex transactions across multiple tables, rigid schemas, and complex joins. MongoDB has supported full ACID transactions since 4.0 (replica sets) and 4.2 (sharded clusters).
BSON (Binary JSON) is MongoDB's binary-encoded serialization format. Key differences from JSON: support for additional data types (Date, ObjectId, Binary, Int32/64, Decimal128), a binary format for faster parsing and compact storage, and preservation of precise type information. MongoDB 8.0 optimizes BSON processing with a seekForKeyValueView() method that consolidates the keystring and RecordId into a tuple, avoiding unnecessary BSONObj creation for faster data retrieval. BSON enables efficient indexing on nested fields with type-aware comparisons. MongoDB drivers automatically convert JSON to BSON. The ObjectId type provides unique identifiers with embedded timestamps. The 16MB document size limit applies.
CRUD operations in MongoDB: Create with insertOne() and insertMany() to add documents. Read with find() and findOne() to query documents using filters. Update with updateOne(), updateMany(), and replaceOne() to modify documents. Delete with deleteOne() and deleteMany() to remove documents. All write operations are atomic at the single-document level. Example: db.users.insertOne({name: 'John', age: 30}); db.users.find({age: {$gte: 18}}); db.users.updateOne({name: 'John'}, {$set: {age: 31}}); db.users.deleteOne({name: 'John'}). If the target collection doesn't exist, insert operations create it automatically. Inserted documents receive a unique _id field automatically if one is not specified.
The MongoDB aggregation pipeline processes documents through multiple stages, each transforming the data. Use it for complex data transformations, analytics, and reporting beyond simple queries. Example: db.sales.aggregate([{$match: {date: {$gte: ISODate('2025-01-01')}}}, {$group: {_id: '$product', total: {$sum: '$amount'}}}, {$sort: {total: -1}}]). Key stages: $match (filtering), $group (aggregation), $project (reshaping), $sort, $lookup (joining). Performance: Place $match early to leverage indexes and reduce document count. MongoDB's slot-based execution engine improves performance with lower CPU and memory costs, and the optimizer collapses adjacent $sort + $limit stages into a single top-k sort. Limit pipelines to <10 stages for maintainability.
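The same pipeline written out in mongosh and extended with a $limit (a sketch; the sales collection and its fields are as in the example above):

// Total revenue per product for 2025, highest first
db.sales.aggregate([
  { $match: { date: { $gte: ISODate('2025-01-01') } } },  // filter first so an index on date can be used
  { $group: { _id: '$product', total: { $sum: '$amount' } } },
  { $sort: { total: -1 } },
  { $limit: 10 }  // adjacent $sort + $limit collapse into one top-k sort
])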
The $match stage filters documents in aggregation pipelines, similar to find() queries. Place $match as first stage in pipeline to leverage indexes and reduce downstream document processing. Example: db.orders.aggregate([{$match: {status: 'completed', date: {$gte: ISODate('2025-01-01')}}}, {$group: {_id: null, total: {$sum: '$amount'}}}]). Use standard query operators: $eq, $gt, $in, $regex. Performance optimization: Create indexes on $match fields (up to 75% latency improvement possible with proper indexing). $match can use all MongoDB query operators and expressions. Early filtering minimizes data processed by subsequent pipeline stages.
The $group stage groups documents by specified expression and computes accumulated values using accumulator operators. Syntax: {$group: {_id: '$category', total: {$sum: '$amount'}, count: {$sum: 1}, avg: {$avg: '$price'}}}. Use _id: null to group all documents into one. Common accumulators: $sum (numeric total), $avg (average), $min/$max (extremes), $first/$last (first/last in group), $push (creates array), $addToSet (unique array). Example: db.sales.aggregate([{$group: {_id: '$region', revenue: {$sum: '$amount'}, orders: {$sum: 1}}}]). Performance: Create indexes on grouping fields. MongoDB 8.0 time-series aggregations run 60% faster than 7.0.
Single field indexes index one document field for fast queries. Create with: db.users.createIndex({email: 1}); where 1 indicates ascending order (-1 for descending). Use for queries: db.users.find({email: '[email protected]'}). Indexes improve read performance but add write overhead. Default _id field is automatically indexed. Check index usage: db.users.find({email: 'test'}).explain('executionStats'). Look for IXSCAN (index scan, good) vs COLLSCAN (collection scan, needs index). Remove index: db.users.dropIndex({email: 1}). Ideal for exact match queries and single-field sorting. MongoDB 8.0 delivers up to 36% better read throughput with proper indexing.
Compound indexes index multiple fields together, supporting queries on field prefixes. Create with: db.users.createIndex({name: 1, age: -1, city: 1});. Supports queries on {name}, {name, age}, and {name, age, city} - field order matters (index prefix rule). Example: db.users.find({name: 'John', age: {$gt: 25}}). Use for multi-field queries and sorts. Follow ESR rule for optimal field order: Equality first, Sort second, Range last. Compound indexes reduce total index count and improve complex query performance. Verify index usage: db.users.find({...}).explain('executionStats'). MongoDB 8.0 improves index performance with optimized BSON processing.
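A short sketch of the ESR rule in practice (hypothetical orders collection; field names are illustrative):

// Equality (status), Sort (orderDate), Range (total) dictate the field order
db.orders.createIndex({ status: 1, orderDate: 1, total: 1 })
db.orders.find({ status: 'shipped', total: { $gt: 100 } }).sort({ orderDate: 1 })
// explain('executionStats') should show an IXSCAN with no in-memory SORT stage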
Text indexes enable full-text search with stemming, stop-word removal, and case-insensitive matching. Create with: db.articles.createIndex({title: 'text', content: 'text'});. Only one text index per collection. Search: db.articles.find({$text: {$search: 'mongodb tutorial'}}). Add scoring: db.articles.find({$text: {$search: 'mongodb'}}, {score: {$meta: 'textScore'}}).sort({score: {$meta: 'textScore'}}). Supports phrase searches ("exact phrase"), negation (-word), language specification. Limitation: No wildcard or fuzzy matching. Production recommendation (2025): Use MongoDB Atlas Search for advanced full-text search with Apache Lucene, supporting relevance ranking, faceting, and dedicated search nodes.
TTL indexes automatically delete documents after specified time period. Create on date field: db.sessions.createIndex({createdAt: 1}, {expireAfterSeconds: 3600}); Documents expire when createdAt + 3600 seconds < current time. Background process runs every 60 seconds to remove expired documents. Use for session data, logs, cache, temporary data. Example: db.logs.createIndex({timestamp: 1}, {expireAfterSeconds: 86400}); // 24 hours. Modify expiration: db.runCommand({collMod: 'logs', index: {keyPattern: {timestamp: 1}, expireAfterSeconds: 43200}}). TTL indexes are single-field only, work with date or date array fields. Documents removed from all indexes and storage.
Field update operators modify document fields without replacing entire document. $set sets or updates field values: db.users.updateOne({_id: 1}, {$set: {name: 'John', status: 'active'}});. $unset removes fields: db.users.updateOne({_id: 1}, {$unset: {tempField: 1}});. $inc increments numeric values: db.products.updateOne({_id: 1}, {$inc: {stock: -1, views: 1}});. Combine operators: db.users.updateOne({_id: 1}, {$set: {lastLogin: new Date()}, $inc: {loginCount: 1}}). Operations are atomic at document level. Work with updateOne(), updateMany(), findAndModify(). MongoDB 8.0 delivers 20% faster concurrent writes during replication.
Array update operators modify array fields efficiently. $push adds elements: db.users.updateOne({_id: 1}, {$push: {tags: 'newTag', scores: 95}});. $pull removes matching elements: db.users.updateOne({_id: 1}, {$pull: {tags: 'oldTag', scores: {$lt: 60}}});. $addToSet adds unique elements (no duplicates): db.users.updateOne({_id: 1}, {$addToSet: {skills: 'MongoDB'}});. $push supports modifiers: db.posts.updateOne({_id: 1}, {$push: {comments: {$each: [c1, c2], $position: 0}}}); Array filters enable precise updates: db.users.updateOne({_id: 1}, {$set: {'grades.$[elem].score': 95}}, {arrayFilters: [{'elem.student': 'John'}]});
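The arrayFilters update above in context, as a sketch against a hypothetical document {_id: 1, grades: [{student: 'John', score: 80}, {student: 'Amy', score: 91}]}:

// Update only the matching array element, leaving the rest untouched
db.users.updateOne(
  { _id: 1 },
  { $set: { 'grades.$[elem].score': 95 } },
  { arrayFilters: [{ 'elem.student': 'John' }] }
)
// Result: John's score becomes 95; Amy's entry is unchanged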
Comparison operators filter documents based on value comparisons. $eq (equals): db.users.find({status: {$eq: 'active'}}); shorthand: db.users.find({status: 'active'});. $ne (not equal): db.users.find({status: {$ne: 'inactive'}});. $gt, $gte, $lt, $lte for numeric/date comparisons: db.products.find({price: {$gt: 100, $lte: 500}});. $in (matches any in array): db.users.find({country: {$in: ['USA', 'Canada', 'UK']}});. $nin (matches none): db.users.find({status: {$nin: ['deleted', 'banned']}});. Work with all data types. Combine with logical operators for complex queries. Create indexes on comparison fields for performance.
Logical operators combine multiple query conditions. $and (all conditions true): db.users.find({$and: [{age: {$gte: 18}}, {status: 'active'}]});. $or (any condition true): db.products.find({$or: [{category: 'electronics'}, {price: {$lt: 50}}]});. $not (inverts condition): db.users.find({age: {$not: {$lt: 18}}});. Implicit $and for comma-separated conditions: db.users.find({age: {$gte: 18}, status: 'active'});. $nor (no conditions true): db.users.find({$nor: [{status: 'deleted'}, {banned: true}]});. Combine operators: db.products.find({$and: [{$or: [{category: 'books'}, {category: 'music'}]}, {price: {$lt: 100}}]});
MongoDB supports ACID transactions since 4.0 (replica sets) and 4.2 (sharded clusters). Implementation: const session = client.startSession(); try { session.startTransaction(); await collection.insertOne(doc, {session}); await collection.updateOne(filter, update, {session}); await session.commitTransaction(); } catch (error) { await session.abortTransaction(); } finally { await session.endSession(); }. Best practices (2025): Limit to <1,000 documents per transaction, default 60s timeout, use w:'majority' write concern, optimize queries with indexes. Transactions use snapshot isolation and two-phase commit. Performance: Can reduce commit latency 10x by batching writes. Use for financial transfers, inventory management requiring cross-document consistency.
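A sketch using the Node.js driver's withTransaction() helper, which retries TransientTransactionError for you (the bank database and accounts collection are hypothetical):

// Move funds atomically between two account documents
const session = client.startSession();
try {
  await session.withTransaction(async () => {
    const accounts = client.db('bank').collection('accounts');
    await accounts.updateOne({ _id: 'A' }, { $inc: { balance: -100 } }, { session });
    await accounts.updateOne({ _id: 'B' }, { $inc: { balance: 100 } }, { session });
  }, { readConcern: { level: 'snapshot' }, writeConcern: { w: 'majority' } });
} finally {
  await session.endSession();
}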
MongoDB replica sets maintain multiple copies of data for high availability. Components: primary (accepts writes), secondaries (replicate data), optional arbiter (votes only). Automatic failover: If primary fails, secondaries elect new primary typically within 12 seconds (median time). Configure: rs.initiate({_id: 'rs', members: [{_id: 0, host: 'server1:27017'}, {_id: 1, host: 'server2:27017'}, {_id: 2, host: 'server3:27017'}]}); Write concern w:'majority' ensures writes persist on majority. Read preferences: secondary, primaryPreferred, secondaryPreferred, nearest. Use odd number of nodes for quorum. Tune electionTimeoutMillis for faster failover. Deploy across availability zones for disaster recovery.
MongoDB sharding horizontally scales data by distributing it across multiple servers (shards). Architecture: mongos routers direct queries, config servers (deployed as a 3-member replica set for high availability) store metadata, shards hold data subsets. Enable sharding: sh.enableSharding('mydb'); sh.shardCollection('mydb.users', {userId: 'hashed'});. Shard key selection criteria (2025): (1) High cardinality (many unique values), (2) Even write distribution (avoid monotonic keys like timestamps), (3) Query isolation (include the shard key in queries to avoid scatter-gather). Shard key types: ranged (supports range queries, co-locates related data, risk of hotspots), hashed (guarantees even distribution, optimal for event/time-series data, no range queries). Critical: Changing a shard key after creation requires resharding (reshardCollection, MongoDB 5.0+), which is expensive, so choose carefully up front. Poor choices cause jumbo chunks (larger than the chunk size limit, unmovable between shards). Best practice (2025): Shard early in lifecycle (before 512GB data). Monitor: sh.status() for chunk distribution, enable the balancer for automatic redistribution.
Choose between embedded (denormalized) and referenced (normalized) relationships. Embedded: Store related data in single document for fast atomic reads: {user: 'John', orders: [{id: 1, total: 100}, {id: 2, total: 200}]}. Use for one-to-few relationships, data accessed together, when total size stays under 16MB. Referenced: Store ObjectId references and use $lookup for joins: db.users.insertOne({name: 'John', orders: [ObjectId('...')]});. Use for one-to-many/many-to-many, large subdocuments, data needing independent access. Embedded patterns perform better (single-document operations, no join overhead). 2025 guidance: Hybrid approach often best - embed frequently accessed data, reference rarely accessed data, cache strategically.
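A sketch of resolving a referenced relationship at read time (hypothetical users and orders collections, with users.orders holding ObjectId references):

// Left outer join: each user gains an orderDocs array of full order documents
db.users.aggregate([
  { $match: { name: 'John' } },
  { $lookup: { from: 'orders', localField: 'orders', foreignField: '_id', as: 'orderDocs' } }
])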
Schema validation enforces document structure using JSON Schema (draft 4) rules during inserts and updates. Apply during collection creation: db.createCollection('users', {validator: {$jsonSchema: {bsonType: 'object', required: ['name', 'email'], properties: {name: {bsonType: 'string'}, age: {bsonType: 'int', minimum: 0}, email: {bsonType: 'string', pattern: '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$'}}}}});. Add to an existing collection: db.runCommand({collMod: 'users', validator: {...}});. Validation levels: strict (all operations) or moderate (updates to already-valid docs only). Actions: error (reject) or warn (log). 2025 enhancement: MongoDB Compass generative AI can assist in creating validation rules. Use for data quality, documentation, and gradual schema evolution.
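A sketch of a gentler rollout on an existing collection (same hypothetical users schema as above):

// 'moderate' skips already-invalid docs; 'warn' logs violations instead of rejecting
db.runCommand({
  collMod: 'users',
  validator: { $jsonSchema: { bsonType: 'object', required: ['name', 'email'] } },
  validationLevel: 'moderate',
  validationAction: 'warn'
})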
Write concern determines the acknowledgment level for write operations: w:1 (primary only, fast but may lose data on failover), w:'majority' (majority of replica set members, the default since MongoDB 5.0), or a numeric w equal to the member count (e.g., w:3 on a three-member set: highest durability, but writes block if any member is down). Example: db.orders.insertOne(order, {writeConcern: {w: 'majority', j: true, wtimeout: 5000}}); The journal option j:true ensures writes reach disk before acknowledgment; wtimeout prevents indefinite blocking. Read concern controls read consistency: local (default, may read uncommitted data that could be rolled back), majority (only reads data committed by a majority, durable reads), snapshot (point-in-time reads for transactions). Example (mongosh): db.users.find({}).readConcern('majority');. Causal consistency guarantee (2025): Use both writeConcern: {w: 'majority'} and readConcern: {level: 'majority'} in a causally consistent session for all four guarantees: read your own writes, monotonic reads, monotonic writes, writes follow reads. Critical: For transactions, use snapshot read concern with majority write concern for consistency across shards.
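A sketch of a causally consistent session in the Node.js driver (the myapp database and users collection are hypothetical):

// Reads in this session observe its own earlier acknowledged writes
const session = client.startSession({ causalConsistency: true });
const users = client.db('myapp').collection('users');
try {
  await users.updateOne({ _id: 1 }, { $set: { status: 'active' } },
    { session, writeConcern: { w: 'majority' } });
  const doc = await users.findOne({ _id: 1 },
    { session, readConcern: { level: 'majority' } });
} finally {
  await session.endSession();
}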
Use explain() to analyze query execution plans and identify optimization opportunities. Query analysis: db.users.find({age: {$gt: 25}}).explain('executionStats');. Key metrics: executionTimeMillis, totalDocsExamined, totalKeysExamined. Look for COLLSCAN (collection scan - needs index) vs IXSCAN (index scan - good). Optimize by creating indexes on queried fields: db.users.createIndex({age: 1});. Check index usage: db.users.find({age: {$gt: 25}}).hint({age: 1}).explain();. For aggregations: db.sales.aggregate([{$match: {date: ISODate('2025-01-01')}}], {explain: true});. Follow ESR rule for compound indexes: Equality, Sort, Range. Monitor query performance and create indexes to reduce document scanning.
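A sketch of verifying the fix in mongosh (users collection as above; the key fields to read are stage and totalDocsExamined):

// Before indexing: winningPlan.stage is COLLSCAN, totalDocsExamined ~ collection size
printjson(db.users.find({ age: { $gt: 25 } }).explain('executionStats').executionStats);
db.users.createIndex({ age: 1 });
// After: an IXSCAN plan, with totalDocsExamined close to nReturned
printjson(db.users.find({ age: { $gt: 25 } }).explain('executionStats').queryPlanner.winningPlan);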
Change streams provide real-time notifications of data changes without polling. Basic usage: const changeStream = db.collection.watch(); changeStream.on('change', (change) => { console.log(change); });. Filter changes with a pipeline: const pipeline = [{$match: {operationType: {$in: ['insert', 'update']}}}]; let stream = db.collection.watch(pipeline);. Change events include: operationType (insert/update/delete/replace), documentKey (_id), fullDocument (complete document), updateDescription (modified fields). Resume capability for fault tolerance: let resumeToken; stream.on('change', change => { resumeToken = change._id; /* persist to database */ }); stream = db.collection.watch([], {resumeAfter: resumeToken});. Use startAfter to resume after invalidate events. Best practice (2025): Persist the resume token after processing each event to prevent data loss. Use cases: real-time dashboards, cache invalidation, ETL pipelines, notification services, cross-platform sync. Requirements: replica sets (MongoDB 3.6+) or sharded clusters (MongoDB 4.0+). Alternative: Use tailable cursors for capped collections.
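A sketch of fault-tolerant consumption with a persisted resume token (Node.js driver; the checkpoints collection and handleChange() are hypothetical):

// Resume from the last processed event after a restart
const checkpoints = client.db('meta').collection('checkpoints');
const saved = await checkpoints.findOne({ _id: 'orders-stream' });
const stream = client.db('shop').collection('orders')
  .watch([], saved ? { resumeAfter: saved.token } : {});
for await (const change of stream) {
  await handleChange(change);  // process first...
  await checkpoints.updateOne({ _id: 'orders-stream' },
    { $set: { token: change._id } }, { upsert: true });  // ...then checkpoint
}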
Projection specifies which fields to return, reducing network transfer and improving performance. Inclusion projection: db.users.find({}, {name: 1, email: 1, _id: 0}); returns only the name and email fields. Exclusion projection: db.users.find({}, {password: 0, secret: 0}); excludes sensitive fields. Array projections: $slice returns a subset, $elemMatch filters array elements: db.users.find({}, {scores: {$slice: 3}}); // First 3 scores. Covered queries occur when both the query filter and the projection use only indexed fields, avoiding document access entirely: db.users.find({status: 'active'}, {name: 1, status: 1, _id: 0}); // With index on {status: 1, name: 1}. Critical: explicitly exclude _id (project _id: 0), or the query is not covered. Always project only the fields you need.
Distributed transactions in sharded clusters (MongoDB 4.2+) use two-phase commit for ACID guarantees. Implementation: const session = client.startSession(); try { session.startTransaction({readConcern: {level: 'snapshot'}, writeConcern: {w: 'majority'}}); await orders.insertOne(doc, {session}); await inventory.updateOne(filter, update, {session}); await session.commitTransaction(); } catch (error) { await session.abortTransaction(); } finally { session.endSession(); }. Critical requirements (2025): Use snapshot read concern (only option for consistent multi-shard reads), w:'majority' write concern, include shard key in queries to avoid scatter-gather. Performance characteristics: Single-shard transactions have replica-set performance; cross-shard transactions incur 2-3x latency overhead. Limitations: 16MB transaction size limit, maxTimeMS defaults to transactionLifetimeLimitSeconds (60s). Best practices: Minimize cross-shard transactions via document design (embed related data), batch operations, set explicit timeouts (maxTimeMS on commitTransaction), implement retry logic for TransientTransactionError. Production tip: Monitor with sh.status() to verify shard key includes common query patterns.
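A sketch of the retry loop for transient errors (Node.js driver; txnFn is a hypothetical callback performing the session-tagged reads and writes):

// Retry the whole transaction when the server labels the error as transient
async function runWithRetry(client, txnFn) {
  const session = client.startSession();
  try {
    for (;;) {
      try {
        session.startTransaction({ readConcern: { level: 'snapshot' },
                                   writeConcern: { w: 'majority' } });
        await txnFn(session);
        await session.commitTransaction();
        return;
      } catch (err) {
        await session.abortTransaction();
        if (err.hasErrorLabel && err.hasErrorLabel('TransientTransactionError')) continue;
        throw err;
      }
    }
  } finally {
    await session.endSession();
  }
}

In practice the driver's withTransaction() helper wraps this pattern, including commit-time retries for UnknownTransactionCommitResult.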
Geospatial indexes enable location-based queries using earth-like sphere calculations. 2dsphere index for GeoJSON (recommended 2025): db.places.createIndex({location: '2dsphere'}); Store data as GeoJSON: {location: {type: 'Point', coordinates: [longitude, latitude]}}. Longitude: -180 to 180, Latitude: -90 to 90. Query nearby points: db.places.find({location: {$near: {$geometry: {type: 'Point', coordinates: [-73.9855, 40.7580]}, $maxDistance: 1000}}});. $near requires geospatial index (auto-sorted by distance). 2d index for legacy coordinate pairs: db.places.createIndex({coordinates: '2d'}); Query within area: db.places.find({coordinates: {$geoWithin: {$box: [[-74, 40], [-73, 41]]}}});. Operators: $geoIntersects (geometry intersections), $geoNear (sorted distance aggregation), $geoWithin (area containment). Index behavior: Always sparse (null/empty values not indexed). Uses WGS84 reference system for accurate spherical calculations. Use case: location-aware apps, proximity searches, delivery routing. Production note (2025): DB-Engines reports 35% annual growth in GeoJSON query usage.
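A sketch of $geoNear in an aggregation (places collection with the 2dsphere index from above; coordinates as in the $near example):

// Must be the first stage; returns documents sorted by distance
db.places.aggregate([
  { $geoNear: {
      near: { type: 'Point', coordinates: [-73.9855, 40.7580] },
      distanceField: 'distanceMeters',  // required: output field for the computed distance
      maxDistance: 1000,                // meters, because the geometry is GeoJSON
      spherical: true
  } },
  { $limit: 5 }
])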
Capped collections are fixed-size collections with automatic FIFO (first-in-first-out) overwrite when capacity is reached. Create: db.createCollection('logs', {capped: true, size: 10000000, max: 1000000}); // 10MB max size, 1M max documents. Performance characteristics: high-throughput inserts (no fragmentation, sequential writes), automatic oldest-document deletion, natural insertion order maintained. Tailable cursors for real-time streaming: const cursor = db.logs.find().addCursorFlag('tailable', true).addCursorFlag('awaitData', true); // Similar to tail -f. Use cases: high-volume logging, caching, pub/sub systems, oplog replication (MongoDB uses capped collections internally for the oplog). Limitations: cannot shard, cannot increase size after creation (drop and recreate required), cannot delete individual documents (you can only empty the whole collection), and updates cannot increase document size. Capped collections get the default _id index like any other collection. Alternative (2025): For flexible TTL-based expiration, prefer TTL indexes on regular collections. Modern recommendation: Use change streams instead of tailable cursors for most real-time notification needs.
MongoDB backup strategies by database size (2025): Small (<100GB): mongodump for logical backups: mongodump --uri='mongodb://secondary:27017' --oplog --gzip --archive=/backup/backup-$(date +%Y%m%d).gz; Restore: mongorestore --gzip --archive=/backup/backup-20250113.gz --oplogReplay;. Critical: Use --oplog for point-in-time recovery (PITR) capability, and run against a secondary to avoid impacting the primary. Medium-Large (100GB+): Use Percona Backup for MongoDB (PBM) with incremental backups, consistent oplog capture for sharded clusters, and PITR. Filesystem snapshots: Require journaling enabled; use fsyncLock/fsyncUnlock for consistency. Point-in-time recovery: The oplog captures incremental changes between backups. Best practices (2025): Set backup frequency based on RPO (Recovery Point Objective, the acceptable data loss), test restore procedures regularly, store backups off-site, automate with cron/systemd timers, and monitor backup success. Production script: mongodump --uri='mongodb://secondary' --oplog --gzip --archive | aws s3 cp - s3://backups/mongo-$(date +%Y%m%d).gz;
Production security best practices (2025): Enable authentication (MongoDB runs without auth by default): mongod --auth. Use SCRAM (default, recommended) or X.509 certificate authentication (X.509 requires TLS). Enable TLS encryption: mongod --tlsMode requireTLS --tlsCertificateKeyFile server.pem (the older --sslMode/--sslPEMKeyFile options are deprecated). Create role-based access control: db.createUser({user: 'app_user', pwd: 'strong_password', roles: [{role: 'readWrite', db: 'myapp'}]});. Network security: Bind to specific interfaces, whitelist IP addresses, never expose to the public internet without TLS. Database encryption: Use WiredTiger storage engine encryption at rest or Client-Side Field Level Encryption. Regular security updates: Keep MongoDB updated, monitor advisories. Audit logging: Enable auditing (auditLog configuration) for compliance. Least privilege principle: Create a unique user per person/application, grant minimal permissions, and rotate credentials periodically.
$lookup performs left outer joins in aggregation pipelines, combining documents from two collections. Syntax: {$lookup: {from: 'movies', localField: 'movie_id', foreignField: '_id', as: 'movie_details'}}. Critical optimization: Create index on foreignField (massive bottleneck without it - causes full collection scan per document). Best practices: Place $match before $lookup to reduce input dataset, use pipeline syntax for filtered joins (MongoDB 3.6+), avoid pagination after $lookup. Performance impact: Without index on foreignField, query scans entire foreign collection for each input document. For frequent joins, consider embedded documents or denormalization instead. $lookup only performs left outer joins (unmatched documents get empty array).
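A sketch of the pipeline form mentioned above, joining only matching foreign documents (the reviews collection and its fields are illustrative):

// Filtered join: pull in only 2025 reviews per movie instead of all of them
db.movies.aggregate([
  { $lookup: {
      from: 'reviews',
      let: { movieId: '$_id' },
      pipeline: [
        { $match: { $expr: { $eq: ['$movie_id', '$$movieId'] },
                    date: { $gte: ISODate('2025-01-01') } } }
      ],
      as: 'recent_reviews'
  } }
])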