
MongoDB aggregate() Builder

Build MongoDB aggregate() pipelines with $match, $group, $sort, $limit.


  

MongoDB Aggregation Pipeline

The aggregation pipeline is MongoDB's main data-processing primitive. A pipeline is an array of stages executed in order: each stage transforms the stream of documents and feeds the next, as in db.collection.aggregate([{ $match: ... }, { $group: ... }, { $sort: ... }]). It supersedes the legacy mapReduce command, runs inside the server with native operators rather than JavaScript, and underpins most dashboards, BI reports and analytics queries built on MongoDB.

Core stages

  • $match: filter like find; put it first to leverage indexes.
  • $project, $addFields/$set, $replaceRoot: reshape documents.
  • $group: group by an expression and apply accumulators.
  • $sort, $limit, $skip, $count, $sample.
  • $unwind: denormalise arrays into one document per element.
  • $lookup: left-outer JOIN against another collection.
  • $facet: run several sub-pipelines over the same input, returning a single document.
  • $bucket, $bucketAuto: histograms.
  • $out, $merge: write results into a collection (materialised views).
  • $densify (MongoDB 5.1+), $fill (5.3+): time-series gap-filling.
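
As a rough illustration of how a builder can assemble these stages, the sketch below turns a plain options object into a pipeline array. The function name buildPipeline and its option names are hypothetical, chosen for this example only; they are not part of MongoDB or any driver.

```javascript
// Hypothetical helper: buildPipeline and its option names are illustrative,
// not a MongoDB or driver API. It only produces a plain array of stage objects.
function buildPipeline({ match, groupBy, accumulators, sort, limit } = {}) {
  const stages = [];
  // $match goes first so the server can use an index on the filtered fields.
  if (match && Object.keys(match).length > 0) stages.push({ $match: match });
  // groupBy may be null, which groups every document into a single bucket.
  if (groupBy !== undefined) stages.push({ $group: { _id: groupBy, ...accumulators } });
  if (sort && Object.keys(sort).length > 0) stages.push({ $sort: sort });
  if (Number.isInteger(limit) && limit > 0) stages.push({ $limit: limit });
  return stages;
}

// Usage: the resulting array can be passed to db.orders.aggregate(...).
const topCustomers = buildPipeline({
  match: { status: "completed" },
  groupBy: "$userId",
  accumulators: { total: { $sum: "$amount" } },
  sort: { total: -1 },
  limit: 10,
});
console.log(JSON.stringify(topCustomers, null, 2));
```

Stages that are not requested are simply omitted, so the output stays a valid pipeline for any subset of options.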

Group accumulators and expressions

Inside $group: $sum, $avg, $min, $max, $count (5.0+), $push (array), $addToSet (set of distinct values), $first, $last, $stdDevPop/$stdDevSamp, $median/$percentile (7.0+). Expression operators work in most stages: conditionals $cond (if/else), $switch, $ifNull; arithmetic $multiply, $divide, $mod; string ops $concat, $toUpper, $regexMatch; date ops $dateToString, $dateDiff (5.0+).
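
A minimal sketch of how accumulators and expressions combine, written as plain stage objects. The field names (userId, amount, status, productId) and the tier thresholds are assumptions made up for illustration.

```javascript
// Assumed document shape: { userId, amount, status, productId }.
// All field names and thresholds here are illustrative assumptions.
const groupStage = {
  $group: {
    _id: "$userId",
    revenue: { $sum: "$amount" },
    avgOrder: { $avg: "$amount" },
    largest: { $max: "$amount" },
    // Conditional count: adds 1 only for refunded orders.
    refunded: { $sum: { $cond: [{ $eq: ["$status", "refunded"] }, 1, 0] } },
    // Distinct product ids bought by this user.
    products: { $addToSet: "$productId" },
  },
};

const labelStage = {
  $addFields: {
    // $switch picks the first branch whose case evaluates to true.
    tier: {
      $switch: {
        branches: [
          { case: { $gte: ["$revenue", 1000] }, then: "gold" },
          { case: { $gte: ["$revenue", 100] }, then: "silver" },
        ],
        default: "bronze",
      },
    },
    // $ifNull guards against a null average before rounding to 2 decimals.
    avgOrder: { $round: [{ $ifNull: ["$avgOrder", 0] }, 2] },
  },
};

const expressionPipeline = [groupStage, labelStage];
console.log(JSON.stringify(expressionPipeline, null, 2));
```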

Worked example: top 10 customers

db.orders.aggregate([
  { $match: { status: "completed", createdAt: { $gte: ISODate("2025-01-01") } } },
  { $group: { _id: "$userId", total: { $sum: "$amount" }, orders: { $sum: 1 } } },
  { $sort: { total: -1 } },
  { $limit: 10 },
  { $lookup: { from: "users", localField: "_id", foreignField: "_id", as: "user" } },
  { $unwind: "$user" },
  { $project: { _id: 0, name: "$user.name", total: 1, orders: 1 } }
])
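
To make the stage semantics concrete, here is a plain JavaScript analogue of the $match, $group, $sort and $limit steps, run against an in-memory array. It is only a sketch: the function name topCustomersInMemory and the sample data are invented for illustration, and the date filter, $lookup, $unwind and $project stages are left out.

```javascript
// Hypothetical in-memory analogue of the first four stages of the pipeline.
function topCustomersInMemory(orders, limit) {
  const totals = new Map();
  for (const order of orders) {
    if (order.status !== "completed") continue; // $match
    const group = totals.get(order.userId) ?? { _id: order.userId, total: 0, orders: 0 };
    group.total += order.amount; // total: { $sum: "$amount" }
    group.orders += 1;           // orders: { $sum: 1 }
    totals.set(order.userId, group);
  }
  return [...totals.values()]
    .sort((a, b) => b.total - a.total) // $sort: { total: -1 }
    .slice(0, limit);                  // $limit
}

// Invented sample data, for illustration only.
const sampleOrders = [
  { userId: "u1", amount: 50, status: "completed" },
  { userId: "u2", amount: 30, status: "completed" },
  { userId: "u1", amount: 20, status: "completed" },
  { userId: "u3", amount: 99, status: "pending" },
];
console.log(topCustomersInMemory(sampleOrders, 1));
```

With this sample data the call returns a single group for u1 with total 70 and orders 2, because the pending order from u3 is filtered out before grouping.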

Performance

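Rules of thumb: push $match and $project as early as possible. Index use on the source collection is limited to the start of the pipeline, so a leading $match and an adjacent $sort matter most; later stages operate on intermediate documents that have no indexes. The optimiser reorders stages when it is safe to do so. Memory limit: 100 MB of RAM per stage; pass { allowDiskUse: true } for large groupings and sorts (from MongoDB 6.0 the server allows spilling to disk by default). Each document is still capped at 16 MB. Inspect plans with db.collection.explain('executionStats').aggregate(pipeline). Atlas Search adds the Lucene-powered $search stage for full-text queries, and Atlas Vector Search adds $vectorSearch for vector queries.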
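
From application code, the same checks might look like the sketch below. It assumes `collection` is a Collection object from the official mongodb Node.js driver, whose aggregation cursor exposes explain() and toArray(); the wrapper name runLargeAggregation is hypothetical.

```javascript
// Assumption: `collection` comes from the official `mongodb` Node.js driver.
// runLargeAggregation is a hypothetical wrapper, not a driver function.
async function runLargeAggregation(collection, pipeline) {
  // Let large $group/$sort stages spill to temporary files on disk.
  const options = { allowDiskUse: true };
  // First ask the server how it would execute the pipeline...
  const plan = await collection.aggregate(pipeline, options).explain("executionStats");
  // ...then run it for real and collect the results.
  const results = await collection.aggregate(pipeline, options).toArray();
  return { plan, results };
}
```

In the returned plan, an IXSCAN stage indicates that the leading $match or $sort used an index, while COLLSCAN indicates a full collection scan.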

FAQ

Does aggregation replace MapReduce? Yes. mapReduce has been deprecated since MongoDB 5.0; aggregation is generally faster and covers almost every mapReduce pattern with native operators instead of JavaScript.

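Will my $match use an index? Yes, if it runs at the start of the pipeline against an indexed field. The optimiser can also move a $match ahead of a $project or $addFields when the filter does not depend on computed fields. A $match that follows $group, or that filters on computed fields, runs against intermediate, unindexed documents.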

Is the pipeline too complex to author by hand? Use MongoDB Compass: its Aggregations tab is a visual builder with live preview. Studio 3T offers another visual, stage-by-stage editor and can export the pipeline as JS/Python/Java code.

Can I materialise a view? Yes: finish the pipeline with $out: "summary" (replaces the target collection) or $merge (inserts into or updates an existing collection), and schedule it with an Atlas Trigger or an external cron job. Read-only views created with db.createView() also reuse pipelines.
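
A minimal sketch of the $merge variant: the helper below appends a $merge stage to an existing pipeline. The name withMerge and the target collection "summary" are illustrative, and the choice of on: "_id" with replace/insert behaviour is one reasonable configuration, not the only one.

```javascript
// Hypothetical helper: returns a new pipeline that ends with a $merge stage.
function withMerge(pipeline, targetCollection) {
  return [
    ...pipeline, // copy, so the caller's array is not mutated
    {
      $merge: {
        into: targetCollection,
        on: "_id",                // match existing documents by _id
        whenMatched: "replace",   // overwrite the stored summary document
        whenNotMatched: "insert", // add summaries for new groups
      },
    },
  ];
}

// Usage: per-user totals written into the "summary" collection.
const summaryPipeline = withMerge(
  [{ $group: { _id: "$userId", total: { $sum: "$amount" } } }],
  "summary"
);
console.log(JSON.stringify(summaryPipeline, null, 2));
```

Unlike $out, this leaves documents in the target collection that the current run did not produce, which suits incremental refreshes.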
