KingFish

Reputation: 9163

mongodb aggregation: Chaining aggregation

Not sure if this can be done, but I gotta ask:

Can I send multiple aggregations in one request? In other words, rather than doing the following:

one_results = db.results.aggregate([ 
    { $project: { _id: 0, key: "$Field1" }}, 
    { $group: { '_id': '$key', count: { $sum: 1 }}} ])

two_results = db.results.aggregate([ 
    { $project: { _id: 0, key: "$Field2" }}, 
    { $group: { '_id': '$key', count: { $sum: 1 }}} ])

I want to do something like this:

[one_results, two_results] = db.results.aggregate(
  [ 
    { $project: { _id: 0, key: "$Field1" }}, 
    { $group: { '_id': '$key', count: { $sum: 1 }}} 
  ],
  [ 
    { $project: { _id: 0, key: "$Field2" }}, 
    { $group: { '_id': '$key', count: { $sum: 1 }}} 
  ])

I know it's a stretch, but I gotta ask...

Thanks

Upvotes: 1

Views: 3096

Answers (2)

Harty911

Reputation: 97

Yes, it's possible using the $facet stage (available since MongoDB 3.4); see the MongoDB documentation.

In your use case:

results = db.results.aggregate([
  { $facet:
    {
      one_result: [
          { $project: { _id: 0, key: "$Field1" }},
          { $group: { '_id': '$key', count: { $sum: 1 }}} ],
      two_result: [
          { $project: { _id: 0, key: "$Field2" }},
          { $group: { '_id': '$key', count: { $sum: 1 }}} ]
    }
  }
]).toArray();  // aggregate() returns a cursor; toArray() makes results[0] indexable

// result 1
... = results[0].one_result

// result 2
... = results[0].two_result 

Upvotes: 2

WiredPrairie

Reputation: 59763

The technical answer is no: aggregate() accepts a single pipeline, so passing several pipelines in one call is not supported.

The overhead of making the request to the server pales in comparison to the effort of calculating the aggregation results, so there is little benefit in sending two requests as an array. With many drivers, you could just send two separate requests, even asynchronously, which gives you similar if not better results (if the load can be distributed).
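To make the "two separate requests, asynchronously" idea concrete, here is a rough sketch using the pipelines from the question. It assumes `collection` behaves like the official Node.js driver's Collection, i.e. `aggregate(pipeline).toArray()` returns a Promise; `buildCountPipeline` and `countBothFields` are names made up for this example, not part of any API.

```javascript
// Build the "count documents per distinct value of `field`" pipeline
// from the question. (Hypothetical helper, not a driver function.)
function buildCountPipeline(field) {
  return [
    { $project: { _id: 0, key: "$" + field } },
    { $group: { _id: "$key", count: { $sum: 1 } } },
  ];
}

// Start both aggregations without waiting in between, then wait for both.
// Assumption: collection.aggregate(pipeline).toArray() returns a Promise,
// as with the official Node.js driver.
async function countBothFields(collection) {
  const [oneResults, twoResults] = await Promise.all([
    collection.aggregate(buildCountPipeline("Field1")).toArray(),
    collection.aggregate(buildCountPipeline("Field2")).toArray(),
  ]);
  return { oneResults, twoResults };
}
```

Both requests are in flight at the same time, so the caller waits roughly as long as the slower of the two rather than the sum of both.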

While you could do multiple calculations in one pipeline, you would want to avoid combining potentially unrelated aggregation calculations; doing so may cost more CPU and I/O than it's worth. Further, many pipelines wouldn't align well enough to be combined into one.

For example, if the first stages of the two pipelines were $match stages that selected very different subsets of documents, it wouldn't be practical to merge them (it's often recommended to filter out as many documents as possible, using an index, in the first step of a pipeline).
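As a small illustration of that last point, here is a sketch of two pipelines that each start with their own $match. The `status` field and its values are invented for the example, and `buildFilteredCountPipeline` is a hypothetical helper, not a MongoDB function.

```javascript
// Hypothetical helper: filter first, then count per distinct value of `field`.
function buildFilteredCountPipeline(filter, field) {
  return [
    { $match: filter }, // first stage, so an index on the filtered field can be used
    { $project: { _id: 0, key: "$" + field } },
    { $group: { _id: "$key", count: { $sum: 1 } } },
  ];
}

// Two pipelines selecting very different subsets of documents.
// The `status` field and its values are made up for illustration.
const activePipeline = buildFilteredCountPipeline({ status: "active" }, "Field1");
const archivedPipeline = buildFilteredCountPipeline({ status: "archived" }, "Field2");
```

Run as separate aggregate() calls, each pipeline only touches the documents its own $match selects; folding them into one pipeline would mean reading the union of both subsets before splitting the work again.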

Upvotes: 3
