user3105453

Reputation: 1981

Combine aggregation pipelines with update operators

I would like to aggregate different measurements into a single document acting as a bucket. I receive individual measurements in this format:

[
  {
    "datetime": "2020-08-03T08:00:01.648475Z",
    "celsius": 20.7,
    "humidity": 57
  },
  {
    "datetime": "2020-08-03T08:05:07.756834Z",
    "celsius": 18.9,
    "humidity": 58
  }
]

The aggregated bucket should look like this:

{
  "_id": "2020-08-03",
  "celsiusHigh": 20.7,
  "celsiusLow": 18.9,
  "firstMeasurementAt": "2020-08-03T08:00:01.648475Z",
  "lastMeasurementAt": "2020-08-03T08:05:07.756834Z",
  "humidityHigh": 58,
  "humidityLow": 57,
  "measurements": [
    {
      "datetime": "2020-08-03T08:00:01.648475Z",
      "celsius": 20.7,
      "humidity": 57
    },
    {
      "datetime": "2020-08-03T08:05:07.756834Z",
      "celsius": 18.9,
      "humidity": 58
    }
  ]
}

Since MongoDB 4.2, update commands accept an aggregation pipeline. However, such a pipeline supports only the stages $addFields, $set, $project, $unset, $replaceRoot, and $replaceWith. I would also like to use regular update operators, such as $setOnInsert (to set fields only once, upon insertion), $currentDate (to set e.g. lastUpdatedAt), or $addToSet, to set some of the bucket's fields. Without $addToSet, the operation looks quite verbose:

{ $set : { "measurements": { $setUnion : [ { $ifNull: [ "$measurements", [] ] } , [ { "datetime": "2020-08-03T08:05:07.756834Z", "celsius": 18.9, "humidity": 58 } ] ] } } }

Is it possible to combine aggregation pipelines with update operators in a single update?

Upvotes: 1

Views: 116

Answers (1)

Tom Slabbaert

Reputation: 22286

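Update operators only manipulate the structure of a single document at a time, so you can't achieve the result you want with them.

What you can do instead is run an aggregation pipeline that ends in $out, which writes the grouped result to a collection (note that $out replaces the target collection if it already exists):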

db.collection.aggregate([
    {
        $group: {
            _id: {$arrayElemAt: [{$split: ["$datetime", "T"]}, 0]},
            celsiusHigh: {$max: "$celsius"},
            celsiusLow: {$min: "$celsius"},
            firstMeasurementAt: {$min: "$datetime"},
            lastMeasurementAt: {$max: "$datetime"},
            humidityHigh: {$max: "$humidity"},
            humidityLow: {$min: "$humidity"},
            measurements: {$push: "$$ROOT"}
        }
    },
    {
        $out: "collection_name"
    }
]);

Note that this assumes all datetimes are stored as strings in the same format and time zone. Without that, splitting on "T" and comparing the strings with $min / $max will not give reliable results.
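If the bucket has to be maintained one measurement at a time rather than rebuilt from the raw collection, the pipeline form of update can approximate the operators mentioned in the question. The following is only a sketch under some assumptions: `buildBucketUpdate` and `bucketId` are hypothetical helper names, the `buckets` collection and the `createdAt` / `lastUpdatedAt` fields are made up for illustration, and it relies on the expression forms of `$min` / `$max` ignoring missing fields. Here `$ifNull` stands in for $setOnInsert, `$$NOW` for $currentDate, and `$setUnion` for $addToSet.

```javascript
// Hypothetical helper: derives the day bucket id from an ISO 8601 string,
// mirroring the $split on "T" used in the $group stage above.
function bucketId(m) {
  return m.datetime.split("T")[0];
}

// Hypothetical helper: builds an update pipeline (MongoDB 4.2+) that folds
// one measurement into its bucket document.
function buildBucketUpdate(m) {
  return [
    {
      $set: {
        // Running extremes; on a fresh upsert the fields are missing and
        // the new value is expected to win.
        celsiusHigh: { $max: ["$celsiusHigh", m.celsius] },
        celsiusLow: { $min: ["$celsiusLow", m.celsius] },
        humidityHigh: { $max: ["$humidityHigh", m.humidity] },
        humidityLow: { $min: ["$humidityLow", m.humidity] },
        // String comparison; only valid if all datetimes share one format.
        firstMeasurementAt: { $min: ["$firstMeasurementAt", m.datetime] },
        lastMeasurementAt: { $max: ["$lastMeasurementAt", m.datetime] },
        // Stand-in for $addToSet, as in the question.
        measurements: {
          $setUnion: [{ $ifNull: ["$measurements", []] }, [m]]
        },
        // Stand-in for $setOnInsert: keep the existing value if present.
        createdAt: { $ifNull: ["$createdAt", "$$NOW"] },
        // Stand-in for $currentDate.
        lastUpdatedAt: "$$NOW"
      }
    }
  ];
}

// Usage in the mongo shell (collection name "buckets" is an assumption):
// db.buckets.updateOne(
//   { _id: bucketId(m) },
//   buildBucketUpdate(m),
//   { upsert: true }
// );
```

This does not mix update operators into the pipeline (a single update is either an operator document or a pipeline, not both); it only rebuilds their effect with aggregation expressions. Also note that $setUnion does not guarantee the order of the resulting array.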

Upvotes: 1
