MongoDB aggregate get distinct values of field and output list of another field

Question

I have a MongoDB collection with documents like:

{'city': 'NYC', 'value': 'blue'},
{'city': 'NYC', 'value': 'red'},
{'city': 'Boston', 'value': 'blue'},
{'city': 'Boston', 'value': 'green'}

For each distinct city, I want a list of the distinct values of value, like:

{'city': 'NYC', 'values': ['blue', 'red']},
{'city': 'Boston', 'values': ['blue', 'green']}

How can I do this in a PyMongo pipeline?

My attempt so far looks something like this:

cursor = db.aggregate([
        {'$group': {
            '_id': {
                'value': '$value',
                'city': '$city'
            }
        }},
])

hhharsha36 · Accepted Answer

In the _id field of the $group stage, specify only the key(s) you want to group by (city in your case).

Every other key in the $group stage is an output field computed by an accumulator. $addToSet adds each value it sees within a group to an array, skipping duplicates. The order of the elements in that array is not guaranteed.
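To make the grouping semantics concrete, here is a plain-Python sketch of roughly what $group with $addToSet computes. It does not touch MongoDB at all; the function name group_distinct and its parameters are made up for illustration, and the result is sorted only to make it stable, which the server does not do.

```python
from collections import defaultdict


def group_distinct(docs, key="city", field="value"):
    """Illustrative analogue of $group + $addToSet (hypothetical helper)."""
    grouped = defaultdict(set)
    for doc in docs:
        # A set drops duplicates, much like $addToSet does per group.
        grouped[doc[key]].add(doc[field])
    # Sorted here for a stable result; $addToSet itself guarantees no order.
    return {k: sorted(v) for k, v in grouped.items()}


# Sample documents taken from the question.
docs = [
    {"city": "NYC", "value": "blue"},
    {"city": "NYC", "value": "red"},
    {"city": "Boston", "value": "blue"},
    {"city": "Boston", "value": "green"},
]
result = group_distinct(docs)
```

The point is only that the group key decides which bucket a document lands in, and the accumulator decides what is collected inside each bucket.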

Below is the aggregation code you are looking for:

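# "collection" is a placeholder; use your own collection's name.
cursor = db.collection.aggregate([
  {
    "$group": {
      "_id": "$city",  # group key: one output document per city
      "values": {
        "$addToSet": "$value"  # collect each distinct value once
      }
    }
  },
])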

In the above code, _id holds the city name for each group, and the array field holds that city's distinct values.
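If you want the output shaped exactly as in the question (a city key instead of _id), one option is to add a $project stage after the $group. The following is a minimal sketch, not the only way to do it: build_pipeline and distinct_values_by_city are hypothetical helper names, and the collection argument is assumed to be a PyMongo Collection object that you obtain yourself.

```python
def build_pipeline():
    """Return the aggregation pipeline as a plain list of stages."""
    return [
        # One document per city, with its distinct values in an array.
        {"$group": {"_id": "$city", "values": {"$addToSet": "$value"}}},
        # Rename _id to city and drop _id from the output.
        {"$project": {"_id": 0, "city": "$_id", "values": 1}},
    ]


def distinct_values_by_city(collection):
    """Run the pipeline; `collection` is assumed to be a PyMongo Collection."""
    return list(collection.aggregate(build_pipeline()))
```

Usage would look like distinct_values_by_city(db["cities"]), where "cities" is a made-up collection name. Since $addToSet does not guarantee element order, sort the arrays in Python if you need a stable order.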
