fgalan

Reputation: 12294

Alternative to $addToSet to process elements in an array (instead of the array as a whole)

I have a collection in MongoDB with documents which follow this pattern:

{
        "_id" : {
                "id" : "ID1",
                "type" : "TYPE1"
        },
        "attrs" : [
                {
                        "name" : "ATTR1",
                        "value" : "foo"
                },
                {
                        "name" : "ATTR2",
                        "type" : "bar"
                },
                ...
                {
                        "name" : "ATTRn",
                        "value" : "blabla"
                }
        ]
}

Each document in the collection represents an entity (with a unique ID and type) and a set of attributes. Each document can have a variable number of attributes, even among documents of the same type (i.e. two documents with the same _id.type could have different sets of attributes).

I would like to get the names of the attributes associated with a given type (that is, the union of the attribute sets of all documents of that type). I'm trying the following:

db.runCommand({aggregate: "col", pipeline: [ {$group: {_id: "$_id.type", attr: {$addToSet: "$attrs.name"}} }]})

Which results in:

{
        "result" : [
                {
                        "_id" : "TYPE1",
                        "attr" : [
                                [
                                        "ATTR1",
                                        "ATTR2",
                                        "ATTR3"
                                ],
                                [
                                        "ATTR4",
                                        "ATTR5"
                                ]
                        ]
                },
                ...                    
        ],
        "ok" : 1
}

The problem is that $addToSet doesn't process an array element by element. Instead, it treats the whole array as a single element and adds it as such. Thus, what I get at the end is an "array of arrays", while what I would like to have is something like this:

{
        "result" : [
                {
                        "_id" : "TYPE1",
                        "attr" : [
                                "ATTR1",
                                "ATTR2",
                                "ATTR3",
                                "ATTR4",
                                "ATTR5"
                        ]
                },
                ...                    
        ],
        "ok" : 1
}

How should the above query be reformulated in order to get this result?

Upvotes: 0

Views: 1320

Answers (2)

King Friday

Reputation: 26076

Take another look if you are on MongoDB v2.4 or later.

$addToSet combined with $each works perfectly for saving and updating when you want to add array items only if they don't already exist.

{ $addToSet: { <field>: { $each: [ <value1>, <value2> ... ] } } }

reference: https://docs.mongodb.com/manual/reference/operator/update/each/#up._S_each
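To illustrate the difference $each makes, here is a minimal sketch in plain JavaScript that mimics the update operator on an in-memory document. It does not talk to MongoDB. The helper names (sameValue, addToSet, addToSetEach) and the sample document are hypothetical, and the JSON-based equality check is only an approximation of how the server compares values.

```javascript
// Approximate value equality (assumption: JSON text comparison is good enough here).
const sameValue = (a, b) => JSON.stringify(a) === JSON.stringify(b);

// Mimics {$addToSet: {<field>: value}}: the value is added as ONE element,
// even when the value is itself an array.
function addToSet(doc, field, value) {
  const current = doc[field] || [];
  if (current.some((item) => sameValue(item, value))) {
    return doc; // already present, nothing to add
  }
  return { ...doc, [field]: [...current, value] };
}

// Mimics {$addToSet: {<field>: {$each: values}}}: every value is added
// separately, and values that already exist are skipped.
function addToSetEach(doc, field, values) {
  return values.reduce((acc, value) => addToSet(acc, field, value), doc);
}

// Hypothetical sample document.
const doc = { _id: 1, tags: ["a", "b"] };

const withEach = addToSetEach(doc, "tags", ["b", "c"]);
const withoutEach = addToSet(doc, "tags", ["b", "c"]);

console.log(JSON.stringify(withEach.tags));    // ["a","b","c"]
console.log(JSON.stringify(withoutEach.tags)); // ["a","b",["b","c"]]
```

Note that this models the update operator only; the $addToSet accumulator used inside $group in the question is a separate operator.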

Upvotes: 1

JohnnyHK

Reputation: 311865

You need to $unwind the attrs array before grouping:

db.col.aggregate([
    {$unwind: '$attrs'},
    {$group: {_id: "$_id.type", attr: {$addToSet: "$attrs.name"}} }
])

Output:

{
    "result" : [ 
        {
            "_id" : "TYPE1",
            "attr" : [ 
                "ATTRn", 
                "ATTR2", 
                "ATTR1"
            ]
        }
    ],
    "ok" : 1
}

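The $unwind stage emits one copy of each doc per element of attrs, with attrs replaced by that single element, so $addToSet then sees individual names instead of whole arrays.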
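For anyone who wants to see the mechanics without a database, here is a rough sketch in plain JavaScript that mimics the two stages on an in-memory array. The sample documents and the function names (unwindAttrs, groupNamesByType) are made up for illustration. Real $addToSet gives no ordering guarantee, so the element order shown here is specific to this sketch.

```javascript
// Hypothetical sample documents shaped like the ones in the question.
const docs = [
  { _id: { id: "ID1", type: "TYPE1" },
    attrs: [{ name: "ATTR1" }, { name: "ATTR2" }, { name: "ATTR3" }] },
  { _id: { id: "ID2", type: "TYPE1" },
    attrs: [{ name: "ATTR3" }, { name: "ATTR4" }] },
  { _id: { id: "ID3", type: "TYPE2" },
    attrs: [{ name: "ATTR1" }] },
];

// Mimics {$unwind: '$attrs'}: one output document per array element,
// with attrs replaced by that single element.
function unwindAttrs(documents) {
  return documents.flatMap((doc) =>
    doc.attrs.map((attr) => ({ ...doc, attrs: attr }))
  );
}

// Mimics {$group: {_id: "$_id.type", attr: {$addToSet: "$attrs.name"}}}.
function groupNamesByType(unwound) {
  const namesByType = new Map();
  for (const doc of unwound) {
    const type = doc._id.type;
    if (!namesByType.has(type)) {
      namesByType.set(type, new Set());
    }
    namesByType.get(type).add(doc.attrs.name); // a Set drops duplicates
  }
  return [...namesByType].map(([type, names]) => ({ _id: type, attr: [...names] }));
}

const unwound = unwindAttrs(docs);
const result = groupNamesByType(unwound);

console.log(unwound.length); // 6
console.log(JSON.stringify(result));
```

Because each unwound document carries a single attribute, the set for TYPE1 ends up holding the union of names across both TYPE1 documents, with ATTR3 appearing once.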

Upvotes: 3
