user2105469
user2105469

Reputation: 1433

mongodb: counting unique array elements

I'm trying to get an overall count of array elements across an entire collection.

The input collection is

 db.test.insert(    { "_id" : 1, "item" : "A", "hx" : [ { "label" : "new"    , "price" : 10.99 , "category" : "P1"} , 
                                                { "label" : "active" , "price" : 12.99 , "category" : "P2"} ] } )

 db.test.insert(    { "_id" : 2, "item" : "B", "hx" : [ { "label" : "new"    , "price" : 10.99 , "category" : "P2"} , 
                                                { "label" : "active" , "price" :  8.99 , "category" : "P3"} ] } )

 db.test.insert(    { "_id" : 3, "item" : "C", "hx" : [ { "label" : "new"    , "price" : 10.99 , "category" : "P1"} , 
                                                { "label" : "active" , "price" : 15.99 , "category" : "P4"} ] } )

I've tried variations of

 a = db.test.aggregate(
    [ { 
          $group: { 
                   _id : "$cat", 
                   cat : { $addToSet : "$hx.category" } 
                  } 
      },  
      { $unwind : "$hx.category" } , 
      { $group : { _id : "$cat", count: { $sum : 1 } } }
    ])

to get to

  db.res.insert({ "category" : "P1" , "count" : 2 })
  db.res.insert({ "category" : "P2" , "count" : 2 })
  db.res.insert({ "category" : "P3" , "count" : 1 })
  db.res.insert({ "category" : "P4" , "count" : 1 })

but the result I get is

 > a
 { "result" : [ ], "ok" : 1 }

Eventually I'd like to run this within pymongo: any hint towards a pymongo implementation is much appreciated.

Upvotes: 2

Views: 99

Answers (3)

chridam
chridam

Reputation: 103335

You could try the following aggregation:

>>> from bson.son import SON
>>> pipeline = [
...     {"$unwind": "$hx"},
...     {"$group": {"_id": "$hx.category", "count": {"$sum": 1}}},
...     {"$project": SON([("count", -1), ("_id", -1)])}
... ]
>>> list(db.test.aggregate(pipeline))
[{u'count': 1, u'_id': u'P3'}, {u'count': 2, u'_id': u'P2'}, {u'count': 1, u'_id': u'P4'}, {u'count': 2, u'_id': u'P1'}]

Upvotes: 2

RomanKonz
RomanKonz

Reputation: 1027

I assume that your category field is a set with unique entries.

db.test.aggregate([ 
    {$unwind: "$hx"},
    {$group: {_id: "$hx.category", count:{$sum:1}}}
])

Upvotes: 1

Azee
Azee

Reputation: 1829

Looks like a task for a Map-Reduce

Upvotes: 0

Related Questions