davidx1
davidx1

Reputation: 3673

MongoDB: Counting how many items with a given value exist in an array, that's in a document?

I have a document that looks something like this

{
  name : james,
  books : [
    {
      title: title1,
      year: 1990
    },
    {
      title: title2,
      year: 1990
    },
    {
      title: title3,
      year: 1991
    }
  ]
}

Say if I want to count how many books james owns with the year of 1990, how would I go about doing that? I've tried the following. But I realized it doesn't work because 'books' is an array.

db.collection(collectionName).find({name:james, books: {year: 1990}}).count(function(book_count){
  console.log(book_count);
  }

Any pointers would be much appreciated. Thanks!

EDIT:

I did see on another answer than you can use this code below to get the size of the whole array. But I am wondering how to get a count of items in the array with a particular parameter. ie. instead of seeing how many books james owns. I want to know how many of james' book are published in 1990.

db.mycollection.aggregate({$project: { count: { $size:"$foo" }}})

Upvotes: 2

Views: 2273

Answers (2)

chridam
chridam

Reputation: 103355

The aggregation framework is ideal for such. Consider running the following pipeline to get the desired result.

pipeline = [
    {
        "$match": {
            "name": "james",
            "books.year": 1990
        }
    },
    {
        "$project": {
            "numberOfBooks": {
                "$size": {                  
                    "$filter": {
                        "input": "$books",
                        "as": "el",
                        "cond": { "$eq": [ "$$el.year", 1990 ] }
                    }                   
                }
            }
        }
    }
];
db.collection.pipeline(pipeline);

The above pipeline uses the new $filter operator available for MongoDB 3.2 to produce an array which meets the specified condition i.e. it filters outer elements that do not satisfy the criteria. The initial $match pipeline is necessary to filter out documents getting into the aggregation pipeline early as a pipeline optimization strategy.

The $size operator which accepts a single expression as argument then gives you the number of elements in the resulting array, thus you have your desired book count.


For an alternative solution which does not use the $filter operator not found in earlier versions, consider the following pipeline operation:

pipeline = [
    {
        "$match": {
            "name": "james",
            "books.year": 1990
        }
    },
    {
        "$project": {
            "numberOfBooks": {
                "$size": {                  
                    "$setDifference": [
                        {
                            "$map": {
                                "input": "$books",
                                "as": "el",
                                "in": {
                                    "$cond": [
                                        { "$eq": [ "$$el.year", 1990 ] },
                                        "$$el",
                                        false
                                    ]
                                }
                            }
                        },
                        [false]
                    ]                   
                }
            }
        }
    }
];
db.collection.pipeline(pipeline);

The $project pipeline stage involves fittering the books array so that you remove the documents which do not have the year 1990. This is made possible through the $setDifference and $map operators.

The $map operator in essence creates a new array field that holds values as a result of the evaluated logic in a subexpression to each element of an array. The $setDifference operator then returns a set with elements that appear in the first set but not in the second set; i.e. performs a relative complement of the second set relative to the first. In this case it will return the final books array that has elements with year 1990 and subsequently the $size calculates the number of elements in the resulting array, thus giving you the book count.


For a solution that uses the $unwind operator, bearing in mind that (thanks to this insightful response from @BlakesSeven in the comments):

Since there is only a single document returned with nothing but a null key and a count, there is no more chance for this breaking that limit than the previous operation with the same output. It's not that $unwind "breaks the limit", it's that it "produces a copy of each document per array entry", which uses more memory ( possible memory cap on aggregation pipelines of 10% total memory ) and therefore also takes "time" to produce as well as "time" to process.

and as a last resort, run the following pipeline:

pipeline = [
    {
        "$match": {
            "name": "james",
            "books.year": 1990
        }
    },
    { "$unwind": "$books" },
    {
        "$match": { "books.year": 1990 }
    },
    {
        "$group": {
            "_id": null
            "count": { "$sum": 1 }
        }
    }
]
db.collection.pipeline(pipeline)

Upvotes: 2

cubbuk
cubbuk

Reputation: 7920

You can use $elemMatch in projection to retrieve the document with only the matching books.

db.collection(collectionName).findOne({name:james, books: {year: 1990}}, { books: { $elemMatch: { year: 1990 } } }). // returned document will only contains books having the year 1990.

If you want only the count then you need to use aggregation framework. First match the documents, then unwind the books array, then match against year field. Something like following should work:

db.collection(collectionName).aggregate([{$match: {name: "james"}}, {$unwind:"$books"}, {$match:{"books.year":1990}}]

Upvotes: 0

Related Questions