alex.ac
alex.ac

Reputation: 1063

Reading categories and number of articles in a single query

I have a database in MongoDb that contains two collections: 'categories' and 'articles'. I'm using Mongoose on NodeJs to connect to the database and read the categories. I want to calculate the number of articles for a category without making an additional request/query, so it would be perfect if I could solve this at the database level.

An item from the 'categories' collection looks like:

{
'_id' : ObjectId("..."),
'feed_id' : 1,
'name': 'Blog posts'
}

An item from the 'articles' collection looks like:

{
'_id' : ObjectId("..."),
'feed_id' : 1,
'title': 'Article title',
'published' : '12/09/2012',
...
}

so the categories and articles are linked using the 'feed_id' field.

I would like to export all categories together with a corresponding number of articles:

{
'_id' : ObjectId("..."),
'feed_id' : 1,
'name': 'Blog posts',
'no_articles': 4
}

I'm not sure how exactly I should do this:

1) Create a 'no_articles' field in the categories collection? If yes, I would like this to be updated automatically when a document is inserted or deleted from the articles collection.

2) Sum up the articles into 'no_articles' when categories are read?

I read something about MapReduce and group, but didn't quite understand if it's possible to use them for this particular task.

Upvotes: 1

Views: 975

Answers (2)

Mason
Mason

Reputation: 8846

Why not just store the category directly in the post document? Since it appears that you're creating new category document for each post that uses the category (as evidenced by a 1-to-many linkage using feed_id) then it might make sense to store an array of categories within the post document.

{
'_id' : ObjectId("..."),
'feed_id' : 1,
'title': 'Article title',
'published' : '12/09/2012',
...
categories : [ 'Blog Posts', 'Category 2' ]
}

Then you can do a

db.articles.find({categories : 'Blog Posts' })

To find all the articles with a certain category and you can add a .count() to get the count

Using those feed_ids to join is anathema to MongoDB. You can't join across collections so you either have to denormalize or put everything in one big collection. Mongo is designed so that you'll denormalize everything.

If this doesn't seem like the right way to solve your problem then you might be better suited to use a RDBMS.

Upvotes: 0

Sergio Tulentsev
Sergio Tulentsev

Reputation: 230521

This is one of use cases where traditional relational databases really shine.

It is impossible to do that with one query in mongodb. The "no_articles field" you mentioned is the way to go. Common name (among Rails people, anyway) for this approach is: Counter Cache Column. I am not very familiar with Mongoose, so I don't know whether it will maintain that field for you or not. MongoDB itself certainly won't do it. But maintaining it yourself isn't a lot of work, you just need to be accurate.

I advise against counting articles when you read categories. This is a classic example of N+1 query problem and counter cache column is there to prevent it.

Upvotes: 1

Related Questions