Keeno
Keeno

Reputation: 1686

MongoDB count number of times an item is referenced in another table

Take the following models:

A Question (like, a stack question!)

Question {
    _id : ObjectId(123),
    Question : "MongoDB: Count number of referenced items",
    Tags : [ObjectId(1), ObjectId(2), ObjectId(3)] 
}

A table of tags

Tag{
    _id : ObjectId(1),
    tag : "MongoDB",
    description : "stuff"
},
{
 .... more tags
}

I would like to list all the tags, and next to each tag show a count of how many times that tag is used. You can then click on the tag to edit it, like so:

MongoDB (1232)
C# (23232)

Whats the most robust way to do this. I have investigated the following.

  1. Aggregation framework. This enables me to get a count of how many times the ObjectId appeared in the Question table, but I have no easy way of getting the name of the tag. I guess I would have to do another database call, get all the tags then map them together.

  2. Have a COUNT on the tags table, and every time I submit a question check to see if a tag has been added or removed then increment or decrement this counter. I cant seem to get my brain to accept this is a good way to do this.

[Edit - see comments]

  1. Store tag names in the question. Use tag name is Id. This would allow #1 to work a better. Tag names May change however, so i guess this has knock-on effects.

Which of these would people recommend (and why). Are there any strategies I have missed?

I am using the mongo driver for C#

Upvotes: 2

Views: 340

Answers (2)

Matt27
Matt27

Reputation: 325

You could make 2 queries to mongo to get C# lists of each collection.

List<Question> questions = YourGetQuestionListFunction();
List<Tag> tags = YourGetTagListFunction();

Then make an in memory linq query on these C# collections and return a custom object with only the properties you want (including a question count). e.g.

var result = from t in tags
select new {
TagId = t._id,
TagName = t.tag,
Description = t.description,
QuestionCount = questions.Where(x => x.Tags != null && x.Tags.Contains(t.Id)).Count()
}

Upvotes: 1

Joachim Isaksson
Joachim Isaksson

Reputation: 180917

The most efficient approach I can think of given this limited view of the system is to save the tag name in the collection instead of the ObjectId. That would have the advantage that your group by/count would be one single aggregate operation, and I can't see why ObjectId would be a better identifier of a tag than the name would be (assuming the name is unique)

The downside is that the rename of a tag would be a two part operation instead of a one part. You'd have to first rename it in the tags table, then rename it in the questions table in all Tags arrays using something like;

db.questions.update({'Tags':"Old Name"},{$set:{'Tags.$':'New Name'}})

Given that a count probably is a more frequent operation, I'd say that doing that operation in a single call to MongoDB while increasing the rename to two separate operations should be a definite net gain.

Upvotes: 2

Related Questions