Reputation: 791
I have data across three collections and need to produce a data set which aggregates data from these collections, and filters by a date range.
The collections are:
db.games
{
_id : ObjectId,
startTime : MongoDateTime
}
db.entries
{
player_id : ObjectId, // refers to db.players['_id']
game_id : ObjectId // refers to db.games['_id']
}
db.players
{
_id : ObjectId,
screen_name,
email
}
I want to return a collection which is number of entries by player for games within a specified range. Where the output should look like:
output
{
player_id,
screen_name,
email,
sum_entries
}
I think I need to start by creating a collection of games within the date range, combined with all the entries and then aggregate over count of entries, and finally output collection with the player data, it's seems a lot of steps and I'm not sure how to go about this.
Upvotes: 0
Views: 157
Reputation: 69683
The reason why you have these problems is because you try to use MongoDB like a relational database, not like a document-oriented database. Normalizing your data over many collections is often counter-productive, because MongoDB can not perform any JOIN-operations. MongoDB works much better when you have nested documents which embed other objects in arrays instead of referencing them. A better way to organize that data in MongoDB would be to either have each game
have an array of player
s which took part in it or to have an array in each player
with the game
s they took part in. It's also not necessarily a mistake to have some redundant additional data in these arrays, like the names and not just the ID's.
But now you have the problem, so let's see how we can deal with it.
As I said, MongoDB doesn't do JOINs. There is no way to access data from more than one collection at a time.
One thing you can do is solving the problem programmatically. Create a program which fetches all players
, then all entries
for each player, and then the games
referenced by the entries where startTime
matches.
Another thing you could try is MapReduce. MapReduce can be used to append results to another collection. You could try to use one MapReduce job for each of the relevant collections into one and then query the resulting collection.
Upvotes: 1