lostintranslation

Reputation: 24583

Mongodb aggregate with 'join'

I have a 'reports' collection in mongodb. Each report holds the ObjectId of the user that the report belongs to. I am currently doing an aggregate so that I can group the reports by user.

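db.reports.aggregate([
    // Group reports by the user ObjectId, collecting each report's properties
    {
        $group: {
            _id: "$user",
            stuff: {
                $push: {
                    things: {
                        properties: "$properties"
                    }
                }
            }
        }
    },
    // Expose the grouping key as 'user' instead of '_id'
    {
        $project: {
            _id: 0,
            user: "$_id",
            stuff: "$stuff"
        }
    }
])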

This gives me an array of user ids and the report 'stuff'. Instead of just the userId, I am wondering if I can form the aggregate so that it hits the users collection and returns more information about the user.

Is that possible? I am using mongoose as an ORM, but looking at the mongoose docs, aggregate looks to be a straight pass through to mongodb's aggregate function. Don't think I can take advantage of mongoose's populate as aggregate is not dealing with a schema, but I could be wrong.

Lastly, reports are computer generated, and each user could have millions. This prevents me from storing the reports in an array in the users collection.

Upvotes: 0

Views: 4535

Answers (2)

Discipol

Reputation: 3157

You could combine the two collections into one and do the aggregation there. Top-level documents in a collection don't need to share the same structure. It wouldn't be too difficult to mark each document as type A or B and adapt your queries involving type A documents to filter on that type.

You might not like how it looks, but it would set you up for the "join".
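
As a rough sketch of the type-marker idea, the existing pipeline could gain a leading $match stage. The field name `type` and the value "report" are assumptions made for illustration, not something from the question:

```javascript
// Builds the question's pipeline for a combined collection.
// ASSUMPTION: every document carries a hypothetical `type` field,
// with "report" marking report documents.
function reportsByUserPipeline() {
  return [
    // Keep only the report documents before grouping
    { $match: { type: "report" } },
    // Same grouping as in the question
    {
      $group: {
        _id: "$user",
        stuff: { $push: { things: { properties: "$properties" } } },
      },
    },
    { $project: { _id: 0, user: "$_id", stuff: "$stuff" } },
  ];
}
```

In the shell this would be passed as `db.docs.aggregate(reportsByUserPipeline())`, where `docs` is a made-up name for the combined collection.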

Upvotes: 0

WiredPrairie

Reputation: 59793

No, you can't do that as there are no joins in MongoDB (not in normal queries, MapReduce, or the Aggregation Framework).

Only one collection can be accessed at a time from the aggregation functions.

Mongoose won't directly help, or necessarily make the query for additional user information any more efficient than doing an $in on a large batch of Users at a time (an array of userIds). See the $in operator documentation.

There really aren't workarounds for this, as the lack of joins is currently an intentional design decision in MongoDB (i.e., it's how it works). Two paths you might consider:

  1. You may find that another database platform would be better suited to the types of queries that you're trying to run.
  2. You might try using $in as suggested above after the aggregation results are returned to your client code (it's one of the ways Mongoose handles fetching related documents). Just grab the userIds, and in batches request the associated User documents. While it's a bit more network traffic, depending on how you want to display the results, you may consider it an optimization to only fetch extra User information as it's shown to a user (if that's possible), by incrementally loading the extra data.
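
The second path could look roughly like the sketch below. It assumes the aggregation output has the `user` and `stuff` fields from the question and that `User` is a Mongoose model for the users collection; the helper names and the batch size of 1000 are made up for illustration:

```javascript
// Split the user ids from the aggregation output into batches for $in.
function batchUserIds(results, batchSize) {
  const ids = results.map((r) => r.user);
  const batches = [];
  for (let i = 0; i < ids.length; i += batchSize) {
    batches.push(ids.slice(i, i + batchSize));
  }
  return batches;
}

// Replace each result's user id with the matching user document.
// Results whose user was not fetched keep the bare id.
function mergeUsers(results, users) {
  const byId = new Map(users.map((u) => [String(u._id), u]));
  return results.map((r) => ({
    user: byId.get(String(r.user)) || r.user,
    stuff: r.stuff,
  }));
}

// Hypothetical glue code. ASSUMPTION: `User` is a Mongoose model
// for the users collection; batchSize is an arbitrary default.
async function attachUsers(results, User, batchSize = 1000) {
  let users = [];
  for (const batch of batchUserIds(results, batchSize)) {
    // One $in query per batch of user ids
    const found = await User.find({ _id: { $in: batch } }).lean();
    users = users.concat(found);
  }
  return mergeUsers(results, users);
}
```

To load the extra data incrementally, `attachUsers` could be called only on the slice of results currently being displayed instead of on the whole array.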

Upvotes: 2
