Hide
Hide

Reputation: 3317

MongoDB perform join with aggregate

I have two collections User and Post.

User

Post

From perspective of rdb, User:Post relationship is 1:N.

Each user can write multiple post.

For example, currently documents is inserted like this.

User

> db.user.find()
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28d2"), "user_id" : NumberLong(1), "region" : "US", "is_join" : true }
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28d5"), "user_id" : NumberLong(2), "region" : "KR", "is_join" : true }
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28d8"), "user_id" : NumberLong(3), "region" : "US", "is_join" : true }
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28da"), "user_id" : NumberLong(4), "region" : "KR", "is_join" : false }
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28dc"), "user_id" : NumberLong(5), "region" : "US", "is_join" : true }

Post

> db.post.find()
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28d3"), "post_id" : NumberLong(1), "user_id" : NumberLong(1), "body" : "first", "is_block" : false }
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28d4"), "post_id" : NumberLong(4), "user_id" : NumberLong(1), "body" : "fourth", "is_block" : false }
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28d6"), "post_id" : NumberLong(2), "user_id" : NumberLong(2), "body" : "second", "is_block" : false }
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28d7"), "post_id" : NumberLong(3), "user_id" : NumberLong(2), "body" : "third", "is_block" : false }
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28d9"), "post_id" : NumberLong(5), "user_id" : NumberLong(3), "body" : "fifth", "is_block" : true }
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28db"), "post_id" : NumberLong(6), "user_id" : NumberLong(4), "body" : "sixth", "is_block" : false }
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28dd"), "post_id" : NumberLong(7), "user_id" : NumberLong(5), "body" : "seven", "is_block" : true }
{ "_id" : ObjectId("5d15e41a1b48d9417ebc28de"), "post_id" : NumberLong(8), "user_id" : NumberLong(5), "body" : "eight", "is_block" : false }

To perform join via aggregate, there is more condition that have to apply.

  1. User region='US', is_join=true
  2. Posts created by user with the result of number 1
  3. Post is_block=false
  4. Sort by post_id
  5. (Optional) If user do not wrote any post, except it. I know it can be perform through preserveNullAndEmptyArrays, but I think it cause performance issue.

Desired result

{
    "posts" : [
        {
            "post_id" : NumberLong(1),
            "body" : "first",
            "is_block" : false
        },
        {
            "post_id" : NumberLong(2),
            "body" : "second",
            "is_block" : false
        },
        {
            "post_id" : NumberLong(3),
            "body" : "third",
            "is_block" : false
        },
        {
            "post_id" : NumberLong(4),
            "body" : "fourth",
            "is_block" : false
        },
        {
            "post_id" : NumberLong(8),
            "body" : "eight",
            "is_block" : false
        }
    ]
}

post_id = 5 was excluded by is_block=true

post_id = 6 was excluded by is_join=false

post_id = 7 was excluded by is_block=true

And whole result is sorted by post_id.

I'm new at mongodb So, maybe I thinking too much in the form of a relational database.

And I don't know it can be perform on NoSQL.

Is there any way about it?

Any suggestion, very appreciate.

Thanks.

Upvotes: 0

Views: 611

Answers (2)

Rupesh
Rupesh

Reputation: 890

Try this -

To join two collection, you can use aggregation as -

  User.aggregate([{
    '$match': { 'region':'US', 'is_join': true }  // match from users collection
  }, {
    $lookup: {             // it will aggregate the result from both collection
      from: 'posts',
      localField: '_id',
      foreignField: 'user_id',
      as: 'posts'
    }
  },
   {"$unwind":"$posts"},
   {$match : { "posts.is_block" : false } }, // check inside post collection
   {$sort : { "posts.post_id" : 1}},  // sort the data
 ], (err, users) => {
    if (err) return callback(err, null);

        console.log('users :', users);

  });

Upvotes: 0

Shivam Mishra
Shivam Mishra

Reputation: 1856

You can achieve such result using lookup pipeline operator.

const region = 'US';
const is_join = true;
const is_block = false;
const query = [
    {
        $match: {
            region: region,
            is_join: is_join
        }
    },
    {
        $lookup: {
            let: { user_id: "$user_id" },
            from: 'posts',
            pipeline: [
                {
                    $match: {
                        $expr: {
                            $and: [
                                { $eq: ["$$user_id", "$user_id"], },
                                { $eq: ["$is_block", is_block] }
                            ]

                        }
                    }
                },
                {
                    $sort: {
                        post_id: 1
                    }
                }
            ],
            as: "posts"
        }
    },
    {
        $unwind: "$posts"
    },
    {
        $group:{
            _id: "mygroup",
            posts: { 
                $push: {
                    post_id: "$posts.post_id",
                    body: "$posts.body",
                    is_block: "$posts.is_block",
                }
            }
        }
    },
    {
        $project:{
            _id: false
        }
    }
]
db.users.aggregate(query)

OUTPUT

{
    "posts" : [
        {
            "post_id" : 1,
            "body" : "first",
            "is_block" : false
        },
        {
            "post_id" : 4,
            "body" : "fourth",
            "is_block" : false
        },
        {
            "post_id" : 8,
            "body" : "eight",
            "is_block" : false
        }
    ]
}

Upvotes: 1

Related Questions