Liyang
Liyang

Reputation: 75

How to join two collection in mongo without lookup

I have two collection, there name are post and comment. The model structure is in the following. I want to use aggregation query post and sort by comments like length sum, currently I can query a post comments like length sum in the following query statement.

My question is how can I query post and join comment collection in Mongo version 2.6. I know after Mongo 3.2 have a lookup function.

I want to query post collection and sort by foreign comments likes length. Is it have a best way to do this in mongo 2.6?

post

{
    "_id": ObjectId("5a39e22c27308912334b4567"),
    "uid": "0",
    "content": "what is hello world mean?",
}

comment

/* 1 */
{
    "_id": ObjectId("5a595d8c2703892c3d8b4567"),
    "uid": "1",
    "post_id": "5a39e22c27308912334b4567",
    "comment": "hello world",
    "like": [
        "2"
    ]
}
/* 2 */
{
    "_id": ObjectId("5a595d8c2703892c3d8b4512"),
    "uid": "2",
    "post_id": "5a39e22c27308912334b4567",
    "comment": "hello stackoverflow",
    "like": [
        "1",
        "2"
    ]
}

Query a post comments like sum

db.getCollection('comment').aggregate([
    {
        "$match": {
            post_id: "5a39e22c27308912334b4567"
        }
    },
    {
        "$project": {
            "likeLength": {
                "$size": "$like"
            },
            "post_id": "$post_id"
        }
    },
    {
        "$group": {
            _id: "$post_id",
            "likeLengthSum": {
                "$sum": "$likeLength"
            }
        }
    }
])

Upvotes: 2

Views: 4280

Answers (2)

turivishal
turivishal

Reputation: 36094

There is no other way to join collections in the current MongoDB v6 without $lookup,

I can predict two reasons that causing you the issues,

  1. The $lookup is slow and expensive - How to improve performance?
    • $lookup optimization:
      • Follow the guideline provided in the documentation
    • Use indexs:
      • You can use the index on the reference collection's fields, as per your sample data you can create an index for post_id field, an index for uid field, or a compound index for both the fields on the basis of your use cases
      • You can read more about How to Improve Performance with Indexes and Document Filters
        db.comment.createIndex({ "post_id": -1 });
        db.comment.createIndex({ "uid": -1 });
        // or
        db.comment.createIndex({ "post_id": -1, "uid": -1 });
        
    • Document Filters:
      • Use the $match, $limit, and $skip stages to restrict the documents that enter the pipeline
      • You can refer to the documentation for more detailed examples
        { $skip: 0 },
        { $limit: 10 } // as per your use case
        
    • Limit the $lookup result:
      • Try to limit the result of lookup by $limit stage,
      • Try to coordinate or balance with improved query and the UI/Use cases

  1. You want to avoid $lookup - How to improve the collection schema to avoid $lookup?
    • Store the analytics/metrics:
      • If you are trying to get the total counts of the comments in a particular post then you must store the total count in the post collection whenever you get post get a new comment
        {
          "_id": ObjectId("5a39e22c27308912334b4567"),
          "uid": "0",
          "content": "what is hello world mean?",
          // new fields
          "total_comments": 10
        }
        
    • Store minimum reference data:
      • If you want to show the comments of a particular post, you can limit the result for ex: show 5 comments per post
      • You can also store a max of 5 latest comments in the post collection to avoid the $lookup, whenever you get the latest comment then add it and just remove the oldest comment from 5 comments
        {
          "_id": ObjectId("5a39e22c27308912334b4567"),
          "uid": "0",
          "content": "what is hello world mean?",
          // new fields
          "total_comments": 10,
          "comments": [
            {
              "_id": ObjectId("5a595d8c2703892c3d8b4567"),
              "uid": "1",
              "comment": "hello world"
            },
            {
              "_id": ObjectId("5a595d8c2703892c3d8b4512"),
              "uid": "2",
              "comment": "hello stackoverflow"
            }
          ]
        }
        
    • Must read about Reduce $lookup Operations
    • Must read about Improve Your Schema

Upvotes: 1

David Makogon
David Makogon

Reputation: 71031

There is no "best" way to query, as it'll really depend on your specific needs, but... you cannot perform a single query across multiple collections (aside from the $lookup aggregation pipeline function in later versions, as you already are aware).

You'll need to make multiple queries: one to your post collection, and one to your comment collection.

If you must perform a single query, then consider storing both types of documents in a single collection (with some identifier property to let you filter on either posts or comments, within your query).

Upvotes: 3

Related Questions