Taxellool

Reputation: 4322

mongodb aggregate $lookup vs find and populate

I have a Video Schema like this:

const VideoSchema = new mongoose.Schema({
  caption: {
    type: String,
    trim: true,
    maxlength: 512,
    required: true,
  },
  owner: {
    type: mongoose.Schema.ObjectId,
    ref: 'User',
    required: true,
  },
  // some more fields
  comments: [{
    type: mongoose.Schema.ObjectId,
    ref: 'Comment',
  }],
  commentsCount: {
    type: Number,
    required: true,
    default: 0,
  },
}, { timestamps: true });

and a simple Comment schema like this:

const CommentSchema = new mongoose.Schema({
  text: {
    type: String,
    required: true,
    maxLength: 512,
  },
  owner: {
    type: mongoose.Schema.ObjectId,
    ref: 'User',
    required: true,
  },
  videoId: {
    type: mongoose.Schema.ObjectId,
    ref: 'Video',
    required: true,
    index: true,
  },
}, { timestamps: true });

With these schemas, I can run any kind of find query on my Video collection and populate the results with their comments:

Video.find({ owner: someUserId }).populate({ path: 'comments' });

My question is: how necessary is it to keep the comment ids inside the video collection? Given that I have indexed the videoId field in my Comment schema, how bad would it be (performance-wise) to get rid of these comment ids and their count, and instead use an aggregation $lookup to find a video's comments, like this:

Video.aggregate([
  {
    $match: {
      owner: someUserId,
    },
  },
  {
    $lookup: {
      from: 'comments',
      localField: '_id',
      foreignField: 'videoId',
      as: 'comments',
    }
  }
])

How different are these in terms of performance?

Upvotes: 2

Views: 4584

Answers (1)

Akrion

Reputation: 18525

Well, there is no way the $lookup would be faster than reading the list of comment ids straight off the video document. With the lookup, MongoDB has to go to the comments collection as well, so performance-wise it obviously adds time. That is assuming you are not using mongoose populate to "convert" those comment ids into the referenced documents.

If you are, then removing the comments array (as well as the commentsCount property) from the video and doing the lookup is the way to go, since populate already has to run an extra query to fetch the comments anyway. Because you $match right away in your aggregation and then do a simple lookup, I do not see how this would be a bottleneck for you. You can also inspect and tune your aggregation via explain.
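As a minimal sketch of that approach, the pipeline from the question can be wrapped in a small builder that also derives the count on the fly instead of storing it. The function name buildVideoCommentsPipeline and the trailing $addFields stage are assumptions added for illustration; they are not part of the original question.

```javascript
// Hypothetical helper: the question's pipeline plus a derived comment count.
function buildVideoCommentsPipeline(ownerId) {
  return [
    // Filter first so the $lookup only runs for this owner's videos.
    { $match: { owner: ownerId } },
    {
      $lookup: {
        from: 'comments',
        localField: '_id',
        foreignField: 'videoId', // indexed in CommentSchema
        as: 'comments',
      },
    },
    // Derive the count rather than keeping commentsCount on the video.
    { $addFields: { commentsCount: { $size: '$comments' } } },
  ];
}
```

It would be used as Video.aggregate(buildVideoCommentsPipeline(someUserId)), and chaining .explain() onto that call should show whether the videoId index is actually used. Note that Mongoose does not cast values inside aggregation pipelines, so someUserId needs to already be an ObjectId rather than a string.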

Your video schema would be pretty clean that way:

const VideoSchema = new mongoose.Schema({
  caption: {
    type: String,
    trim: true,
    maxlength: 512,
    required: true,
  },
  owner: {
    type: mongoose.Schema.ObjectId,
    ref: 'User',
    required: true,
  },
  // some more fields
}, { timestamps: true });
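With the trimmed schema, another option (not the one recommended above, just a sketch for comparison) is to fetch the comments with a second query on the indexed videoId field and attach them in application code. The helper name attachComments is hypothetical, and it assumes plain objects such as those returned by .lean().

```javascript
// Hypothetical helper: group comments by videoId and attach them to videos.
function attachComments(videos, comments) {
  const byVideo = new Map();
  for (const comment of comments) {
    // String() lets ObjectIds and plain strings be compared the same way.
    const key = String(comment.videoId);
    if (!byVideo.has(key)) byVideo.set(key, []);
    byVideo.get(key).push(comment);
  }
  return videos.map((video) => {
    const list = byVideo.get(String(video._id)) || [];
    return { ...video, comments: list, commentsCount: list.length };
  });
}

// Assumed usage, inside an async function:
//   const videos = await Video.find({ owner: someUserId }).lean();
//   const ids = videos.map((v) => v._id);
//   const comments = await Comment.find({ videoId: { $in: ids } }).lean();
//   const result = attachComments(videos, comments);
```

This costs two round trips instead of one, so it is mainly useful as a baseline to compare against the $lookup version when profiling.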

Upvotes: 3
