shamon shamsudeen
shamon shamsudeen

Reputation: 5848

Mongodb lookup like search: local field as array of objects

I have two collections userProfile and skills,

Eg:userProfile

{
    "_id": "5f72c6d4e23732390c96b031",
    "name":"name"
    "other_skills": [
        "1","2"
    ],
    "primary_skills": [
        {
            "_id": "607ffd1549e13876fef7f2c5",
            "years": 4.5,
            "skill_id": "1"
        },
        {
            "_id": "607ffd1549e13876fef7f2c6",
            "years": 2,
            "skill_id": "2"
        },
        {
            "_id": "607ffd1549e13876fef7f2c7",
            "years": 1,
            "skill_id": "3"
        }
    ]
}

Eg:Skills

{
    "_id":1,
    "name": "Ruby on Rails",
}
{
    "_id":2,
    "name": "PHP",
}
{
    "_id":3,
    "name": "php",
}

I want to retrieve the userprofile based on the skills

eg: input of skill php i want to retrieve the userprofiles that matches either in primary_skills or other_skills

But I got confused about the implementation, I think it can do with pipeline in lookup and the elemMatch. This is the query I tried so far

const skills = ['php','PHP']    
userProfile.aggrigate([{
        $lookup:{
            from:'skills',
            let:{'primary_skills':'$primary_skills'},
            pipeline:[
                {
                    $match:{
                        primary_skills:{
                            $elemMatch:{
                               name:'' //not sure how to write match
                            }
                        }
                    }
                }
            ]
        }
    }])

Can somebody help me with this, Thanks in advance

Upvotes: 1

Views: 749

Answers (1)

Tom Slabbaert
Tom Slabbaert

Reputation: 22276

I'll first show you how to correct your pipeline to work, however this approach is very inefficient as you will have to $lookup on every single user in your db which is obviously a lot of overhead.

Here is how to properly match your condition:

const skills = ['php','PHP']    
db.userProfile.aggregate([
  {
    $lookup: {
      from: "skills",
      let: {
        "primary_skills": {
          $map: {
            input: "$primary_skills",
            as: "skill",
            in: "$$skill.skill_id"
          }
        },
        "other_skills": "$other_skills"
      },
      pipeline: [
        {
          $match: {
            $expr: {
              "$in": [
                "$_id",
                {
                  "$concatArrays": [
                    "$$other_skills",
                    "$$primary_skills"
                  ]
                }
              ]
            }
          }
        }
      ],
      as: "skills"
    }
  },
  {
     $match: {
        'skills.name': {$in: skills}
     }
  } 
])

Mongo Playground

As I've said I recommend you do not do this. what I suggest you do is split it into 2 calls, first fetch the relevant skill ids. and then query on users.

By doing this you can also utilize indexes for much faster queries, like so:

const skills = ['php', 'PHP'];
const matchedSkillIds = await skills.distinct('_id', {name: {$in: skills}});
const users = await userProfile.find({
    $or: [
        {
            'primary_skills.skill_id': {$in: matchedSkillIds}
        },
        {
            'other_skills': {$in: matchedSkillIds}
        }
    ]
})

Finally if you do insist on doing it in one query at the very least start the pipeline from the skill collection.

Upvotes: 1

Related Questions