Mongodb query on reference document

Question

I have two schema defined in mongoose as follows,

Model: Article

var articleSchema = new mongoose.Schema({
    sequenceId: String
    title: String,
    abstract: String,
    authors: [String],
    date: Date,
    words: [{
        selectedWord: String,
        details: { type: Schema.Types.ObjectId, ref: 'Word' }
    }],
    otherTags: [String]
})

Model: Word

var wordSchema = new mongoose.Schema({
    preferredWord: String,
    synonyms: [String],
    category: String
})

Now I am trying to get 2 sets of results for the following scenarios,

Get all the articles that have 'wordAbc' and/or 'wordXyz' in either selectedWord, preferredWord or synonyms
Get all the unique words in selectedWord, preferredWord and synonyms across all the articles in the database

What would be the best/efficient way perform query using mongoose?

For the first result I tried that partial query but getting the CastError message,

Article.find({})
.populate( 'words', null, { 'details': {$in: ['wordAbc', 'wordXyz']}} )
.exec(function(err, docs) {});

Ravi Shankar Bharti · Accepted Answer

I think you can achieve both the things you want using aggregation pipeline.

Get all the articles that have 'wordAbc' and/or 'wordXyz' in either selectedWord, preferredWord or synonyms

First you will need to populate all the words in details field of words array, and then match the articles based on selectedWord, preferredWord or synonyms.

This can be achieved like this:

Article.aggregate([{
    $unwind : {
        path :"$words",
        preserveNullAndEmptyArrays :true
    }
},{
    $lookup : {
        from : "words",
        localField : "words.details",
        foreignField : "_id",
        as : "words.details"
    }
},{
    $unwind : {
        path : "$words.details",
        preserveNullAndEmptyArrays : true
    }
},{
    $match : {
        $or : [{
            "words.selectedWord" : {$in: ['wordAbc', 'wordXyz']}
        },{
            "words.details.preferredWord" : {$in: ['wordAbc', 'wordXyz']}
        },{
            "words.details.synonyms" : {$in: ['wordAbc', 'wordXyz']}
        }]
    }
},{
    $group : {
        _id : "$_id",
        title : {$first : "$title"},
        abstract : {$first : "$abstract"},
        authors : {$first : "$authors"},
        date : {$first : "$date"},
        words: {$push : "$words"},
        otherTags: {$first : "$otherTags"}
    }
}])

Get all the unique words in selectedWord, preferredWord and synonyms across all the articles in the database

In this case you will have to unwind words array and then populate words.details from words collection, and then unwind synonyms array, so that we can create a set of selectedWord, preferredWord and synonyms across all the articles, and then finally make a whole set of all the unique words in the last stage of aggregation pipeline.

This can be achieved like this:

Article.aggregate([{
    $project : {
        words : "$words"
    }
},{
    $unwind : "$words" 
},{
    $lookup : {
        from : "words",
        localField : "words.details",
        foreignField : "_id",
        as : "words.details"
    }
},{
    $unwind : "$words.details"
},{
    $unwind : "$words.details.synonyms"
},{
    $project : {
        selectedWord : "$words.selectedWord",
        preferredWord : "$words.details.preferredWord",
        synonyms : "$words.details.synonyms"
    }
},{
    $group : {
        _id : "null",
        selectedWord : {$addToSet  :"$selectedWord"},
        preferredWord : {$addToSet  :"$preferredWord"},
        synonyms : {$addToSet  :"$synonyms"}
    }
},{
    $project : {
        commonWords : {$setUnion : ["$selectedWord","$preferredWord","$synonys"]}
    }
}])

Explanation of 2nd aggregation.

$project : We want only words, so i carried on words field of all the articles and removed all other unnecessary fields from the pipeline.
$unwind : we need to unwind the words array, so that we can $lookup words.details from words collection in the next stage of pipeline
$lookup : populate details from words collection.
$unwind : Since $lookup returns an array, we need to unwind it to make it an object
$unwind : unwind words.details.synonyms, so that we can group them and create array of unique words in next stage of pipeline, At this stage, individual documents in aggregation pipeline will look something like this:
```
{
    words : {
        selectedWord :"someword",
        details : {
            preferredWord : "otherword",
            synonym : "synonymword"
        }
    }
}
```
$project : we needed this flatten the object structure. After this stage individual document in the pipeline would look like this:
```
{
     selectedWord :"someword",
     preferredWord : "otherword",
     synonym : "synonymword"
}
```
$group : combine all the selectedWord into one array, preferredWord into one array, and synonyms in one array, $addToSet is used to remove duplicate objects
$project : combine all the 3 arrays and create one array of unique Words

For details information on all the used operators of mongoDB, read respective documentations.

$setUnion documentation

$addToSet documentation

$project documentation

$unwind documentation

Documentation of all the operators of mongodb aggregation pipeline

I hope this helps you out

Mongodb query on reference document

Answers (1)

Related Questions