Tejas Mahajan

Reputation: 43

Using $unwind and $text in the MongoDB aggregation framework

I have a MongoDB collection called pages. Each document in it has an array of subdocuments called articles, and each article has an article number (articleNo) and article content (articleContent).

What I want to do is unwind the articles and then use $text to search for a word in articleContent. But $text has to be in the first stage of the pipeline.

If I run $text in the first stage without unwinding, then as soon as one article in a document matches, the whole document is returned, including all of its other articles, regardless of whether they contain the text.

Note: The pages collection contains a lot of documents.

Sample collection:

{
   pageNo: 1,
   articles:[{
          articleNo:1,
          articleContent:"cat dog cat dog"
        },{
          articleNo:2,
          articleContent:" Some random text"
        }]
},
{
   pageNo: 2,
   articles:[{
          articleNo:1,
          articleContent:"Some random text"
        },{
          articleNo:2,
          articleContent:"cat dog cat"
        }]
}

Expected output when I search for "cat":

{
   pageNo:1,
    articles:[{
          articleNo:1,
          articleContent:"cat dog cat dog"
        }]
},
{
  pageNo:2,
   articles:[{
          articleNo:2,
          articleContent:"cat dog cat" 
        }]
}

Upvotes: 2

Views: 1652

Answers (1)

TomG

Reputation: 2539

The pipeline below returns your desired results. The first $match only filters out documents that do not contain "cat" at all, using the text index. If you omit this stage, the results are still correct, but the query may be slower.

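db.pages.aggregate([
     // Use the text index to drop pages that do not mention "cat" at all
     {
         $match: {
             $text: {
                 $search: "cat"
             }
         }
     },
     // Emit one document per article
     {
         $unwind: '$articles'
     },
     // Keep only the articles whose content contains "cat"
     {
         $match: {
             'articles.articleContent': /cat/
         }
     },
     // Reassemble the matching articles per page
     {
         $group: {
             _id: {
                 _id: '$_id',
                 pageNo: '$pageNo'
             },
             articles: {
                 $push: '$articles'
             }
         }
     },
     // Flatten the compound _id back into the original shape
     {
         $project: {
             _id: '$_id._id',
             pageNo: '$_id.pageNo',
             articles: 1
         }
     }
])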
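
As a possible alternative, here is a minimal sketch that keeps the $text stage first but replaces $unwind, the second $match, and $group with a single $filter over the array. It assumes MongoDB 4.2 or later (for $regexMatch), a text index on articles.articleContent, and a single search word with no regex special characters. The helper name buildPipeline is made up for illustration; it is not part of any MongoDB API.

```javascript
// Sketch only: assumes MongoDB 4.2+ and a text index on articles.articleContent.
// buildPipeline is a hypothetical helper name used for this example.
function buildPipeline(word) {
  return [
    // Stage 1: $text must come first; it drops pages that never mention the word.
    { $match: { $text: { $search: word } } },
    // Stage 2: keep only the matching articles, without $unwind/$group.
    {
      $project: {
        pageNo: 1,
        articles: {
          $filter: {
            input: "$articles",
            as: "article",
            cond: {
              $regexMatch: {
                input: "$$article.articleContent",
                regex: word,   // used as a regex pattern, so escape special characters
                options: "i"   // case-insensitive, similar to the $text default
              }
            }
          }
        }
      }
    }
  ];
}

const pipeline = buildPipeline("cat");

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.pages.createIndex({ "articles.articleContent": "text" });
  printjson(db.pages.aggregate(pipeline).toArray());
}
```

Because nothing is unwound, each page stays a single document through the whole pipeline, which may matter when the collection is large. Note that a regex matches substrings (for example "category"), while $text matches stemmed whole words, so the two stages can disagree on edge cases.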

Upvotes: 1
