I created a collection in MongoDB as shown below:
db.articles.insert([
{ _id: 1, subject: "one", author: "abc", views: 50 },
{ _id: 2, subject: "lastone", author: "abc", views: 5 },
{ _id: 3, subject: "firstone", author: "abc", views: 90 },
{ _id: 4, subject: "everyone", author: "abc", views: 100 },
{ _id: 5, subject: "allone", author: "efg", views: 100 },
{ _id: 6, subject: "noone", author: "efg", views: 100 },
{ _id: 7, subject: "nothing", author: "abc", views: 100 }])
After that I created a text index on the fields subject and author:
db.articles.createIndex(
{subject: "text",
author: "text"})
Now I am trying to search for the word "one" in the indexed fields. When I execute the query ...
db.articles.count({$text: {$search: "\"one\""}})
... the result is 1.
The problem is that when I search for the combination of the words "one" and "abc" ...
db.articles.count({$text: {$search: "\"one\" \"abc\""}})
... it gives 4 as the result, including the records whose subject is "lastone", "firstone", "everyone" and "one".
So my question is: why doesn't the first query fetch 4 records? And how can I write a query that fetches the 4 records using the word "one"?
Upvotes: 4
Views: 1651
Reputation: 47905
This command ...
db.articles.count({$text: {$search: "\"one\""}})
... will count the documents having the exact phrase "one". There is only one such document, hence the result is 1.
Querying with the value "one" should only return one document, since there is only one document containing either "one" or some value for which "one" is a stem. From the docs:
For case insensitive and diacritic insensitive text searches, the $text operator matches on the complete stemmed word. So if a document field contains the word blueberry, a search on the term blue will not match. However, blueberry or blueberries will match.
Looking at the documents in your question ...
one is not a stem of everyone
one is not a stem of lastone
one is not a stem of allone
one is not a stem of firstone
one is not a stem of noone
... so none of these documents will be matched for the value one.
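To make the whole-word point concrete, here is a minimal sketch in plain JavaScript (runnable in Node, no MongoDB needed). It is only a toy model: the helpers `words`, `countWholeWord` and `countSubstring` are hypothetical names, and the model ignores stemming and stop words, which a real text index applies.

```javascript
// Sample documents copied from the question.
const articles = [
  { _id: 1, subject: "one", author: "abc", views: 50 },
  { _id: 2, subject: "lastone", author: "abc", views: 5 },
  { _id: 3, subject: "firstone", author: "abc", views: 90 },
  { _id: 4, subject: "everyone", author: "abc", views: 100 },
  { _id: 5, subject: "allone", author: "efg", views: 100 },
  { _id: 6, subject: "noone", author: "efg", views: 100 },
  { _id: 7, subject: "nothing", author: "abc", views: 100 },
];

// Hypothetical helper: split the two indexed fields into lowercase words.
// Assumption: no stemming or stop-word handling, unlike a real text index.
function words(doc) {
  return `${doc.subject} ${doc.author}`
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter(Boolean);
}

// Count documents that contain `term` as a complete word.
function countWholeWord(docs, term) {
  return docs.filter((doc) => words(doc).includes(term.toLowerCase())).length;
}

// Count documents that contain `term` anywhere inside a field.
function countSubstring(docs, term) {
  const needle = term.toLowerCase();
  return docs.filter((doc) =>
    `${doc.subject} ${doc.author}`.toLowerCase().includes(needle)
  ).length;
}

console.log(countWholeWord(articles, "one")); // 1
console.log(countSubstring(articles, "one")); // 6
```

Whole-word comparison finds only the document whose subject is exactly "one", while a substring comparison also picks up "lastone", "firstone", "everyone", "allone" and "noone".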
You can, of course, query with multiple values. For example:
The docs suggest that this should be evaluated as one or abc, and it correctly returns 5:
db.articles.count({$text: {$search: "one abc"}})
The docs suggest that this should be evaluated as "abc" AND ("abc" or "one"), and it correctly returns 5:
db.articles.count({$text: {$search: "\"abc\" one"}})
The docs suggest that this should be evaluated as "one" AND ("one" or "abc"), but it somehow returns 4:
db.articles.count({$text: {$search: "\"one\" abc"}})
In the last example MongoDB includes the documents whose subject is "one", "lastone", "firstone" or "everyone" but excludes the document with subject "nothing". This suggests that it has somehow deemed "one" to be a stem of "lastone", "firstone" and "everyone", yet count({$text: {$search: "one"}}) returns 1, which clearly indicates that one is not seen as a stem of "lastone", "firstone" and "everyone".
I suspect this might be a bug and might be worth raising with MongoDB.
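The reading of the docs used above (quoted phrases are ANDed, unquoted terms are ORed) can be sketched as a toy model in plain JavaScript. This is not how MongoDB is implemented; `words` and `countMatches` are hypothetical helpers, and the model assumes whole-word comparison with no stemming.

```javascript
// Sample documents copied from the question.
const articles = [
  { _id: 1, subject: "one", author: "abc", views: 50 },
  { _id: 2, subject: "lastone", author: "abc", views: 5 },
  { _id: 3, subject: "firstone", author: "abc", views: 90 },
  { _id: 4, subject: "everyone", author: "abc", views: 100 },
  { _id: 5, subject: "allone", author: "efg", views: 100 },
  { _id: 6, subject: "noone", author: "efg", views: 100 },
  { _id: 7, subject: "nothing", author: "abc", views: 100 },
];

// Hypothetical helper: lowercase words of the two indexed fields.
function words(doc) {
  return `${doc.subject} ${doc.author}`
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter(Boolean);
}

// Toy evaluation: every phrase must be present (AND) and at least one
// term must be present (OR). Assumption: whole-word comparison only.
function countMatches(docs, phrases, terms) {
  return docs.filter((doc) => {
    const w = words(doc);
    const allPhrases = phrases.every((p) => w.includes(p));
    const anyTerm = terms.length === 0 || terms.some((t) => w.includes(t));
    return allPhrases && anyTerm;
  }).length;
}

console.log(countMatches(articles, [], ["one", "abc"]));      // "one abc"     -> 5
console.log(countMatches(articles, ["abc"], ["abc", "one"])); // "\"abc\" one" -> 5
console.log(countMatches(articles, ["one"], ["one", "abc"])); // "\"one\" abc" -> 1
```

Under this model the first two searches agree with the counts MongoDB returned, but the third would give 1 rather than the observed 4, which is the discrepancy described above.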
FWIW, it's possible that what you actually want is a partial string search, in which case $regex might work. The following query ...
db.articles.count({ subject: { $regex: /one$/ }, author: { $regex: /abc$/ } })
... means something like count where subject like '%one' and author like '%abc' (the trailing $ anchors each pattern to the end of the value), and for your documents it returns 4, i.e. the documents where subject is one of "one", "lastone", "firstone", "allone", "everyone", "noone" and author is "abc".
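The effect of those two patterns can be checked without a database. The sketch below applies JavaScript regular expressions to the sample documents; `countByRegex` is a hypothetical helper, and it assumes that for patterns this simple JavaScript's regex engine behaves like the one MongoDB uses for $regex.

```javascript
// Sample documents copied from the question.
const articles = [
  { _id: 1, subject: "one", author: "abc", views: 50 },
  { _id: 2, subject: "lastone", author: "abc", views: 5 },
  { _id: 3, subject: "firstone", author: "abc", views: 90 },
  { _id: 4, subject: "everyone", author: "abc", views: 100 },
  { _id: 5, subject: "allone", author: "efg", views: 100 },
  { _id: 6, subject: "noone", author: "efg", views: 100 },
  { _id: 7, subject: "nothing", author: "abc", views: 100 },
];

// Plain-JavaScript stand-in for the filter: both patterns must match.
function countByRegex(docs, subjectPattern, authorPattern) {
  return docs.filter(
    (doc) => subjectPattern.test(doc.subject) && authorPattern.test(doc.author)
  ).length;
}

console.log(countByRegex(articles, /one$/, /abc$/)); // 4: subject ends in "one", author ends in "abc"
console.log(countByRegex(articles, /one$/, /./));    // 6: subject ends in "one", any author
```

Dropping the $ anchors (for example /one/) would turn the patterns into "contains" matches; for this data set that still gives 4 when combined with the author filter.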