Reputation: 831
Can someone explain what I am doing wrong? I am using MongoDB version 3.2.6. In the example below I create two similar documents in the 'users' collection. Then I create a compound text index for the $text operator and search for the text 'John':
> db.users.insert({name: 'John Smith', email: '[email protected]'})
WriteResult({ "nInserted" : 1 })
> db.users.insert({name: 'Some Man', email: '[email protected]'})
WriteResult({ "nInserted" : 1 })
> db.users.createIndex({name: 'text', email: 'text'})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
> db.users.find({$text:{$search: 'John'}})
{ "_id" : ObjectId("57fe313f4dfa1e8339b08174"), "name" : "John Smith", "email" : "[email protected]" }
As you can see, this works fine, but if I try to find a document using the word 'Some', it doesn't work (empty result):
> db.users.find({$text:{$search: 'Some'}})
>
If I search for this document by the other word, 'Man', it works, and if I change 'Some' to, for example, 'Somer', it works too. What is so special about 'Some'? Maybe it's a reserved word or something. Thank you for your help.
Upvotes: 1
Views: 349
Reputation: 1487
In MongoDB, the default language for text-indexed data is English.
Hence the index ignores English stop words, and 'some' is one of them. To stop ignoring stop words and include them in searches, specify "none" as the default language when creating the index.
If you specify a language value of "none", the text search uses simple tokenization with no list of stop words and no stemming. Example:
db.users.createIndex(
{name: 'text', email: 'text'},
{ default_language: "none" }
)
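One caveat when applying this to the collection from the question: a collection can have only one text index, so the English index created earlier has to be dropped before the new one can be built. Below is a minimal sketch of that sequence for the mongo shell. The helper name rebuildTextIndex is made up for illustration, and the index name "name_text_email_text" assumes the index was created without a custom name (check db.users.getIndexes() if unsure).

```javascript
// Hypothetical helper: pass in the shell's `db` object, e.g. rebuildTextIndex(db).
function rebuildTextIndex(db) {
  // Drop the existing English text index first; MongoDB allows only one
  // text index per collection. The name below is the default generated
  // for the key pattern {name: 'text', email: 'text'} (an assumption here).
  db.users.dropIndex("name_text_email_text");

  // Recreate it with no stop-word list and no stemming.
  return db.users.createIndex(
    { name: "text", email: "text" },
    { default_language: "none" }
  );
}
```

After this runs, a $text search for 'Some' is matched against the raw tokens instead of being discarded as a stop word.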
Upvotes: 0
Reputation: 5918
As Erik mentioned, 'Some' is being interpreted as a stop word for the English language, which is the default language for text indexes unless specified otherwise.
If you want a workaround for your particular scenario, you can simply change the default language when defining your index, by setting it to none:
db.users.createIndex(
{name: 'text', email: 'text'},
{default_language: 'none'}
);
With this index, the field content is tokenized without stop-word removal, so the keyword you provide is compared against every word in the fields and all matching records are returned:
> db.users.find({ $text: { $search: "Some" }});
{ "_id" : ObjectId("57fe3e21a134e614a7178c1c"), "name" : "Some Man", "email" : "[email protected]" }
Upvotes: 1
Reputation: 3048
I think the problem is that 'Some' is considered a stop word and is therefore discarded in the search. You get the same behavior for 'and' or 'his'.
Insert for example this user:
db.users.insert({name: 'Tom and his little brother', email: '[email protected]'})
This is what you get when querying:
> db.users.find({$text:{$search: 'and'}})
> db.users.find({$text:{$search: 'his'}})
> db.users.find({$text:{$search: 'little'}})
{ "_id" : ObjectId("57fe39ed8aaf457673d4354d"), "name" : "Tom and his little brother", "email" : "[email protected]" }
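To make the behavior above concrete, here is a toy sketch in plain JavaScript of what stop-word filtering does to a string before it is indexed. This is only an illustration: the three-word STOP_WORDS set and the indexedTerms function are made up for this example, and MongoDB's actual English stop-word list, tokenizer, and stemming are more involved.

```javascript
// Tiny sample stop-word list; only the words mentioned in this thread.
const STOP_WORDS = new Set(["some", "and", "his"]);

// Returns the terms that would survive into a (toy) text index.
// language "none" keeps every token; anything else drops stop words.
function indexedTerms(text, language) {
  const tokens = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  return language === "none"
    ? tokens
    : tokens.filter((t) => !STOP_WORDS.has(t));
}

console.log(indexedTerms("Some Man", "english"));
// → ["man"]
console.log(indexedTerms("Some Man", "none"));
// → ["some", "man"]
console.log(indexedTerms("Tom and his little brother", "english"));
// → ["tom", "little", "brother"]
```

Since 'some' never makes it into the English index, a search for it has nothing to match, while 'man' and 'little' are found as usual.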
Upvotes: 0