Reputation: 4622
In MongoDB "db.foo.find()" syntax, how can I tell it to match all letters and their accented versions?
For example, if I have a list of names in my database:
João
François
Jesús
How would I allow a search for the strings "Joao", "Francois", or "Jesus" to match the given name?
I am hoping that I don't have to do a search like this every time:
db.names.find({name : /Fr[aã...][nñ][cç][all accented o characters][all accented i characters]s/ })
Upvotes: 18
Views: 16872
Reputation: 5088
As of MongoDB 3.2, you can use the $text operator (which requires a text index on the searched field) and set $diacriticSensitive to false:
{
$text:
{
$search: <string>,
$language: <string>,
$caseSensitive: <boolean>,
$diacriticSensitive: <boolean>
}
}
See more in the Mongo docs: https://docs.mongodb.com/manual/reference/operator/query/text/
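For the names example in the question, the filter might be built roughly like this. It is only a sketch: the `names` collection and `name` field come from the question, while `nameTextIndex` and `buildNameSearch` are hypothetical names made up for illustration.

```javascript
// Text index specification on the `name` field (field name taken from the question).
const nameTextIndex = { name: "text" };

// Hypothetical helper: builds a case- and diacritic-insensitive $text filter.
function buildNameSearch(term) {
  return {
    $text: {
      $search: term,
      $caseSensitive: false,
      $diacriticSensitive: false,
    },
  };
}

// Assumed usage in the mongo shell:
//   db.names.createIndex(nameTextIndex)
//   db.names.find(buildNameSearch("Francois"))
```

Keep in mind that $text matches whole terms rather than substrings, so this suits "find the name Francois" better than "find names containing Franc".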
Upvotes: 24
Reputation: 46291
I suggest you add an indexed field, such as NameSearchable, that holds a simplified version of each string (for example, with accents removed and casing normalized).
The same mapping that is used when inserting new items in the database can be used when searching. The original string with correct casing and accents will be preserved.
Most importantly, the query can make use of indexing. Case-insensitive queries and regex queries cannot use indexes (with the exception of rooted regexes) and will grow prohibitively slow on large collections.
Oh, and since the simplified strings can be created from the original strings, it's not a problem to add this to existing collections.
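A minimal sketch of such a mapping, assuming that "simplified" means accents stripped and everything lowercased (the helper name `toSearchable` is hypothetical; the `names` collection and `name` field are taken from the question):

```javascript
// Hypothetical helper: maps a name to its simplified, searchable form.
function toSearchable(name) {
  return name
    .normalize("NFD")                 // split letters from their combining accents
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining diacritical marks
    .toLowerCase();                   // normalize casing
}

// Assumed usage in the mongo shell:
//   db.names.insertOne({ name: "François", NameSearchable: toSearchable("François") })
//   db.names.createIndex({ NameSearchable: 1 })
//   db.names.find({ NameSearchable: toSearchable("Francois") })
```

Letters that have no decomposed form (ø or ß, for instance) pass through unchanged, so they would need an explicit replacement table if they occur in your data.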
Upvotes: 11
Reputation: 1813
This blog post uses the approach you were trying: http://tech.rgou.net/en/php/pesquisas-nao-sensiveis-ao-caso-e-acento-no-mongodb-e-php/
As far as I know, this is the only solution for the latest MongoDB version.
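The character-class regex from the question does not have to be typed by hand; it can be generated from the plain search string. The following is only a sketch and not the blog's code: the `VARIANTS` table is a deliberately incomplete assumption, and `accentInsensitiveRegex` is a hypothetical helper name.

```javascript
// Assumed (incomplete) table of base letters and their accented variants.
const VARIANTS = {
  a: "aàáâãäå",
  c: "cç",
  e: "eèéêë",
  i: "iìíîï",
  n: "nñ",
  o: "oòóôõö",
  u: "uùúûü",
};

// Hypothetical helper: turns "Francois" into /^fr[aàáâãäå][nñ][cç][oòóôõö][iìíîï]s$/i
function accentInsensitiveRegex(term) {
  const pattern = term
    .toLowerCase()
    .split("")
    .map((ch) => {
      const variants = VARIANTS[ch];
      if (variants) return "[" + variants + "]";
      // Escape regex metacharacters in everything else.
      return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    })
    .join("");
  return new RegExp("^" + pattern + "$", "i");
}

// Assumed usage in the mongo shell:
//   db.names.find({ name: accentInsensitiveRegex("Francois") })
```

Because the resulting regex is case-insensitive, it generally cannot use an index efficiently, so this is best kept to small collections.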
Upvotes: 3
Reputation: 43235
This looks more like fuzzy matching, which MongoDB does not currently support. What you can try is:
1. Store variations of the name in a separate array field for each entry. The query can then check whether the search term exists in the variations array.
or
2. Store a Soundex string for each of the names in the same collection. Then compute the Soundex string for your search term and query the database with it; you will get results whose Soundex code matches that of your query.
You can filter and verify that data further in your script.
Example:
Soundex code for François = F652, Soundex Code for Francois = F652
Soundex code for Jesús = J220, Soundex Code for Jesus = J220
Check more here : http://creativyst.com/Doc/Articles/SoundEx1/SoundEx1.htm#SoundExConverter
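A rough sketch of computing these codes, assuming standard American Soundex applied after accents are stripped (the function name `soundex` and the `nameSoundex` field below are hypothetical; the `names` collection is taken from the question):

```javascript
// Soundex digit for each coded consonant; vowels, h, w and y have no code.
const SOUNDEX_CODES = {
  b: "1", f: "1", p: "1", v: "1",
  c: "2", g: "2", j: "2", k: "2", q: "2", s: "2", x: "2", z: "2",
  d: "3", t: "3",
  l: "4",
  m: "5", n: "5",
  r: "6",
};

function soundex(name) {
  // Strip accents first so that "François" and "Francois" are treated alike.
  const letters = name
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase()
    .replace(/[^a-z]/g, "");
  if (letters.length === 0) return "";

  let code = letters[0].toUpperCase();
  let previous = SOUNDEX_CODES[letters[0]] || "";
  for (let i = 1; i < letters.length && code.length < 4; i++) {
    const ch = letters[i];
    if (ch === "h" || ch === "w") continue; // h and w do not separate equal codes
    const digit = SOUNDEX_CODES[ch] || "";
    if (digit && digit !== previous) code += digit;
    previous = digit; // a vowel resets this, so equal codes around a vowel both count
  }
  return code.padEnd(4, "0");
}

// Assumed usage in the mongo shell:
//   db.names.insertOne({ name: "Jesús", nameSoundex: soundex("Jesús") })
//   db.names.find({ nameSoundex: soundex("Jesus") })
```

Soundex is much looser than accent folding (unrelated names can share a code), which is why the follow-up filtering in your script matters.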
Upvotes: 0