Gregory Wullimann

Reputation: 567

MongoDB match partial text ignoring accents (diacritics)

I have two fields that I need to match against a simple text query.

I'm currently using Jenssegers' Laravel MongoDB (https://github.com/jenssegers/laravel-mongodb)

The code right now looks like this, and it works almost the way I want:

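// Assumes Regex refers to MongoDB\BSON\Regex (imported elsewhere in the file).
// Case-insensitive partial match on either field.
$nameFilter = [[
    '$match' =>
    [
        '$or' =>
        [
            [
                'content.itemList.name' => ['$regex' => new Regex($request->q, 'i')]
            ],
            [
                'content.itemList.commonName' => ['$regex' => new Regex($request->q, 'i')]
            ]
        ]
    ]
]];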

What's missing is that I want to ignore accents in the fields name and commonName, so for example if content.itemList.name is "foöBàr" and the query is "obar" I should get it in the results.

Edit: after days of trying I haven't found a solution yet.

I would expect something this trivial to be easy to do in MongoDB.

Other things I've tried:


Example documents

{
    lastname: "Mbappé",
    firstname: "Kylian",
    name: "Kylian Mbappé",
    otherfields: 123
}

What I want:

A query that matches any of lastname, firstname, or name against a partial string (lian, appe, mbappe, etc.), both case-insensitively and diacritic-insensitively (ignoring accents).

Good matches should be, for example: "Mbappe" "appe" "mbappé" "Kylian" "kylian mbappe"

Upvotes: 1

Views: 2511

Answers (2)

Prakash Harvani

Reputation: 1041

I think something like this should work in MongoDB:

db.users.find({name:{$regex: 'appe',$options:'i'},firstname:{$regex: 'lian',$options:'i'},lastname:{$regex: 'appé',$options:'i'}})

Upvotes: 0

lockonzero

Reputation: 126

Using $regex together with a collation is indeed not supported; see "Use of collation in mongodb $regex".

My guess is that to make this work you would need a workaround, such as storing an extra field in the MongoDB data without diacritics and using that field for the search.

Using your example document

{
    lastname: "Mbappé",
    firstname: "Kylian",
    name: "Kylian Mbappé",
    otherfields: 123,
    name_clean: "Kylian Mbappe" // this is new
}
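
As a rough sketch of how that extra field could be filled in and queried: the snippet below strips accents with Unicode NFD normalization, applies the same stripping to the search text, and builds a case-insensitive $regex filter on name_clean. The helper names (stripDiacritics, escapeRegex, buildNameFilter) are made up for illustration, and I'm assuming you can run this in application code before inserting or updating documents. Note that letters with no decomposed form (for example "ø" or "ß") are not changed by this approach.

```javascript
// Hypothetical helper: remove accents by decomposing characters (NFD)
// and dropping the combining marks in the range U+0300 to U+036F.
function stripDiacritics(text) {
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Hypothetical helper: escape regex metacharacters so the user's
// input is matched literally rather than interpreted as a pattern.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a filter against the name_clean field.
// The query text gets the same accent stripping as the stored value.
function buildNameFilter(query) {
  const pattern = escapeRegex(stripDiacritics(query));
  return { name_clean: { $regex: pattern, $options: 'i' } };
}

// Fill in name_clean before saving the document.
const doc = { lastname: 'Mbappé', firstname: 'Kylian', name: 'Kylian Mbappé' };
doc.name_clean = stripDiacritics(doc.name);

// A filter that could be passed to something like db.users.find(...).
const filter = buildNameFilter('mbappé');
```

The same idea would apply to lastname and firstname if you need to search them separately; each one would get its own cleaned copy, and the filters could be combined with $or.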

I would comment on the original post but Stack Overflow says that I need 50 reputation to do that :(

Upvotes: 4
