godie

Reputation: 129

Does MongoDB full-text search have a spellchecker, or is there a way to implement one?

I have a full-text index on my MongoDB collection, and I want to know whether there is a way to find a result even if the search word is misspelled. For example:

{id:1,
description: "The world is yours"
},{
id:2,
description: "Hello my friend"}

If I search for the word 'wordl', the result should be:

{id:1,
    description: "The world is yours"
   }

Is that possible?

Upvotes: 1

Views: 1660

Answers (1)

WebMentor

Reputation: 36

MongoDB does not support fuzzy search at this time. One way to do it is to use a string or phonetic similarity algorithm, such as soundex.
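As a quick illustration of the idea (a minimal sketch, separate from the script below): PHP's built-in soundex() maps words that sound alike to the same four-character key, so many misspellings collapse onto the key of the correct word. The word pairs are illustrative picks only. Note that soundex is sensitive to the order of consonants, so a transposition such as 'wordl' for 'world' produces a different key and would not match with this approach alone.

```php
<?php
// Pairs of (misspelled, intended) words; illustrative only.
$pairs = array(
    array('earht', 'earth'),
    array('reebell', 'rebel'),
    array('wordl', 'world'),
);

$matches = array();
foreach ($pairs as $pair) {
    list($typo, $word) = $pair;
    // Two words count as "similar" when their soundex keys are equal.
    $matches[$typo] = (soundex($typo) === soundex($word));
    printf(
        "%-8s %s   %-6s %s   %s\n",
        $typo,
        soundex($typo),
        $word,
        soundex($word),
        $matches[$typo] ? 'match' : 'no match'
    );
}
```

The first two pairs share a key, while 'wordl' (W634) and 'world' (W643) do not.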

I've created a simple example in PHP to show how to do it with soundex:

$dbName = 'soundex';
// pass the database name; $db is not defined yet at this point
$client = new MongoClient("mongodb://127.0.0.1", array('db' => $dbName));
$db     = $client->$dbName;

$phrases = array(
    'This, is the last of earth. I am content.',
    'Take a step forward, lads. It will be easier that way.',
    'Adieu, mes amis. Je vais à la gloire. (Farewell, my friends. I go to glory.)',
    'I will die like a true-blue rebel. Don\'t waste any time in mourning - organize.'
);

// just for the example, so we can reuse the script several times
$db->phrases->drop();

foreach ($phrases as $phrase) {
    // replace everything that is not a letter or whitespace with a space
    $phrase = preg_replace('/[^a-z\s]/i', ' ', $phrase);

    // remove multiple whitespaces and whitespaces at the beginning/end of the phrase
    $phrase = preg_replace('/\s\s+/', ' ', trim($phrase));

    // split the phrase into unique words
    $words = array_unique(explode(' ', $phrase));

    $soundex = array();

    foreach ($words as $word) {
        $soundex[] = soundex($word);
    }

    $soundex = array_unique($soundex);

    $db->phrases->insert(array(
        'phrase'  => $phrase,
        'soundex' => $soundex
    ));
}

// search for some words

$searches = array(
    'earht',    // earth, phrase 1
    'eaasierr', // easier, phrase 2
    'faerwel',  // farewell, phrase 3
    'reebell'   // rebel, phrase 4
);

foreach ($searches as $search) {
    $cursor = $db->phrases->find(array(
        'soundex' => array(
            '$in' => array(soundex($search))
        )
    ));

    if ($cursor->count()) {
        foreach ($cursor as $doc) {
            echo "Search result for '$search':\n";
            echo $doc['phrase'] . "\n\n";
        }
    } else {
        // include the soundex key that was searched for, to help debugging
        echo "No results for '$search' (soundex: " . soundex($search) . ")\n\n";
    }
}

This example will output:

Search result for 'earht':
This is the last of earth I am content

Search result for 'eaasierr':
Take a step forward lads It will be easier that way

Search result for 'faerwel':
Adieu mes amis Je vais la gloire Farewell my friends I go to glory

Search result for 'reebell':
I will die like a true blue rebel Don t waste any time in mourning organize

This is just a simple example, and it does not remove stop words. Also remember to create an index on the soundex field.
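The stop-word step could be sketched roughly like this. The function name soundexKeys and the tiny stop-word list are hypothetical, chosen only for illustration; a real list would be longer and language-specific. The function applies the same normalization as the script above and skips stop words before computing the keys.

```php
<?php
// Hypothetical helper: returns the unique soundex keys of a phrase,
// ignoring stop words. $stopWords is an illustrative list, not a standard one.
function soundexKeys($phrase, array $stopWords)
{
    // Keep only letters and whitespace, then collapse runs of whitespace.
    $phrase = preg_replace('/[^a-z\s]/i', ' ', $phrase);
    $phrase = preg_replace('/\s+/', ' ', trim($phrase));

    $keys = array();
    foreach (explode(' ', strtolower($phrase)) as $word) {
        if ($word === '' || in_array($word, $stopWords, true)) {
            continue; // skip empty tokens and stop words
        }
        $keys[soundex($word)] = true; // array keys give uniqueness for free
    }

    return array_keys($keys);
}

$stopWords = array('the', 'is', 'of', 'i', 'am', 'this');
$keys = soundexKeys('This, is the last of earth. I am content.', $stopWords);
print_r($keys);
```

The returned array could then be stored in the soundex field in place of the one built in the insert loop, which keeps very common words from matching almost every search.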

Learn more about soundex: http://php.net/soundex

Hope that helps to get an idea of how to do a fuzzy search with MongoDB.

Upvotes: 2
