jumbodrawn
jumbodrawn

Reputation: 128

Proper way to implement near-match searching MySQL

I have a table on a MySQL database that has two (relevant) columns, 'id' and 'username'.

I have read that MySQL and relational databases in general are not optimal for searching for near matches on strings, so I wonder, what is the industry practice for implementing simple, but not exact match, search functionalities- for example when one searches for accounts by name on Facebook and non-exact matches are shown? I found Apache Lucene when researching this, but this seems to be used for indexing pages of a website, not necessarily arbitrary strings in a database table.

Is there an external tool for this use case? It seems like any SQL query for this task would require a full scan, even if it was simply looking for the inclusion of a substring.

Upvotes: 1

Views: 204

Answers (1)

Julien Ambrosio
Julien Ambrosio

Reputation: 486

In your situation I would recommend for you to use Elasticsearch instead of relational database. This search engine is a powerful tool for implementing search and analytics functionality. Elasticsearch also flexible and versatile, with a rich query language using JSON as query language and support for many different types of data.

And of course supports near-match searching. As you said, MySQL and anothers relational databases aren't recommended to use near-match searching, they aren't for this purpose.

--------------UPDATE------------

If you want to use full-text-search using a relational database It's possile but you might have problem to scale if your numbers of users increase a lot. Keep in mind that ElasticSearch is robust and powerfull, so, you can do a lot of types of searches so easily in this search engine, but it can be more expensive too. When I propose to you use ElasticSearch I'm thinking about the scaling the search. But I've thinking in your problem since I answered and I've understood that you only need a simple full-text-search. For conclude, in the begginning you can use only relational database to do that, but in the future you might move your search to ElasticSearch or if your search became complex.

Follow this guide to do full-text search in Postgresql. http://rachbelaid.com/postgres-full-text-search-is-good-enough/

There's another example in MySql: https://sjhannah.com/blog/2014/11/03/using-soundex-and-mysql-full-text-search-for-fuzzy-matching/

Like I said in the comments, It's a trade-off you must to do. You can prefer to use ElasticSearch in the beginning or you can choose another database and move to ElasticSearch in the future.

I also recommend this book to you: Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. Actually I'm reading this book and it would help you to understand this topic.

--------------UPDATE------------

To implement near-match searching in ElasticSearch you can use fuzzy matching query. The fuzzy matching query allows you to controls how lenient the matching should be, for example for this query bellow:

{
  "query": {
    "fuzzy": {
      "username": {
        "value": "julienambrosio",
        "fuzziness": 2
      }
    }
  }
}

They'll return "julienambrosio", such as "julienambrosio1", "julienambrosio12" or "juliembrosio".

You can adjust the level of fuzziness to control how lenient/strict the matching should be.

Before you create this example you should to study more about ElasticSearch. There're a lot of courses in udemy, youtube and etc.

You can read more about in the official docs.

Upvotes: 5

Related Questions