Vyacheslav Orlovsky
Vyacheslav Orlovsky

Reputation: 436

SQLite. Searching by user-defined queries

I have an SQLite database (using the sqlite3 package) in my Flutter project, and I need to implement a search functionality. The main table I need to search through contains approximately 500k records.

The search query is defined by the user and can consist of multiple words. Additionally, several columns need to be checked. The search results should be sorted by relevance, which, as I understand, means that records containing every token from the search query should be ordered by the index of the first occurrence of any token in the column.

For example, if the user searches for "land", the results should be ranked as follows:

"The land of chaos"
"The highlander"
"Alice in Wonderland"

I've tried several approaches but haven't found an acceptable solution yet.

1. Custom Ranking Function

The closest I've come to achieving my goal was by adding a custom function to calculate the rank of each record:

   database.createFunction(
      functionName: 'get_search_result_rank',
      argumentCount: const AllowedArgumentCount.any(),
      directOnly: true,
      deterministic: true,
      function: (args) {
        final name = (args[0] as String).toLowerCase();
        int minIndex = maxInt;
        for (int i = 1; i < args.length; i++) {
          final index = name.indexOf(args[i] as String);
          if (index == -1) {
            return -1;
          }
          if (index < minIndex) {
            minIndex = index;
          }
        }
        return minIndex;
      },
    );

I then use the resulting minIndex values as ranks to filter and order records like this:

   SELECT
     id,
     name,
     article,
     get_search_result_rank(name, ${queryTokens.join(',')}) AS rank
   FROM products
   WHERE rank >= 0
   ORDER BY rank ASC

While this approach works, it is obviously far too slow for a table of this size.

2. Using FTS5 (Full-Text Search)

I also tried using FTS5, but it seems unsuitable for my case because it is designed for full-text search, while my query tokens may represent only a part of a word in a column.

Here’s how I set up the FTS virtual table:

   CREATE VIRTUAL TABLE IF NOT EXISTS 
     products_fts 
   USING fts5(id, article, name, content='', tokenize='trigram')

Then I populated it:

   INSERT INTO 
     products_fts (rowid, id, article, name) 
   SELECT ROWID, id, article, name
   FROM products

And performed the search query:

   SELECT
     id,
     name,
     article,
   FROM products
   WHERE products_fts MATCH '$queryString'
   ORDER BY rank ASC

Problems I encountered:

3. Using Basic LIKE Expressions

Another possible solution could be using simple LIKE expressions, but I have no idea how to implement proper ranking of the results in this case.

Does anyone have an idea of how to efficiently implement this search with proper ranking?

Upvotes: 1

Views: 52

Answers (1)

gmw
gmw

Reputation: 151

I think the instr function does what you want.

import sqlite3
connection = sqlite3.connect(':memory:')
cursor = connection.cursor()
keyword = 'land'
sql = f"""
  create table books as 
    select 'The Land of Chaos' as title union
    select 'The Highlander' union
    select 'Alice in Wonderland' union
    select 'Michael Kohlhaas'
  ;
"""
cursor.execute(sql)
sql = f"""
  select 
    title,
    instr(upper(title), upper('{keyword}'))
  from books
  where upper(title) like '%' || upper('{keyword}') || '%'
  order by 2, 1
  ;
"""
cursor.execute(sql).fetchall()

# [('The Land of Chaos', 5), ('The Highlander', 9), ('Alice in Wonderland', 16)]

Upvotes: 0

Related Questions