waylonion
waylonion

Reputation: 6976

SQLite match against words

I have a sqlite database with a table named kanji which has a column named english_meaning. I would like to query for a "word".

This is my existing query and code:

public Cursor fetchAllKanjiWithQuery(String input) {
    input = "%" + input + "%";
    String query = "SELECT * FROM kanji WHERE literal LIKE ? OR radicals LIKE ? OR english_meaning LIKE ? OR kun_reading LIKE ? OR romaji LIKE ? ORDER BY stroke_count";
    String[] selectionArgs = new String[]{input, input, input, input, input};
    Cursor cursor = db.rawQuery(query, selectionArgs);
    if (cursor != null) {
        cursor.moveToFirst();
    }
    return cursor;
}

With the input being "one", results where english_meaning is "bone" or "oneself" would match.

However, I'm only interested in strings with the word "one". For example, valid matches should include:

Namely, "one" can appear at the beginning or end of the string, it's separated from other characters by spaces. If possible, "one" should also be allowed to be followed by punctuation character such as "," and ".".

Currently, my query with LIKE and % input % doesn't give the desired result. How can this be fixed?

Upvotes: 4

Views: 983

Answers (2)

The best solution is the usability of a full-text search engine like SQLite FTS4 or FTS5 which is design for efficient searching of large datasets.

You can create a virtual table that indexes the text columns you would want to search, and then its used SQL to search for words or phrases in the columns. It will handle tokenization, stemming and other text processing tasks for you, to manually list them out.

Use FTS like this to search for the word one in the english_meaining of your kanji:

Create a virtual table for the english_meaning column:

CREATE VIRTUAL TABLE kanji_fts USING FTS5(english_meaning);

Populate a virtual table with data from kanji table:

INSERT INTO kanji_fts(rowid, english_meaning) SELECT rowid, english_meaning FROM kanji;

Search for the word one in the table:

SELECT * FROM kanji WHERE rowid IN (SELECT rowid FROM kanji_fts WHERE english_meaning MATCH 'one');

It will return all the rows from kanji and the english_meaning column will contain the one as a separate word.


FTS allows you to do searches with minimal code and high performance.


This worked for me:

// Create the virtual table if it doesn't exist
db.execSQL("CREATE VIRTUAL TABLE IF NOT EXISTS kanji_fts USING FTS5(english_meaning);");

// Populate the virtual table with data if it's empty
Cursor cursor = db.rawQuery("SELECT COUNT(*) FROM kanji_fts", null);
cursor.moveToFirst();
int count = cursor.getInt(0);
if (count == 0) {
    db.execSQL("INSERT INTO kanji_fts(rowid, english_meaning) SELECT rowid, english_meaning FROM kanji;");
}

// Perform a search on the virtual table
String query = "SELECT * FROM kanji WHERE rowid IN (SELECT rowid FROM kanji_fts WHERE english_meaning MATCH ?) ORDER BY stroke_count";
String[] selectionArgs = new String[]{"one"};
cursor = db.rawQuery(query, selectionArgs);

// Process the results
if (cursor != null) {
    cursor.moveToFirst();
}
return cursor;

Upvotes: 1

Dejan Dozet
Dejan Dozet

Reputation: 1019

Why not do it like this:

SELECT *
 from kanji  
 where (english_meaning = 'one'
    or english_meaning = 'one.'
    or english_meaning = 'one,'
    or english_meaning like 'one %'
    or english_meaning like '% one %'
    or english_meaning like '% one'
    or english_meaning like '% one.'
    or english_meaning like '% one,')

Upvotes: 1

Related Questions