Valour
Valour

Reputation: 810

Searching in database with scrambled words in SQLite

I am wondering if its possible to search in the database with the given scrambled words.

I have a mobs table in database and it holds the name of the monster names

If given monster name is A Golden Dregon or A Golden Dfigon or A Gelden Dragon I want it to find A Golden Dragon or with the matches that close to it from database. Usually one or two letters at max is given like this as scrambled.

Is that possible with just SQL queries? Or should I build the query by parsing the given monster name?

I am using LUA for the code side.

Upvotes: 5

Views: 480

Answers (3)

Michael Warner
Michael Warner

Reputation: 4217

I have come to know this search type as a fuzzy search. I mainly program in JS and use fuse.js all the time for this kind of problem.

Fuzzy Searches are based on the Levenshtein algorithm that rate the distance of two strings. When you have this distance value you can sort or drop elements from a list based on the score.

I found the algorithm in lua here.

function levenshtein(s, t)
  local s, t = tostring(s), tostring(t)
  if type(s) == 'string' and type(t) == 'string' then
    local m, n, d = #s, #t, {}
    for i = 0, m do d[i] = { [0] = i } end
    for j = 1, n do d[0][j] = j end
    for i = 1, m do
      for j = 1, n do
        local cost = s:sub(i,i) == t:sub(j,j) and 0 or 1
        d[i][j] = math.min(d[i-1][j]+1, d[i][j-1]+1, d[i-1][j-1]+cost)
      end
    end
    return d[m][n]
  end
end

As explained in the site you compare two strings like so and get a score based on the distance of them, then sort or drop the items being search based on the scores given. As this is CPU expensive I would suggest caching or use a memoize function to store common mistakes.

  levenshtein('referrer', 'referrer') -- zero distance
  >>> 0
  levenshtein('referrer', 'referer') -- distance of one character
  >>> 1
  levenshtein('random', 'strings') -- random big distance
  >>> 6 

Got a simple version of it working in lua here I must say lua is an easy language to pick up and start coding with.

local monsters = {'A Golden Dragon', 'Goblins', 'Bunny', 'Dragoon'}

function levenshtein(s, t)
  local s, t = tostring(s), tostring(t)
  if type(s) == 'string' and type(t) == 'string' then
    local m, n, d = #s, #t, {}
    for i = 0, m do d[i] = { [0] = i } end
    for j = 1, n do d[0][j] = j end
    for i = 1, m do
      for j = 1, n do
        local cost = s:sub(i,i) == t:sub(j,j) and 0 or 1
        d[i][j] = math.min(d[i-1][j]+1, d[i][j-1]+1, d[i-1][j-1]+cost)
      end
    end
    return d[m][n]
  end
end

--Fuzzy Search Returns the Best Match in a list
function fuzzySearch(list, searchText)
    local bestMatch = nil;
    local lowestScore = nil;

    for i = 1, #list do
        local score = levenshtein(list[i], searchText)
        if lowestScore == nil or score < lowestScore then
            bestMatch = list[i]
            lowestScore = score
        end 
    end

    return bestMatch
end

print ( fuzzySearch(monsters, 'golen dragggon') )
print ( fuzzySearch(monsters, 'A Golden Dfigon') )
print ( fuzzySearch(monsters, 'A Gelden Dragon') )

print ( fuzzySearch(monsters, 'Dragooon') ) --should be Dragoon
print ( fuzzySearch(monsters, 'Funny') ) --should be Bunny
print ( fuzzySearch(monsters, 'Gob') ) --should be Goblins

Output

A Golden Dragon
A Golden Dragon
A Golden Dragon
Dragoon
Bunny
Goblins

For SQL

You can try to do this same algorithm in T-SQL as talked about here.

In SQLlite there is an extension called editdist3 which also uses this algorithm the docs are here.

Upvotes: 4

Mark Benningfield
Mark Benningfield

Reputation: 2892

You will have to parse the given monster name to some extent, by making assumptions about how badly it is misspelled. For example, if the user supplied the name

b fulden gorgon

There is no way in hell you can get to "A Golden Dragon". However, if you assume that the user will always get the first and last letters of every word correctly, then you could parse the words in the given name to get the first and last letters of each word, which would give you

"A", "G" "n", "D" "n"

Then you could use the LIKE operator in your query, like so:

SELECT * FROM mobs WHERE monster_name LIKE 'A G%n D%n';

The main point here is what assumptions you make about the misspelling. The closer you can narrow it down, the better your query results will be.

Upvotes: 0

Mr.Zeus
Mr.Zeus

Reputation: 464

I would be hard to compensate for all the different one and two letter scrambled combinations, but you could create a lua table of common misspellings of "A Golden Dragon" check if it is in the table. I have never used lua before but here is my best try at some sample code:

local mob_name = "A Golden Dregon"--you could do something like, input("Enter mob name:")
local scrambled_dragon_names = {"A Golden Dregon", "A Golden Dfigon", "A Gelden Dragon"}
for _,v in pairs(scrambled_dragon_names) do
  if v == mob_name then
    mob_name = "A Golden Dragon"
    break
  end
end

I really hope I have helped!

P.S. If you have anymore questions go ahead and comment and I will try to answer ASAP.

Upvotes: 0

Related Questions