Reputation: 720
My current programming project is a sort of french dictionary in Java (using sqlite). I was wondering what would happen if someone wanted to find the present tense for "avoir" but typed in "avior" and how I would handle it. So I thought I could implement some sort of closest match/did you mean functionality. So my question is:
Is there a way to use the database to search for similar matches?
when I made the same program in python a while back (using xml instead) I used this system but it wasn't very effective and required a large error margin to be somewhat effective (and subsequently suggesting words with no relevance!)... but something similar could still be useful nether the less
def getSimilar(self, word, Return = False):
matches = list()
for verb in self.data.getElementsByTagName("Verb"):
for x in range(16):
if x % 2 != 0 and x>0:
if (x == 15 or x == 3 or x == 1):
part = Dict(self.data).removeBrackets(Dict(self.data).getAccents(verb.childNodes[x].childNodes[0].data))
diff = 0
for char in word:
if (not char in part):
diff += 1
if (diff < self.similarityValue) and (-self.errorAllowance <= len(part) - len(word) <= self.errorAllowance):
matches.append(part)
else:
for y in range(14):
if (y % 2 != 0 and y>0):
part = Dict(self.data).getAccents(verb.childNodes[x].childNodes[y].childNodes[0].data)
diff = 0
for char in word:
if (not char in part):
diff += 1
if (diff < self.similarityValue) and (-self.errorAllowance <= len(part) - len(word) <= self.errorAllowance):
matches.append(part)
if not Return:
for match in matches:
print "Did you mean '" + match + "'?"
if Return: return matches
Any help is welcomed!
Jamie
Upvotes: 0
Views: 245
Reputation: 18550
try using https://github.com/mateusza/SQLite-Levenshtein
Works quite well
Upvotes: 2