Reputation: 8297
words = []
for w in wordList:
wcnt = 0
for i in range(len(word)):
if w[i] != word[i]:
wcnt += 1
if wcnt == 1:
words.append(w)
Given a word and a list of strings, I want to retrieve the list of strings that has only one character different from the given word
.
I tried the code above, and it works fine, but it takes too long.
I am practicing an interview, and I prefer not to use any library.
How can I make it simpler?
example)
word = "lost"
wordList= ["most","mist","miss","lost","fist","fish"]
Output should be ['most']
EDIT: only changing 1 character works. Not deleting or adding.
Upvotes: 4
Views: 3730
Reputation: 140266
The complexity would remain the same, but maybe you could speed it up using sum
inside a list comprehension:
words = [w for w in wordList if sum(a!=b for a,b in zip(word,w)) == 1]
zip
avoids playing with indices by directly interleaving letters and yielding them for one to one comparisonsum
avoids counting natively in python (the expression interleaves letters of both words and adds 1 if different, 0 otherwise, comparison yields True
or False
which are worth 1 and 0 respectively)IndexError
even if the length of the words differ (even if the results will then be unreliable) because zip
stops when the shorter sequence ends.so the more builtins you'll use, the faster it will generally get. Here it could probably be slightly improved to stop counting if the number of different letters reaches 2, but it would mean stop using comprehensions.
Upvotes: 9
Reputation: 266
Use the Levenstein Distance.
You can directly use it in python with the Natural Language Toolkit:
import nltk
nltk.edit_distance('asdff','asdfe')
This will return 1, since the distance of the word is 1, which means one letter is different.
Upvotes: 1
Reputation: 109666
How about using a built-in library (difflib
)?
from difflib import SequenceMatcher
word = "lost"
wordList= ["most", "mist", "miss", "lost", "fist", "fish"]
>>> [x for x in wordList
if SequenceMatcher(None, word, x).ratio() == (len(word) - 1) / float(len(word))]
['most']
Upvotes: 2