Dawn17
Dawn17

Reputation: 8297

Simpler way to get the list of words that has only 1 letter difference? (python)

words = []

for w in wordList:
    wcnt = 0
    for i in range(len(word)):
        if w[i] != word[i]:
            wcnt += 1
    if wcnt == 1:
        words.append(w)

Given a word and a list of strings, I want to retrieve the list of strings that has only one character different from the given word.

I tried the code above, and it works fine, but it takes too long.

I am practicing an interview, and I prefer not to use any library.

How can I make it simpler?

example) word = "lost"

wordList= ["most","mist","miss","lost","fist","fish"]

Output should be ['most']

EDIT: only changing 1 character works. Not deleting or adding.

Upvotes: 4

Views: 3730

Answers (3)

Jean-François Fabre
Jean-François Fabre

Reputation: 140266

The complexity would remain the same, but maybe you could speed it up using sum inside a list comprehension:

words = [w for w in wordList if sum(a!=b for a,b in zip(word,w)) == 1]
  • zip avoids playing with indices by directly interleaving letters and yielding them for one to one comparison
  • sum avoids counting natively in python (the expression interleaves letters of both words and adds 1 if different, 0 otherwise, comparison yields True or False which are worth 1 and 0 respectively)
  • list comprehensions are highly optimized python constructs
  • the code above doesn't use any external libraries, only built-ins
  • it doesn't crash with IndexError even if the length of the words differ (even if the results will then be unreliable) because zip stops when the shorter sequence ends.
  • one-liners are cool (when not too far-fetched/with side effects)

so the more builtins you'll use, the faster it will generally get. Here it could probably be slightly improved to stop counting if the number of different letters reaches 2, but it would mean stop using comprehensions.

Upvotes: 9

Michael Boesl
Michael Boesl

Reputation: 266

Use the Levenstein Distance.

You can directly use it in python with the Natural Language Toolkit:

import nltk
nltk.edit_distance('asdff','asdfe')

This will return 1, since the distance of the word is 1, which means one letter is different.

Upvotes: 1

Alexander
Alexander

Reputation: 109666

How about using a built-in library (difflib)?

from difflib import SequenceMatcher

word = "lost"
wordList= ["most", "mist", "miss", "lost", "fist", "fish"]


>>> [x for x in wordList 
     if SequenceMatcher(None, word, x).ratio() == (len(word) - 1) / float(len(word))]
['most']

Upvotes: 2

Related Questions