Hannan
Hannan

Reputation: 1191

How to search list of words for a given list of letters using Python

I have the following list of letters:

letters = ['t', 'u', 'v', 'w', 'x', 'y', 'z']

And following list of words:

words = ['apple', 'whisky', 'yutz', 'xray', 'tux', 'zebra']

How can I search using Python if any of the combination of words exist for the list of letters? Like just looking at it we can observe that two words 'yutz' and 'tux' are the only one which can be built for the list of letters we have.

I'm new to Python and I tried to make different for loops but not able to reach anywhere.

for word in words:
    for i in letters:
        if i in word:
            print(word)
        else:
            print('not in word')

And the result is disaster as you guys can understand.

Upvotes: 1

Views: 2431

Answers (4)

Martijn Pieters
Martijn Pieters

Reputation: 1124258

You need to look at your problem in terms of sets. Any word from your words list that is a subset of your set of letters can be formed by those letters. Put differently, letters needs to be a superset of the word:

letters = {'t', 'u', 'v', 'w', 'x', 'y', 'z'}  # a set, not a list
for word in words:
    if letters.issuperset(word):
        print(word)

The set.issuperset() method returns true if all elements of the iterable argument are in the set.

If you wanted a list, just use a list comprehension:

[word for word in words if letters.issuperset(word)]

Demo:

>>> words = ['apple', 'whisky', 'yutz', 'xray', 'tux', 'zebra']
>>> letters = {'t', 'u', 'v', 'w', 'x', 'y', 'z'}  # a set, not a list
>>> [word for word in words if letters.issuperset(word)]
['yutz', 'tux']

Note that this only looks at unique letters. apple is a subset of the letters set {'a', 'p', 'l', 'e'}. If you need to handle letter counts too, you need to use a multiset; Python has an implementation called collections.Counter(). This keeps track not only of the letters, but also of their counts.

The Counter type doesn't support testing for sub- or supersets, so you have to use subtraction instead; if an empty Counter() is produced, the whole word can be formed from the letter counts:

letters = Counter(['a', 'p', 'l', 'e', 'p', 'i'])
words = ['apple', 'applepie']
for word in words:
    if not Counter(word) - letters:
        print(word)

or as a list comprehension:

[word for word in words if not Counter(word) - letters]

which produces ['apple'], as there is only a single 'e' in the input letter multi-set, and only 2 'p's, not 3.

Upvotes: 6

Aaditya Ura
Aaditya Ura

Reputation: 12679

You can use itertool's permutation method :

In one line:

print(set(["".join(permutation) for item in words for permutation in itertools.permutations(letters,len(item)) if "".join(permutation) in words ]))

Detailed solution:

above list comprehension is same as:

words = ['apple', 'whisky', 'yutz', 'xray', 'tux', 'zebra']

letters = ['t', 'u', 'v', 'w', 'x', 'y', 'z']
import itertools

final=[]
for i in words:
    for k in itertools.permutations(letters,len(i)):
        if "".join(k) in words and "".join(k) not in final:
            final.append("".join(k))

print(final)

output:

['yutz', 'tux']

Upvotes: -2

cs95
cs95

Reputation: 402982

You can use set.difference here:

r = [w for w in words if not set(w).difference(letters)]

r
['yutz', 'tux']

If the result is an empty set, that means every character in w belongs to letters. If that is the case, set.difference returns an empty set, which is False-y, so not .... results in True and the word is printed. This is equivalent to:

for w in words:
    if not set(w).difference(letters):
        print(w)

yutz
tux

This is similar to testing with set.issuperset, but approaches the problem from a different perspective.

Upvotes: 3

Ajax1234
Ajax1234

Reputation: 71471

You can use the all function with a generator to determine if all the characters in a word belonging to words exists in letters:

letters = ['t', 'u', 'v', 'w', 'x', 'y', 'z']
words = ['apple', 'whisky', 'yutz', 'xray', 'tux', 'zebra']
final_words = [i for i in words if all(c in letters for c in i)]

Output:

['yutz', 'tux']

Upvotes: 0

Related Questions