Reputation: 21333
I have long list of strings called words
. All the strings have the same length which is 5. For any i in {0,...,4} we can check to see if the i
'th letter of a string is also the i'th letter of any other string in the list. We call a letter at the i
'th position unique if it is not.
I would like to remove all strings from my list for which there exists any i
for which the i
'th letter is unique.
As a very simple example consider: words = ["apple", "amber", "bpple", "bmber", "appld"]
. The string "appld" should be removed because d
is unique at the 4th position.
Is there a neat way to do this?
Upvotes: 1
Views: 187
Reputation: 1598
this can be done in O(NlogN) but space inefficient
d={}
for j,w in enumerate(words):
for i,c in enumerate(w):
if (i,c) in d:
d[(i,c)].append(j)
else:
d[(i,c)]=[j]
for i in reversed(list(set([v[0] for v in d.values() if len(v)==1]))):
words.pop(i)
print(words)
Upvotes: 1
Reputation: 73480
Using collections.Counter
and the zip(*...)
transposition idiom:
from collections import Counter
# counts for every index
bypos = [*map(Counter, zip(*words))]
# disallow count 1 for any letter x in its respective position
words = [w for w in words if all(c[x]!=1 for x, c in zip(w, bypos))]
# ['apple', 'amber', 'bpple', 'bmber']
Note that is better to rebuild the list in a single iteration than to remove elements repeatedly.
Some docs on the utils used here:
Upvotes: 4