Henry Price

Reputation: 23

Remove words from list containing certain characters

I have a long list of words, and I'm trying to go through it and remove every word that contains one of a set of specific characters. However, the solution I thought would work doesn't remove any words:

l3 = ['b', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y']


firstcheck = ['poach', 'omnificent', 'aminoxylol', 'teetotaller', 'kyathos', 'toxaemic', 'herohead', 'desole', 'nincompoophood', 'dinamode']

validwords = []

for i in l3:
    for x in firstchect:
        if i not in x:
            validwords.append(x)
            continue
        else:
            break

If a word from firstcheck contains a character from l3, I want it removed, or not added to the other list. I tried it both ways. Can anyone offer insight into what could be going wrong? I'm pretty sure I could use a list comprehension, but I'm not very good at those.

Upvotes: 0

Views: 750

Answers (4)

sciroccorics

Reputation: 2427

The accepted answer uses np.sum, which means importing a large numerical library to perform a simple task that plain Python can easily handle with its built-ins:

validwords = [w for w in firstcheck if all(c not in w for c in l3)]
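A set-based variant of the same idea, as a rough sketch: it assumes l3 as defined in the question, and uses a shortened, hypothetical firstcheck with 'aza' and 'ca' added so that the result is not empty. The name banned is only illustrative.

```python
l3 = ['b', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y']

# Hypothetical sample: 'aza' and 'ca' are added so that some words survive.
firstcheck = ['aza', 'ca', 'poach', 'omnificent']

# Build the set once; isdisjoint accepts any iterable, including a string.
banned = set(l3)

# Keep a word only if it shares no character with the banned set.
validwords = [w for w in firstcheck if banned.isdisjoint(w)]

print(validwords)
```

This should behave like the all(...) version above; the set mostly pays off when the list of characters or the list of words is large.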

Upvotes: 3

Ajinkya Taranekar

Reputation: 330

There were a couple of mistakes in the code; the rest was fine:

l3 = ['b', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y']


firstcheck = ['aza', 'ca', 'poach', 'omnificent', 'aminoxylol', 'teetotaller', 'kyathos', 'toxaemic', 'herohead', 'desole', 'nincompoophood', 'dinamode']

validwords = []
flag = 1
for x in firstcheck:
    for i in l3:
        if i not in x:
            flag = 1
        else:
            # x contains a character from l3, so stop checking this word
            flag = 0
            break
    if flag == 1:
        validwords.append(x)

print(validwords)

The first mistake was the order of the for loops: we need to iterate over the words first and then over l3, so that each word is checked against every character and is not added more than once.

Next, firstcheck was misspelled as firstchect in 'for x in firstchect', which raises a NameError.

I also added a flag: a word is appended to validwords only if the flag is still 1 after it has been checked against the characters in l3. To test this, I added 'aza' and 'ca' to firstcheck, and the output is now ['aza', 'ca'], as expected.
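For what it's worth, the same word-first loop order can be written without the flag by using Python's for ... else, where the else branch runs only if the inner loop finished without hitting break. This is just a sketch of an alternative, reusing the lists from above; the names word and ch are illustrative.

```python
l3 = ['b', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y']

firstcheck = ['aza', 'ca', 'poach', 'omnificent', 'aminoxylol', 'teetotaller', 'kyathos', 'toxaemic', 'herohead', 'desole', 'nincompoophood', 'dinamode']

validwords = []
for word in firstcheck:
    for ch in l3:
        if ch in word:
            # A banned character was found: reject this word.
            break
    else:
        # The inner loop never hit break, so no banned character is present.
        validwords.append(word)

print(validwords)  # ['aza', 'ca']
```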

Hope this helps you.

Upvotes: 0

Sunny Shukla

Reputation: 342

If you want to avoid writing the nested loops yourself, you can use re directly.

import re
l3 = ['b', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y']
firstcheck = ['azz', 'poach', 'omnificent', 'aminoxylol', 'teetotaller', 'kyathos', 'toxaemic', 'herohead', 'desole', 'nincompoophood', 'dinamode']

# Build a regex character class that matches any character in l3.
strings_to_remove = "[{}]".format("".join(l3))
validwords = [x for x in firstcheck if re.sub(strings_to_remove, '', x) == x]
print(validwords)

Output:

['azz']
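A slightly more defensive sketch of the same regex idea, in case l3 ever contains characters that are special inside a character class (such as ']' or '-'): each character is passed through re.escape, and re.search is used to test for a match instead of comparing against a substituted copy. The name banned and the shortened firstcheck are only illustrative.

```python
import re

l3 = ['b', 'd', 'e', 'f', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y']

# Shortened sample list, for illustration only.
firstcheck = ['azz', 'poach', 'omnificent']

# Escape each character so it is treated literally inside the class.
banned = re.compile("[" + "".join(re.escape(c) for c in l3) + "]")

# search returns None when no banned character occurs in the word.
validwords = [w for w in firstcheck if not banned.search(w)]

print(validwords)  # ['azz']
```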

Upvotes: 0

Allen Qin

Reputation: 19957

You can use a list comprehension:

import numpy as np
[w for w in firstcheck if np.sum([c in w for c in l3])==0]

Every word in your firstcheck contains at least one character from l3, so the output of the above is an empty list.

If firstcheck is defined as below:

firstcheck = ['a', 'z', 'poach', 'omnificent']

The code should output:

['a', 'z']

Upvotes: 0
