Aks

Reputation: 303

Filter using lambda function python

I have an array containing invalid strings:

arr_invalid = ['aks', 'rabbbit', 'dog']

I am parsing through an RDD using a lambda function and need to skip an input string if it contains any of these invalid strings. For example, if the input string is akss or aks, both should be ignored.

How do I achieve this without writing a filter for each invalid string?

Upvotes: 1

Views: 4992

Answers (1)

Padraic Cunningham

Reputation: 180522

Unless the words come sorted, you need to compare each string against every invalid substring. You can use any to check whether any invalid substring occurs in each string:

arr_invalid = ['aks', 'rabbbit', 'dog']

strings = ["aks", "akss", "foo", "saks"]

filt = list(filter(lambda x: not any(s in x.lower() for s in arr_invalid), strings))

Output:

 ['foo']
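
Since the question is about an RDD, the same predicate can be pulled out into a named function and handed to a filter. This is a minimal sketch, assuming the RDD holds plain strings; is_valid is a hypothetical helper name, and the rdd.filter call appears only as a comment because it needs a running Spark context:

```python
arr_invalid = ['aks', 'rabbbit', 'dog']


def is_valid(line):
    """Return True when no invalid substring occurs in line (case-insensitive)."""
    lowered = line.lower()
    return not any(bad in lowered for bad in arr_invalid)


# Plain-Python check of the predicate.
strings = ["aks", "akss", "foo", "saks", "Hotdog"]
kept = [s for s in strings if is_valid(s)]
print(kept)  # ['foo']

# Assumed PySpark usage (rdd is a hypothetical RDD of strings):
# clean_rdd = rdd.filter(is_valid)
```

Lower-casing once per line, rather than once per invalid substring, keeps the work per record small.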

If you only want to exclude the strings if they start with one of the substrings:

t = tuple(arr_invalid)
filt = list(filter(lambda x: not x.lower().startswith(t), strings))

Output:

['foo', 'saks']
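
The tuple conversion matters here: str.startswith accepts a single string or a tuple of strings, but not a list. A short sketch of the difference (the variable names list_accepted and tuple_result are only for illustration):

```python
arr_invalid = ['aks', 'rabbbit', 'dog']

try:
    "akss".startswith(arr_invalid)  # passing a list raises TypeError
    list_accepted = True
except TypeError:
    list_accepted = False

# Converting to a tuple makes the same call valid.
tuple_result = "akss".startswith(tuple(arr_invalid))
print(list_accepted, tuple_result)  # False True
```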

If the input is a single string, just split it first:

st = "foo akss saks aks"
t = tuple(arr_invalid)
filt = list(filter(lambda x: not x.startswith(t), st.lower().split()))

You can also just use a list comprehension:

 [s for s in st.lower().split() if not s.startswith(t)]

As poke commented, you could find exact matches with a set. You will still need to combine it with either any and in, or str.startswith, to match substrings:

arr_invalid = {'aks', 'rabbbit', 'dog'}

st = "foo akss saks aks"
t = tuple(arr_invalid)

filt = list(filter(lambda s: s not in arr_invalid and not s.startswith(t), st.lower().split()))
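
The set-plus-prefix combination can also be spelled out step by step. This is a sketch under the assumption that words should be dropped on an exact match or a prefix match; keep and prefixes are hypothetical names:

```python
arr_invalid = {'aks', 'rabbbit', 'dog'}
prefixes = tuple(arr_invalid)


def keep(word):
    # Exact match: average O(1) set lookup, checked first.
    if word in arr_invalid:
        return False
    # Prefix match catches longer words such as "akss".
    return not word.startswith(prefixes)


st = "foo akss saks aks"
result = [w for w in st.lower().split() if keep(w)]
print(result)  # ['foo', 'saks']
```

Because startswith already covers exact matches, the set lookup here is only a fast path and does not change the result.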

Upvotes: 3
