KKP

Reputation: 77

Check if an item in a list is present in a column whose rows are lists

I'm trying to compare a list I have against a column in a dataframe that holds a list in each row.

list1 = ['installing','install','installed','replaced','repair','repaired','replace','part','used','new']

df['lwr_nopunc_spc_nostpwrd'].head(3)

['daily', 'ask', 'questions']
['daily', 'system', 'check', 'task',  'replace']
['inspection', 'complete', 'replaced', 'horizontal', 'sealing', 'blade', 'inspection', 'complete', 'issues', 'found']

Now I want to add two new columns to my dataframe, each showing True or False. The first new column should be True if any item in the df['lwr_nopunc_spc_nostpwrd'] row is present in list1. The second new column should be True if all the items in list1 are present in the df['lwr_nopunc_spc_nostpwrd'] row.

Please let me know how that can be achieved. I tried the all() and any() methods, but they don't seem to work.

def prt_usd(row):
    return(any(item in query['lwr_nopunc_spc_nostpwrd'] for item in part))

for row in query['lwr_nopunc_spc_nostpwrd']:
    prt_usd(query['lwr_nopunc_spc_nostpwrd'])

Upvotes: 2

Views: 232

Answers (2)

Arne

Reputation: 10545

You could use some set arithmetic with list comprehensions, like this (note that I simplified your examples to have more obvious test cases):

import pandas as pd

list1 = ['installing', 'replace']
set1 = set(list1)

df = pd.DataFrame({'col1': [['daily', 'ask'], 
                            ['daily', 'replace'],
                            ['installing', 'replace', 'blade']]})

# new1 should be True when the intersection of list1 with the row from col1 is not empty
df['new1'] = [set1.intersection(set(row)) != set() for row in df.col1]

# new2 should be True when list1 is a subset of the row from col1 
df['new2'] = [set1.issubset(set(row)) for row in df.col1]

df
    col1                          new1   new2
0   [daily, ask]                  False  False
1   [daily, replace]              True   False
2   [installing, replace, blade]  True   True
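
The same idea can be tried on the data from the question. The following is only a sketch: the names `keywords`, `any_match` and `all_match` are made up for illustration, and it swaps in `isdisjoint` and the `<=` subset operator in place of `intersection` and `issubset`, which should give equivalent results.

```python
import pandas as pd

# Keywords from the question, stored as a set for fast membership tests.
# (`keywords` is a hypothetical name; the question calls this list1.)
keywords = {'installing', 'install', 'installed', 'replaced', 'repair',
            'repaired', 'replace', 'part', 'used', 'new'}

# The three sample rows shown in the question.
df = pd.DataFrame({'lwr_nopunc_spc_nostpwrd': [
    ['daily', 'ask', 'questions'],
    ['daily', 'system', 'check', 'task', 'replace'],
    ['inspection', 'complete', 'replaced', 'horizontal', 'sealing',
     'blade', 'inspection', 'complete', 'issues', 'found'],
]})

col = df['lwr_nopunc_spc_nostpwrd']

# True when the row shares at least one word with the keywords.
# isdisjoint accepts any iterable, so the row does not need converting.
df['any_match'] = [not keywords.isdisjoint(row) for row in col]

# True when every keyword appears in the row (keywords is a subset of the row).
df['all_match'] = [keywords <= set(row) for row in col]

print(df[['any_match', 'all_match']])
```

With the full ten-word list, no sample row contains every keyword, so `all_match` is False throughout, while rows 1 and 2 match on 'replace' and 'replaced'.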

Upvotes: 2

Ben.T

Reputation: 29635

You can do it with apply and sets, like this:

# I changed the list to match the second row to show that col_all works
list1 = ['daily', 'system', 'check', 'task',  'replace']
# create a set from it
s1 = set(list1)

# for any word, check that the intersection of s1
# and the set of the list in this row is not empty
# (bool() tests emptiness; any() would test the truthiness of each word)
df['col_any'] = df['lwr_nopunc_spc_nostpwrd'].apply(lambda x: bool(set(x) & s1))

# for all, subtract the set of this row from the set s1;
# the result is non-empty (True) when some word from s1 is missing,
# so negate it with ~ to get True if all words from s1 are in this row
df['col_all'] = ~df['lwr_nopunc_spc_nostpwrd'].apply(lambda x: bool(s1 - set(x)))
print(df)
                             lwr_nopunc_spc_nostpwrd  col_any  col_all
0                            [daily, ask, questions]     True    False
1              [daily, system, check, task, replace]     True     True
2  [inspection, complete, replaced, horizontal, s...    False    False
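
For comparison, the function from the question can probably be repaired along these lines. The original ignores its `row` argument and tests `item in query['lwr_nopunc_spc_nostpwrd']`, which on a pandas Series checks the index labels rather than the values. This sketch assumes `part` holds the keyword list (here the modified list1 from above) and that `query` is the dataframe; `prt_all` is a hypothetical helper added for the second column.

```python
import pandas as pd

# Assumption: `part` is the keyword list (the modified list1 used above).
part = ['daily', 'system', 'check', 'task', 'replace']
part_set = set(part)

# Assumption: `query` is the dataframe; these are the question's sample rows.
query = pd.DataFrame({'lwr_nopunc_spc_nostpwrd': [
    ['daily', 'ask', 'questions'],
    ['daily', 'system', 'check', 'task', 'replace'],
    ['inspection', 'complete', 'replaced', 'horizontal', 'sealing',
     'blade', 'inspection', 'complete', 'issues', 'found'],
]})

def prt_usd(row):
    """Return True if any word in this row's list is a keyword."""
    return any(item in part_set for item in row)

def prt_all(row):
    """Return True if every keyword occurs in this row's list (hypothetical helper)."""
    row_set = set(row)
    return all(item in row_set for item in part)

# apply calls the function once per row, passing that row's list.
query['col_any'] = query['lwr_nopunc_spc_nostpwrd'].apply(prt_usd)
query['col_all'] = query['lwr_nopunc_spc_nostpwrd'].apply(prt_all)

print(query[['col_any', 'col_all']])
```

The key change is that each function works on the single list it receives, and `apply` takes care of the loop, so no explicit `for row in ...` is needed.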

Upvotes: 3
