Reputation: 77
I'm trying to compare a list against a dataframe column that holds a list in each row.
list1 = ['installing','install','installed','replaced','repair','repaired','replace','part','used','new']
df['lwr_nopunc_spc_nostpwrd'].head(3)
['daily', 'ask', 'questions']
['daily', 'system', 'check', 'task', 'replace']
['inspection', 'complete', 'replaced', 'horizontal', 'sealing', 'blade', 'inspection', 'complete', 'issues', 'found']
Now I want to add two new columns to my dataframe that show True or False:
new column one: True if any item in the df['lwr_nopunc_spc_nostpwrd'] row is present in list1
new column two: True if all the items in list1 are present in the df['lwr_nopunc_spc_nostpwrd'] row
Please let me know how that can be achieved. I tried the all() and any() methods, but that doesn't seem to work.
def prt_usd(row):
    return(any(item in query['lwr_nopunc_spc_nostpwrd'] for item in part))

for row in query['lwr_nopunc_spc_nostpwrd']:
    prt_usd(query['lwr_nopunc_spc_nostpwrd'])
Upvotes: 2
Views: 232
Reputation: 10545
You could use some set arithmetic with list comprehensions, like this (note that I simplified your examples to have more obvious test cases):
import pandas as pd
list1 = ['installing', 'replace']
set1 = set(list1)
df = pd.DataFrame({'col1': [['daily', 'ask'],
                            ['daily', 'replace'],
                            ['installing', 'replace', 'blade']]})
# new1 should be True when the intersection of list1 with the row from col1 is not empty
df['new1'] = [set1.intersection(set(row)) != set() for row in df.col1]
# new2 should be True when list1 is a subset of the row from col1
df['new2'] = [set1.issubset(set(row)) for row in df.col1]
df
                           col1   new1   new2
0                  [daily, ask]  False  False
1              [daily, replace]   True  False
2  [installing, replace, blade]   True   True
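To see the two set checks in isolation, here is a minimal sketch without pandas, run on the three rows and the full list1 from the question. The names rows, any_match, and all_match are illustrative and not from the original post; it uses set.isdisjoint in place of comparing the intersection with an empty set, which should give the same answer.

```python
list1 = ['installing', 'install', 'installed', 'replaced', 'repair',
         'repaired', 'replace', 'part', 'used', 'new']
set1 = set(list1)

# The three example rows from the question (hypothetical variable name)
rows = [
    ['daily', 'ask', 'questions'],
    ['daily', 'system', 'check', 'task', 'replace'],
    ['inspection', 'complete', 'replaced', 'horizontal', 'sealing',
     'blade', 'inspection', 'complete', 'issues', 'found'],
]

# True when the row shares at least one word with list1
any_match = [not set1.isdisjoint(row) for row in rows]
# True when every word of list1 appears in the row
all_match = [set1.issubset(row) for row in rows]

print(any_match)  # [False, True, True]
print(all_match)  # [False, False, False]
```

With the original ten-word list1, no row contains every word, so the "all" column would be False throughout; that is why the examples in the answers shorten list1.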
Upvotes: 2
Reputation: 29635
You can do it with apply and set, like:
# I changed list1 to match the second row to show that col_all works
list1 = ['daily', 'system', 'check', 'task', 'replace']
# create a set from it
s1 = set(list1)
# for any word, check that the intersection of s1
# and the set of the list in this row is not empty
df['col_any'] = df['lwr_nopunc_spc_nostpwrd'].apply(lambda x: any(set(x)&s1))
# for all words, subtract the set of this row from the set s1;
# any() returns True if something is left over (some word of s1 is missing),
# so negate the result with ~ to get True when all words from s1 are in this row
df['col_all'] = ~df['lwr_nopunc_spc_nostpwrd'].apply(lambda x: any(s1-set(x)))
print (df)
                             lwr_nopunc_spc_nostpwrd  col_any  col_all
0                            [daily, ask, questions]     True    False
1              [daily, system, check, task, replace]     True     True
2  [inspection, complete, replaced, horizontal, s...    False    False
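A possible variant of the same idea, shown as a sketch rather than a replacement: set.isdisjoint and set.issubset return plain booleans directly, so neither any() nor the ~ negation is needed. This also sidesteps one subtlety of any() over a set, which tests the truthiness of the elements, so an empty string in the set would count as False. The dataframe below is rebuilt from the question's three rows so the snippet runs on its own.

```python
import pandas as pd

list1 = ['daily', 'system', 'check', 'task', 'replace']
s1 = set(list1)

# Rebuild the example column from the question so the snippet is self-contained
df = pd.DataFrame({'lwr_nopunc_spc_nostpwrd': [
    ['daily', 'ask', 'questions'],
    ['daily', 'system', 'check', 'task', 'replace'],
    ['inspection', 'complete', 'replaced', 'horizontal', 'sealing',
     'blade', 'inspection', 'complete', 'issues', 'found'],
]})

col = df['lwr_nopunc_spc_nostpwrd']
# True when the row and s1 have at least one word in common
df['col_any'] = col.apply(lambda words: not s1.isdisjoint(words))
# True when every word of s1 appears in the row
df['col_all'] = col.apply(lambda words: s1.issubset(words))

print(df[['col_any', 'col_all']])
```

The results should match the col_any and col_all columns printed above.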
Upvotes: 3