Reputation: 541
Let us say I have a dataframe:
import pandas as pd

first_df = pd.DataFrame({"company": ['abc', 'def', 'xyz', 'lmn', 'def', 'xyz'],
                         "art_type": ['300x240', '100x600', '400x600', '300x240', '100x600', '400x600'],
                         "metrics": ['imp', 'rev', 'cpm', 'imp', 'rev', 'cpm'],
                         "value": [1234, 23, 0.5, 1234, 23, 0.5]})
# Duplicate the rows; pd.concat replaces DataFrame.append, which pandas 2.0 removed
first_df = pd.concat([first_df, first_df])
I want to remove all the rows whose company value is in the list ['lmn','xyz'] and store those rows in another dataframe.
company_list = ['lmn', 'xyz']
I tried this:
deleted_data = first_df[first_df['company'] in company_list]
This obviously did not work, because it tests a whole column (a list-like) against a list rather than testing each value. Is a for loop the way to do this, or is there a better way?
For loop code:
deleted_data = pd.DataFrame()
for x in company_list:
    # pd.concat replaces DataFrame.append, which pandas 2.0 removed
    deleted_data = pd.concat([deleted_data, first_df[first_df['company'] == x]])
Upvotes: 3
Views: 2223
Reputation: 109510
You can filter with isin():
deleted_data = first_df.loc[first_df['company'].isin(company_list)]
>>> deleted_data
  art_type company metrics   value
2  400x600     xyz     cpm     0.5
3  300x240     lmn     imp  1234.0
5  400x600     xyz     cpm     0.5
2  400x600     xyz     cpm     0.5
3  300x240     lmn     imp  1234.0
5  400x600     xyz     cpm     0.5
retained_data = first_df.loc[~first_df['company'].isin(company_list)]
>>> retained_data
  art_type company metrics   value
0  300x240     abc     imp  1234.0
1  100x600     def     rev    23.0
4  100x600     def     rev    23.0
0  300x240     abc     imp  1234.0
1  100x600     def     rev    23.0
4  100x600     def     rev    23.0
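Both filters use the same isin() test, so the boolean mask can be built once and reused. Below is a minimal, self-contained sketch of that variant. It assumes a recent pandas, so it builds the doubled frame with pd.concat rather than DataFrame.append. The name mask is only illustrative.

```python
import pandas as pd

first_df = pd.DataFrame({
    "company": ['abc', 'def', 'xyz', 'lmn', 'def', 'xyz'],
    "art_type": ['300x240', '100x600', '400x600', '300x240', '100x600', '400x600'],
    "metrics": ['imp', 'rev', 'cpm', 'imp', 'rev', 'cpm'],
    "value": [1234, 23, 0.5, 1234, 23, 0.5],
})
# Assumption: pd.concat stands in for the question's DataFrame.append call.
first_df = pd.concat([first_df, first_df])
company_list = ['lmn', 'xyz']

# Element-wise membership test: one True/False per row.
mask = first_df['company'].isin(company_list)

deleted_data = first_df.loc[mask]     # rows whose company is in the list
retained_data = first_df.loc[~mask]   # every other row

print(len(deleted_data), len(retained_data))  # 6 6
```

Since the mask and its negation cover every row exactly once, the two results always partition first_df. This makes it easy to check that no rows were lost.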
Upvotes: 3