Reputation: 14112
I have a dataframe which looks like this:
       Label            Type
Name
ppppp  Base brute       UnweightedBase
pbaaa  Base             Base
pb4a1  Très à gauche    Category
pb4a2  A gauche pb4a2   Category
pb4a3  Au centre pb4a3  Category
pb4a4  A droite pb4a4   Category
If the "Type" column's value is "UnweightedBase" or "Base", I would like to delete that row from the data.
I can do this, but only for one value at a time, with the following code:
to_del = df[df['Type'] == "UnweightedBase"].index.tolist()
df= df.drop(to_del, axis)
return df
How do I modify my code so that I can delete more than one value at once?
My failed attempt:
to_del = df[df['Type'] in ["UnweightedBase","Base"]].index.tolist()
df= df.drop(to_del, axis)
return df
Upvotes: 0
Views: 4705
Reputation: 880717
You could select the desired rows and reassign the resulting DataFrame to df:
In [60]: df = df.loc[~df['Type'].isin(['UnweightedBase', 'Base'])]
In [61]: df
Out[61]:
    Name   Label            Type
2   pb4a1  Très à gauche    Category
3   pb4a2  A gauche pb4a2   Category
4   pb4a3  Au centre pb4a3  Category
5   pb4a4  A droite pb4a4   Category
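As a self-contained sketch of the selection above: the sample frame below is rebuilt by hand from the question's data, and the names `unwanted`, `mask`, `filtered` and `in_operator_failed` are only illustrative. It also shows why the attempt with Python's `in` operator fails: `in` tries to reduce a whole Series to a single truth value, whereas `isin` compares element by element.

```python
import pandas as pd

# Sample data rebuilt by hand from the question (its exact layout is an assumption).
df = pd.DataFrame(
    {
        "Name": ["ppppp", "pbaaa", "pb4a1", "pb4a2", "pb4a3", "pb4a4"],
        "Label": ["Base brute", "Base", "Très à gauche",
                  "A gauche pb4a2", "Au centre pb4a3", "A droite pb4a4"],
        "Type": ["UnweightedBase", "Base", "Category",
                 "Category", "Category", "Category"],
    }
)

unwanted = ["UnweightedBase", "Base"]  # illustrative name, not from the original post

# isin() compares element-wise and returns a boolean Series, one entry per row.
mask = df["Type"].isin(unwanted)

# ~ inverts the mask, so only rows whose Type is NOT in the list are kept.
filtered = df.loc[~mask]

# The plain `in` operator asks for a single True/False about the whole Series,
# which pandas refuses to give, so it raises ValueError.
try:
    df["Type"] in unwanted
    in_operator_failed = False
except ValueError:
    in_operator_failed = True
```

Here `filtered` holds only the four "Category" rows, and `df` itself is left untouched until you reassign it.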
I think this is more direct and safer than using
to_del = df[df['Type'].isin(type_val)].index.tolist()
df= df.drop(to_del, axis)
since the latter does essentially the same selection as an intermediate step:
df[df['Type'].isin(type_val)]
Moreover, index.tolist() returns index labels, and drop removes every row carrying one of those labels. If the index has non-unique values, you might delete unintended rows.
For example:
In [85]: df = pd.read_table('data', sep=r'\s{4,}', engine='python')
In [86]: df.index = ['a','b','c','d','e','a']
In [87]: df
Out[87]:
    Name   Label            Type
a   ppppp  Base brute       UnweightedBase
b   pbaaa  Base             Base
c   pb4a1  Très à gauche    Category
d   pb4a2  A gauche pb4a2   Category
e   pb4a3  Au centre pb4a3  Category
a   pb4a4  A droite pb4a4   Category    #<-- note the repeated index
In [88]: to_del = df[df['Type'].isin(['UnweightedBase', 'Base'])].index.tolist()
In [89]: to_del
Out[89]: ['a', 'b']
In [90]: df = df.drop(to_del)
In [91]: df
Out[91]:
    Name   Label            Type
c   pb4a1  Très à gauche    Category
d   pb4a2  A gauche pb4a2   Category
e   pb4a3  Au centre pb4a3  Category
#<--- OOPs, we've lost the last row, even though the Type was Category.
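For comparison, here is a minimal sketch of the boolean-mask approach wrapped in a function, since the question's code ends in `return df`. The helper name `drop_types` and the hand-built frame are assumptions for illustration. Because the mask is positional rather than label-based, the repeated index label 'a' does not cause the last row to be lost.

```python
import pandas as pd


def drop_types(df, unwanted):
    """Return a copy of df without the rows whose 'Type' is in `unwanted`.

    Hypothetical helper: the name and signature are not from the original post.
    """
    # One boolean per row; rows are selected by position, not by index label.
    keep = ~df["Type"].isin(unwanted)
    return df.loc[keep].copy()


# Same data as the example above, including the repeated index label 'a'.
df = pd.DataFrame(
    {
        "Name": ["ppppp", "pbaaa", "pb4a1", "pb4a2", "pb4a3", "pb4a4"],
        "Label": ["Base brute", "Base", "Très à gauche",
                  "A gauche pb4a2", "Au centre pb4a3", "A droite pb4a4"],
        "Type": ["UnweightedBase", "Base", "Category",
                 "Category", "Category", "Category"],
    },
    index=["a", "b", "c", "d", "e", "a"],
)

result = drop_types(df, ["UnweightedBase", "Base"])
```

With this input, `result` keeps all four "Category" rows, including the one whose index label is the second 'a'.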
Upvotes: 3