Reputation: 29
I have written code based on a tutorial (https://thispointer.com/python-pandas-how-to-drop-rows-in-dataframe-by-conditions-on-column-values/) to delete rows in a data frame based on column values. The column 'zone_type' can have one of five values (response_button_text, response_button_image, fixation, timelimit_screen or continue_button). Unless the value for the row is 'response_button_image', I want to remove the row from the data frame.
# select all of the rows that are not to do with the task i.e. fixation screens etc.
indexZoneType = df[ (df['zone_type'] == 'fixation') & (df['zone_type'] == 'response_button_text') & (df['zone_type'] == 'timelimit_screen') & (df['zone_type'] == 'continue_button')].index
# delete these rows
df.drop(indexZoneType , inplace=True)
I thought this code should work, and I get no error, but when I execute print(df), the data frame has not changed.
Thanks.
Upvotes: 0
Views: 531
Reputation: 13407
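OK, so your conditions are mutually exclusive: a row's 'zone_type' cannot equal all four values at once, so combining them with & always selects nothing and no rows get dropped. You probably wanted to use 'or' (|) there, i.e.: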
indexZoneType = df[ (df['zone_type'] == 'fixation') |(df['zone_type'] == 'response_button_text') | (df['zone_type'] == 'timelimit_screen') | (df['zone_type'] == 'continue_button')].index
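To see the difference between & and | concretely, here is a minimal sketch. The sample data and the 'value' column are made up for illustration; only the 'zone_type' column name and its values come from the question.

```python
import pandas as pd

# Hypothetical sample data; only 'zone_type' and its values come from the question.
df = pd.DataFrame({
    "zone_type": ["fixation", "response_button_image", "continue_button",
                  "response_button_text", "timelimit_screen"],
    "value": [1, 2, 3, 4, 5],
})

# AND: a single cell cannot equal two different strings, so no row matches.
and_mask = (df["zone_type"] == "fixation") & (df["zone_type"] == "continue_button")

# OR: a row matches if it equals either of the two strings.
or_mask = (df["zone_type"] == "fixation") | (df["zone_type"] == "continue_button")

print(and_mask.sum())  # count of rows selected with &
print(or_mask.sum())   # count of rows selected with |
```

With & the mask is all False, so the resulting index is empty and df.drop has nothing to remove, which is why no error is raised.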
Or even better, using isin(...) on the column together with df.loc[...]:
indexZoneType = df.loc[df['zone_type'].isin(['fixation', 'response_button_text', 'timelimit_screen', 'continue_button'])].index
Or even simpler:
indexZoneType = df.index[df['zone_type'].isin(['fixation', 'response_button_text', 'timelimit_screen', 'continue_button'])]
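Putting it together, a rough end-to-end sketch might look like the following. The sample rows and the 'value' column are invented for illustration. The last step is an alternative I am suggesting rather than something from the tutorial: since you only want to keep 'response_button_image', you could select those rows directly instead of listing everything to drop.

```python
import pandas as pd

# Hypothetical sample data; only 'zone_type' and its values come from the question.
df = pd.DataFrame({
    "zone_type": ["fixation", "response_button_image", "continue_button",
                  "response_button_text", "timelimit_screen",
                  "response_button_image"],
    "value": [1, 2, 3, 4, 5, 6],
})

unwanted = ["fixation", "response_button_text",
            "timelimit_screen", "continue_button"]

# Index labels of the rows whose zone_type is one of the unwanted values.
index_zone_type = df.index[df["zone_type"].isin(unwanted)]

# Drop those rows; this returns a new frame and leaves df unchanged.
dropped = df.drop(index_zone_type)

# Alternative: keep only the rows you actually want.
kept = df[df["zone_type"] == "response_button_image"]

print(dropped)
print(kept)
```

Both approaches should leave the same rows here. The direct selection has the advantage that it still works if a sixth zone type is added later.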
Reference: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isin.html https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html
Upvotes: 1