Selection of rows by condition

Question

I have a pandas data frame:

df_total_data2

which has the following columns:

df_total_data.columns

Index([u'BBBlink', u'Name', u'_type'], dtype='object')

I want to drop all the rows that don't satisfy a given condition. In this case, the condition is that the BBBlink column can't contain the word secure. I want to drop the rows in place, not have a function that returns None when the condition isn't met.

So I wrote this expression:

df_total_data.apply(lambda x: 'secure' not in x['BBBlink'], axis=1).values

It returns a boolean array, but I don't know how to use it to drop the rows.

Edit:

I got an array:

array([ True,  True,  True,  True,  True,False....True])

Now, how can I use this array to drop the rows?

DeepSpace · Accepted Answer

Once you have a boolean array, you can select only the rows where it is True with df[boolean_array], or only the rows where it is False by negating it with ~: df[~boolean_array].
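To make the two forms concrete, here is a minimal sketch. The sample rows are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names are from the question.
df_total_data = pd.DataFrame({
    'BBBlink': ['http://example.com/a',
                'https://secure.example.com/b',
                'http://example.com/c'],
    'Name': ['A', 'B', 'C'],
    '_type': ['x', 'y', 'z'],
})

# Boolean array: True where 'secure' does NOT appear in BBBlink.
mask = df_total_data.apply(lambda x: 'secure' not in x['BBBlink'], axis=1).values

kept = df_total_data[mask]       # rows where the mask is True
dropped = df_total_data[~mask]   # rows where the mask is False
```

Here kept holds the rows without 'secure' in BBBlink, and dropped holds the rest.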

As for your question, you can either use the drop method or do the filtering yourself with boolean indexing:

df_total_data[df_total_data.apply(lambda x: 'secure' not in x['BBBlink'], axis=1).values]

Just remember that this does not modify the DataFrame in place, so you need to either assign the returned value to a new DataFrame or reassign it to the existing one.
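A short sketch of what "not in place" means here, again with made-up sample rows. It also shows one possible way to get the same result through the drop method, by passing it the index labels of the rows that fail the condition.

```python
import pandas as pd

# Hypothetical sample data; only the column names are from the question.
df_total_data = pd.DataFrame({
    'BBBlink': ['http://example.com/a',
                'https://secure.example.com/b',
                'http://example.com/c'],
    'Name': ['A', 'B', 'C'],
    '_type': ['x', 'y', 'z'],
})

mask = df_total_data['BBBlink'].apply(lambda x: 'secure' not in x)

# Boolean indexing returns a new DataFrame; the original is untouched.
filtered = df_total_data[mask]
n_before = len(df_total_data)  # the original still has every row

# Same result via drop: pass the labels of the rows that fail the condition.
via_drop = df_total_data.drop(df_total_data[~mask].index)

# Reassign to the existing name to "drop" the rows for good.
df_total_data = df_total_data[mask]
```

Both filtered and via_drop contain the same rows; which form to use is mostly a matter of taste.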

By the way, you can simplify your condition a bit:

df_total_data[df_total_data['BBBlink'].apply(lambda x: 'secure' not in x)]
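As an alternative that is not part of the original answer, the same filter can probably be written without a lambda using the vectorized Series.str.contains. The sketch below uses made-up rows, including one missing link, and assumes that rows with a missing BBBlink should be kept.

```python
import pandas as pd

# Hypothetical sample data; only the column names are from the question.
df_total_data = pd.DataFrame({
    'BBBlink': ['http://example.com/a',
                'https://secure.example.com/b',
                None],
    'Name': ['A', 'B', 'C'],
    '_type': ['x', 'y', 'z'],
})

# regex=False treats 'secure' as a plain substring.
# na=False makes missing values count as "does not contain".
has_secure = df_total_data['BBBlink'].str.contains('secure', regex=False, na=False)

# Keep only the rows whose link does not contain 'secure'.
result = df_total_data[~has_secure]
```

Unlike the lambda version, which would raise a TypeError on a missing value, this form lets na= decide how such rows are treated.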
