Selection of rows by condition

I have a pandas data frame:

df_total_data

which has the following columns:

df_total_data.columns

Index([u'BBBlink', u'Name', u'_type'], dtype='object')

I want to drop all the rows that don't satisfy a given condition. In this case, the condition is that the BBBlink column must not contain the word secure. I want to drop the rows in place, rather than have a function that returns None when the condition isn't met.

So I wrote this expression:

df_total_data.apply(lambda x: 'secure' not in x['BBBlink'], axis=1).values

This returns a boolean array, but I don't know how to use it to drop the rows.

Edit:

I got an array:

array([ True,  True,  True,  True,  True,False....True])

Now, how can I use this array to drop the rows?

Upvotes: 1

Views: 96

Answers (2)

jezrael

Reputation: 863741

If I understand correctly, you can use isin, which matches cells whose value is exactly 'secure':

print(df_total_data)
  BBBlink  Name _type
0  secure  name     A
1  secure  name     A
2  secure  name     A
3  secure  name     A
4  secure  name     A
5     sre  name     A

print(df_total_data.BBBlink.isin(['secure']))
0     True
1     True
2     True
3     True
4     True
5    False
Name: BBBlink, dtype: bool

print(df_total_data[df_total_data.BBBlink.isin(['secure'])])
  BBBlink  Name _type
0  secure  name     A
1  secure  name     A
2  secure  name     A
3  secure  name     A
4  secure  name     A

print(df_total_data[~df_total_data.BBBlink.isin(['secure'])])
  BBBlink  Name _type
5     sre  name     A

But if the word appears alongside other text in the cell, you can use str.contains:

print(df_total_data)
        BBBlink  Name _type
0     secure qq  name     A
1        secure  name     A
2        secure  name     A
3        secure  name     A
4  secure aa ss  name     A
5           sre  name     A

print(df_total_data[~df_total_data.BBBlink.str.contains('secure')])
  BBBlink  Name _type
5     sre  name     A
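
If the column may hold missing values, or the search text could contain regex metacharacters, a slightly more defensive variant might look like the sketch below. The sample frame (including the None entry) and the names has_secure and filtered are made up for illustration; na and regex are standard str.contains parameters.

```python
import pandas as pd

# Hypothetical sample data; the None entry stands in for a missing link.
df_total_data = pd.DataFrame({
    'BBBlink': ['secure qq', 'secure', None, 'sre'],
    'Name': ['name'] * 4,
    '_type': ['A'] * 4,
})

# regex=False treats 'secure' as a literal substring;
# na=False makes missing values count as "does not contain".
has_secure = df_total_data['BBBlink'].str.contains('secure', regex=False, na=False)

# Keep only the rows without 'secure'; the original frame is left unchanged.
filtered = df_total_data[~has_secure]
print(filtered)
```

Note that with na=False the row with the missing link is kept, since it does not contain the word.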

Upvotes: 1

DeepSpace

Reputation: 81684

Once you have a boolean array, you can select only the rows where it is True with df[boolean_array], or only the rows where it is False by negating it with ~: df[~boolean_array].
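
As a minimal sketch of that indexing, assuming a small made-up frame and a hand-written NumPy mask (df, boolean_array, kept and dropped are illustrative names, not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical frame and mask, for illustration only.
df = pd.DataFrame({
    'BBBlink': ['secure', 'sre', 'secure x'],
    'Name': ['a', 'b', 'c'],
})
boolean_array = np.array([False, True, False])

kept = df[boolean_array]      # rows where the mask is True
dropped = df[~boolean_array]  # rows where the mask is False

print(kept)
print(dropped)
```

The mask must have one entry per row of the frame.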

As for your question, you can either use the drop method or do it yourself:

df_total_data[df_total_data.apply(lambda x: 'secure' not in x['BBBlink'], axis=1).values]

Just remember that this does not modify the dataframe in place, so you need to either assign the returned value to a new dataframe or re-assign it to the existing one.
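
Both options can be sketched roughly as follows. The sample data and the names keep and reassigned are assumptions for illustration; DataFrame.drop with inplace=True is the part that actually mutates the existing frame.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df_total_data = pd.DataFrame({
    'BBBlink': ['secure', 'sre', 'secure aa'],
    'Name': ['name'] * 3,
    '_type': ['A'] * 3,
})

# True for the rows we want to keep.
keep = df_total_data['BBBlink'].apply(lambda x: 'secure' not in x)

# Option 1: boolean indexing returns a new frame; assign it to a name
# (this could equally be df_total_data itself).
reassigned = df_total_data[keep]

# Option 2: drop the index labels of the unwanted rows.
# inplace=True modifies df_total_data directly and returns None.
df_total_data.drop(df_total_data[~keep].index, inplace=True)

print(df_total_data)
```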

By the way, you can simplify your condition a bit:

df_total_data[df_total_data['BBBlink'].apply(lambda x: 'secure' not in x)]

Upvotes: 1
