arsenalist
arsenalist

Reputation: 391

How to filter by boolean?

I have a series of booleans extracted and I would like to filter a dataframe from this in Pandas but it is returning no results.

Dataframe

  Account   mphone  rphone  BPHONE
0 14999201  3931812 8014059 9992222
1 12980801  4444444 3932929 4279999
2 9999999   3279999 4419999 3938888

Here are the series:

df['BPHONE'].str.startswith(tuple(combined_list))

0    False
1     True
2    False
Name: BPHONE, dtype: bool

df['rphone'].str.startswith(tuple(combined_list))

0     True
1    False
2     True
Name: rphone, dtype: bool

Now below, when I try to filter this, I am getting empty results:

df[(df['BPHONE'].str.startswith(tuple(combined_list))) & (df['rphone'].str.startswith(tuple(combined_list)))]

    Account mphone  rphone  BPHONE

Lastly, when I just use one column, it seems to match and filter by row and not column. Any advice here on how to correct this?

df[(df['BPHONE'].str.startswith(tuple(combined_list)))]

  Account   mphone  rphone  BPHONE
1 12980801  4444444 3932929 4279999

I thought that this should just populate BPHONE along the column axis and not the row axis. How would I be able to filter this way?

The output wanted is the following:

Account  mphone rphone   BPHONE
14999201 3931812 8014059 np.nan
12980801 4444444 np.nan  4279999
99999999 3279999 4419999 np.nan

To explain this, rphone shows True, False, True, so only the True numbers should show. Under False it should not show, or show as NAN.

Upvotes: 0

Views: 194

Answers (2)

Vishnudev Krishnadas
Vishnudev Krishnadas

Reputation: 10970

The output you are expecting is not a filtered result, but conditionally replaced result.

condition = df['BPHONE'].str.startswith(tuple(combined_list))

Use np.where

df['BPHONE'] = pd.np.where(condition, df['BPHONE'], pd.np.nan)

OR

df.loc[~condition, 'BPHONE'] = pd.np.nan

Upvotes: 1

Celius Stingher
Celius Stingher

Reputation: 18377

All filters you are applying are functioning correctly:

df['BPHONE'].str.startswith(tuple(combined_list))   
0    False
1     True #Only this row will be retained 
2    False

The combined filter will return:

df[(df['BPHONE'].str.startswith(tuple(combined_list))) & (df['rphone'].str.startswith(tuple(combined_list)))]

    First filter Second filter Combined filter
0          False          True           False #Not retained
1           True         False           False #Not retained
2          False          True           False #Not retained

Upvotes: 1

Related Questions