Reputation:

Delete columns but keep specific values pandas df

I'm sure this is in SO somewhere but I can't seem to find it. I'm trying to remove or select designated columns in a pandas df. But I want to keep certain values or strings from those deleted columns.

For the df below I want to keep 'Big','Cat' in Col B,C but delete everything else.

import pandas as pd

d = ({
    'A' : ['A','Keep','A','Value'],           
    'B' : ['Big','X','Big','Y'],
    'C' : ['Cat','X','Cat','Y'],
    })

df = pd.DataFrame(data=d)

If I do either the following it only selects that row.

Big = df[df['B'] == 'Big']
Cat = df[df['C'] == 'Cat']

My intended output is:

       A    B    C
0      A  Big  Cat
1   Keep          
2      A  Big  Cat
3  Value

I need something like x = df[df['B','C'] != 'Big','Cat']

Upvotes: 2

Answers (5)

Dlamini

Reputation: 285

Perhaps a concise version:

df.loc[df['B'] != 'Big', 'B'] = ''
df.loc[df['C'] != 'Cat', 'C'] = ''
print(df)

Output:

       A    B    C
0      A  Big  Cat
1   Keep          
2      A  Big  Cat
3  Value

Upvotes: 0

U13-Forward

Reputation: 71610

Or this:

df[['B','C']]=df[['B','C']].apply(lambda row: row if row.tolist()==['Big','Cat'] else ['',''],axis=1)
print(df)

Output:

       A    B    C
0      A  Big  Cat
1   Keep          
2      A  Big  Cat
3  Value

Upvotes: 1

jpp

Reputation: 164843

You can filter on column combinations via NumPy and np.ndarray.all:

mask = (df[['B', 'C']].values != ['Big', 'Cat']).all(1)

df.loc[mask, ['B', 'C']] = ''

print(df)

       A    B    C
0      A  Big  Cat
1   Keep          
2      A  Big  Cat
3  Value

Upvotes: 1

BENY

Reputation: 323396

IIUC

Update

df[['B','C']]=df[['B','C']][df[['B','C']].isin(['Big','Cat'])].fillna('')
df
Out[30]: 
       A    B    C
0      A  Big  Cat
1   Keep          
2      A  Big  Cat
3  Value

Upvotes: 2

rafaelc

Reputation: 59304

Seems like you want to keep only some values and have empty string on ohters

Use np.where

keeps = ['Big', 'Cat']
df['B'] = np.where(df.B.isin(keeps), df.B, '')
df['C'] = np.where(df.C.isin(keeps), df.C, '')


    A     B     C
0   A     Big   Cat
1   Keep        
2   A     Big   Cat
3   Value

Another solution using df.where

cols = ['B', 'C']
df[cols] = df[cols].where(df.isin(keeps)).fillna('')

    A     B     C
0   A     Big   Cat
1   Keep        
2   A     Big   Cat
3   Value

Upvotes: 2

Delete columns but keep specific values pandas df

Answers (5)

Related Questions