sgerbhctim

Reputation: 3640

How to delete all columns in DataFrame except certain ones?

Let's say I have a DataFrame that looks like this:

a  b  c  d  e  f  g  
1  2  3  4  5  6  7
4  3  7  1  6  9  4
8  9  0  2  4  2  1

How would I go about deleting every column besides a and b?

This would result in:

a  b
1  2
4  3
8  9

I would like a simple line of code that says "delete all columns besides a and b", because hypothetically I could have 1000 columns of data.

Thank you.

Upvotes: 172

Views: 230553

Answers (6)

Blowsh1t

Reputation: 127

What you are looking for is:

df = df[["a","b"]]

This returns a DataFrame that contains only the columns a and b.
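
As a small sketch of this approach, here is the selection applied to the frame from the question. The DataFrame construction is an assumption, since the question only shows the printed table.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6, 7],
     [4, 3, 7, 1, 6, 9, 4],
     [8, 9, 0, 2, 4, 2, 1]],
    columns=list("abcdefg"),
)

# Keep only columns a and b; the inner list holds the labels to keep.
df = df[["a", "b"]]
print(df)
```

Note that every label in the list has to exist in the frame, otherwise pandas raises a KeyError.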

Upvotes: 8

GollyJer

Reputation: 26782

Another option to add to the mix. I prefer this approach for readability.

df = df.filter(['a', 'b'])

Here the first positional argument is items, the list of column labels to keep.


Bonus

You can also filter with the like or regex arguments.
This is helpful if you have a set of columns like ['a_1','a_2','b_1','b_2']

You can do

df = df.filter(like='b_')

and end up with ['b_1','b_2']

Pandas documentation for filter.
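
To make the three modes concrete, here is a minimal sketch using the column names from the example above. The one-row frame and the variable names are just assumptions for illustration.

```python
import pandas as pd

# Hypothetical one-row frame with the column names mentioned above.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["a_1", "a_2", "b_1", "b_2"])

# items: keep exactly these labels (labels not in the frame are ignored).
by_items = df.filter(items=["a_1", "b_1", "not_there"])

# like: keep labels that contain the given substring.
by_like = df.filter(like="b_")

# regex: keep labels where re.search finds a match.
by_regex = df.filter(regex=r"_1$")

print(list(by_items.columns))
print(list(by_like.columns))
print(list(by_regex.columns))
```

Only one of items, like and regex can be passed per call.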

Upvotes: 115

BENY

Reputation: 323366

There are multiple solutions.

df = df[['a','b']] #1

df = df[list('ab')] #2

df = df.loc[:,df.columns.isin(['a','b'])] #3

df = pd.DataFrame(data=df.eval('a,b').T,columns=['a','b']) #4 PS: I do not recommend this method, but it is still a way to achieve this
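
A quick sketch comparing options #1 to #3 on a small made-up frame (the data is an assumption). They give the same result here, but they are not identical in general: list('ab') only works for single-character names, and the isin mask keeps columns in the frame's own order rather than the order of the list.

```python
import pandas as pd

# Small hypothetical frame for the comparison.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "c"])

v1 = df[["a", "b"]]                            # 1: explicit list of labels
v2 = df[list("ab")]                            # 2: list("ab") == ["a", "b"]
v3 = df.loc[:, df.columns.isin(["a", "b"])]    # 3: boolean mask over columns

same = v1.equals(v2) and v1.equals(v3)

# Label lists reorder the columns; a boolean mask cannot.
order_by_list = list(df[["b", "a"]].columns)
order_by_mask = list(df.loc[:, df.columns.isin(["b", "a"])].columns)

print(same, order_by_list, order_by_mask)
```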

Upvotes: 63

Taie

Reputation: 1189

If you have many columns to drop, say 20 or 30, you can list the columns you want to keep and drop everything else. Make sure that you also specify the axis value.

keep_list = ["a","b"]
df = df.drop(df.columns.difference(keep_list), axis=1)
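
For the 1000-column case from the question, a sketch along these lines should work. The column names col_0 ... col_999 are made up for the example.

```python
import pandas as pd

# Hypothetical wide frame: one row, 1000 columns named col_0 ... col_999.
names = [f"col_{i}" for i in range(1000)]
df = pd.DataFrame([list(range(1000))], columns=names)

keep = ["col_0", "col_1"]

# Index.difference returns every column label that is NOT in keep.
to_drop = df.columns.difference(keep)

# columns= is equivalent to passing the labels with axis=1.
df = df.drop(columns=to_drop)
print(df)
```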

Upvotes: 2

Isaac Taylor

Reputation: 41

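If you want to keep more columns than you are dropping, put a "~" before the .isin call to select every column except the ones listed. Note that this is the inverse of what the question asks: the line below removes a and b and keeps everything else.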

df = df.loc[:, ~df.columns.isin(['a','b'])]
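
A short sketch of what the negated mask does on the question's column layout (the single data row is only there to make the frame non-empty):

```python
import pandas as pd

# One row from the question's sample table.
df = pd.DataFrame([[1, 2, 3, 4, 5, 6, 7]], columns=list("abcdefg"))

# isin marks a and b as True; ~ flips the mask so they are the ones excluded.
mask = ~df.columns.isin(["a", "b"])
rest = df.loc[:, mask]

print(list(rest.columns))
```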

Upvotes: 3

MaxU - stand with Ukraine

Reputation: 210972

In [48]: df.drop(df.columns.difference(['a','b']), axis=1, inplace=True)

In [49]: df
Out[49]:
   a  b
0  1  2
1  4  3
2  8  9

or:

In [55]: df = df.loc[:, df.columns.intersection(['a','b'])]

In [56]: df
Out[56]:
   a  b
0  1  2
1  4  3
2  8  9

PS: Please be aware that the most idiomatic pandas way to do this was already proposed by @Wen:

df = df[['a','b']]

or

df = df.loc[:, ['a','b']]
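
One practical difference between the intersection variant and plain label selection is how they treat labels that are not in the frame. A minimal sketch, where the frame and the missing label z are assumptions for illustration:

```python
import pandas as pd

# Hypothetical frame; "z" is deliberately not one of its columns.
df = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "c"])
wanted = ["a", "b", "z"]

# intersection silently ignores labels that are missing from the frame.
safe = df.loc[:, df.columns.intersection(wanted)]

# Plain label selection raises KeyError when any label is missing.
raised = False
try:
    df[wanted]
except KeyError:
    raised = True

print(list(safe.columns), raised)
```

Which behaviour is preferable depends on whether a missing column should be treated as an error in your data.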

Upvotes: 161
