Alex Man

Reputation: 477

Filter columns with number of unique values in a pandas dataframe

I have a very large dataframe with over 2000 columns. I want to count the number of unique values in each column and drop the columns whose count falls below a certain threshold. Here is an example:

import pandas as pd
df = pd.DataFrame({'A': ('a', 'b', 'c', 'd', 'e', 'a', 'a'), 'B': (1, 1, 2, 1, 3, 3, 1)})
df.nunique()
A      5
B      3
dtype: int64

So let's say I want to filter out column B, which has fewer than 5 unique values, and return a dataframe without it.

Thanks.

Upvotes: 1

Views: 2816

Answers (2)

user1503

Reputation: 70

Others may have a more Pythonic way, but this should work: count the unique values per column, then select the columns whose count meets the threshold.

x = df.nunique()
df[list(x[x>=5].index)]
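
For anyone who wants to run this end to end, here is a minimal sketch of the same approach on the sample data from the question. The names `min_unique`, `counts`, `keep`, and `result` are my own and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'A': ('a', 'b', 'c', 'd', 'e', 'a', 'a'),
                   'B': (1, 1, 2, 1, 3, 3, 1)})

min_unique = 5  # threshold taken from the question; adjust as needed

counts = df.nunique()                      # unique-value count per column
keep = counts[counts >= min_unique].index  # labels of columns that meet the threshold
result = df[keep]                          # select only those columns

print(list(result.columns))
```

Assigning to `result` instead of overwriting `df` keeps the original frame around, which can help when trying several thresholds.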

Upvotes: 3

BENY

Reputation: 323226

Pass the boolean mask from nunique to .loc to keep only the columns with at least 5 unique values:

df = df.loc[:, df.nunique() >= 5]
df
   A
0  a
1  b
2  c
3  d
4  e
5  a
6  a
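
One thing that may matter with 2000+ columns is missing data: by default `nunique` ignores NaN, so a column can land on either side of the threshold depending on `dropna`. Below is a rough sketch of the `.loc` approach wrapped in a helper. The function name `filter_by_nunique` and the extra column `C` are hypothetical and only there for illustration.

```python
import numpy as np
import pandas as pd

def filter_by_nunique(frame, min_unique, dropna=True):
    """Return only the columns of `frame` with at least `min_unique` distinct values."""
    # Boolean Series indexed by column label; True means "keep this column"
    mask = frame.nunique(dropna=dropna) >= min_unique
    return frame.loc[:, mask]

# Sample data from the question plus a hypothetical column C containing NaN
df = pd.DataFrame({'A': ('a', 'b', 'c', 'd', 'e', 'a', 'a'),
                   'B': (1, 1, 2, 1, 3, 3, 1),
                   'C': (1.0, 2.0, 3.0, 4.0, np.nan, np.nan, 1.0)})

# NaN ignored: C has 4 distinct values and is dropped
print(list(filter_by_nunique(df, 5).columns))

# NaN counted as its own value: C reaches 5 and is kept
print(list(filter_by_nunique(df, 5, dropna=False).columns))
```

Whether NaN should count as a value depends on your data, so treat the `dropna` choice as something to decide rather than a default to trust.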

Upvotes: 5
