Filter columns with number of unique values in a pandas dataframe

Question

I have a very large dataframe with over 2000 columns. I am trying to count the number of unique values in each column and drop the columns whose count is below a certain threshold. Here is an example:

import pandas as pd
df = pd.DataFrame({'A': ('a', 'b', 'c', 'd', 'e', 'a', 'a'), 'B': (1, 1, 2, 1, 3, 3, 1)})
df.nunique()
A      5
B      3
dtype: int64

So let's say I want to filter out column B, which has fewer than 5 unique values, and get back a df without column B.

Thanks-

user1503 · Accepted Answer

Others may have a more Pythonic way, but this should work: count the unique values per column, then select only the columns whose count is at least 5.

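x = df.nunique()              # Series: number of distinct values per column
df[list(x[x >= 5].index)]     # keep only the columns with at least 5 distinct values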
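
The same idea can be wrapped in a small helper that selects columns with a boolean mask. This is only a sketch: the function name keep_columns_with_min_unique and its parameters are made up for illustration and are not part of pandas. Note that nunique() ignores NaN by default (dropna=True), so pass dropna=False if missing values should count as a distinct value.

```python
import pandas as pd


def keep_columns_with_min_unique(df, min_unique, dropna=True):
    """Return df restricted to columns with at least `min_unique` distinct values.

    Hypothetical helper for illustration; not a pandas API.
    """
    # One count per column, indexed by column name
    counts = df.nunique(dropna=dropna)
    # The boolean Series is aligned on column labels, so it works as a column mask
    return df.loc[:, counts >= min_unique]


df = pd.DataFrame({'A': ('a', 'b', 'c', 'd', 'e', 'a', 'a'),
                   'B': (1, 1, 2, 1, 3, 3, 1)})

filtered = keep_columns_with_min_unique(df, 5)
print(filtered.columns.tolist())  # ['A']
```

With 2000+ columns this still calls nunique() only once, and the threshold is easy to change without touching the selection logic.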
