Jana

Reputation: 49

Pandas: select column with most unique values

I have a pandas DataFrame and want to select the column with the most unique values. I already count the unique values per column with nunique(). How can I now choose the column with the highest nunique()?

This is my code so far:

numeric_columns = df.select_dtypes(include=[int, float])
unique = []
for column in numeric_columns:
    unique.append(numeric_columns[column].nunique())

Later I need to filter all the columns of my DataFrame based on this column (the one with the most unique values).

Upvotes: 1

Views: 798

Answers (1)

jezrael

Reputation: 862661

Use DataFrame.select_dtypes with np.number to keep only the numeric columns, then count the unique values per column with DataFrame.nunique and get the name of the column with the highest count with Series.idxmax:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a':[1,2,3,4],'b':[1,2,2,2], 'c':list('abcd')})
print (df)
   a  b  c
0  1  1  a
1  2  2  b
2  3  2  c
3  4  2  d

numeric = df.select_dtypes(include = np.number)

nu = numeric.nunique().idxmax()
print (nu)
a
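
Since you want to filter on that column afterwards, here is a minimal sketch of how the returned label could be used. The names counts, best, best_column and filtered are only illustrative, and the filter condition (values above the column mean) is an assumption, because the question does not say what the filter should be. Note that idxmax returns the first label when several columns tie for the maximum.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [1, 2, 2, 2], "c": list("abcd")})

# Count distinct values in each numeric column.
counts = df.select_dtypes(include=np.number).nunique()

# Label of the column with the highest count (first one wins on ties).
best = counts.idxmax()

# Use the label to pull the column out of the original DataFrame.
best_column = df[best]

# Assumed example filter: keep rows where that column is above its mean.
filtered = df[best_column > best_column.mean()]

print(best)
print(filtered)
```

Replace the boolean condition with whatever rule you actually need; the point is only that best is a plain column label, so df[best] works like any other column selection.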

Upvotes: 3
