shantanuo

Reputation: 32306

Find unique values across all columns

I can check the number of unique values for a given column.

len(df.createdby.unique())

But is there a method to find the unique values across all columns? I can run these two loops and get the results I need, but I am looking for a Pythonic and elegant way of achieving this.

# Number of unique values in each column
for i in df.columns:
    print(len(df[i].unique()))

# Column names, in the same order
for i in df.columns:
    print(i)

Upvotes: 3

Views: 2272

Answers (3)

Zero

Reputation: 76917

From pandas 0.20.0 onwards, use df.nunique():

In [234]: df = pd.DataFrame({'A': [1, 2, 3], 'B': [1, 1, 1]})

In [235]: df.nunique()
Out[235]:
A    3
B    1
dtype: int64
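
As a small sketch of the same call on data with missing values (the frame below is made up for illustration and is not from the question): nunique skips NaN by default, and passing dropna=False counts NaN as a value of its own.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame with one missing value in column A.
df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [1, 1, 1]})

# Default behaviour: NaN is ignored when counting distinct values.
counts = df.nunique()

# dropna=False treats NaN as one more distinct value.
counts_with_nan = df.nunique(dropna=False)

print(counts)
print(counts_with_nan)
```

Which of the two you want probably depends on whether a missing entry should count as a category in your data.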

Upvotes: 3

jezrael

Reputation: 862511

I think you need Series.nunique. In pandas versions before 0.20.0 it is not implemented for DataFrame, so you need apply:

print (df.apply(lambda x: x.nunique()))

Sample:

df = pd.DataFrame({'A':[1,1,3],
                   'B':[4,5,6],
                   'C':[7,7,7]})

print (df)
   A  B  C
0  1  4  7
1  1  5  7
2  3  6  7

print (df.apply(lambda x: x.nunique()))
A    2
B    3
C    1
dtype: int64
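
If you want the unique values themselves rather than their counts, one possible sketch is a dict comprehension over the columns. The name unique_values is just an illustrative choice; the frame is the sample from above.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 3],
                   'B': [4, 5, 6],
                   'C': [7, 7, 7]})

# Map each column name to a list of its distinct values,
# in order of first appearance.
unique_values = {col: df[col].unique().tolist() for col in df.columns}

print(unique_values)
# {'A': [1, 3], 'B': [4, 5, 6], 'C': [7]}
```

A dict is used here because the columns can have different numbers of distinct values, so the result does not fit a rectangular DataFrame.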

Upvotes: 2

Ted Petrou

Reputation: 61947

Use the drop_duplicates method to count the distinct rows:

len(df.drop_duplicates())
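
A minimal sketch, on a made-up frame, of how this row-level count compares with the per-column counts from df.nunique(). The two answer different questions, so the numbers need not match.

```python
import pandas as pd

# Hypothetical sample frame for illustration.
df = pd.DataFrame({'A': [1, 1, 3],
                   'B': [4, 5, 6],
                   'C': [7, 7, 7]})

# Number of distinct rows (all columns considered together).
unique_rows = len(df.drop_duplicates())

# Number of distinct values in each column separately.
per_column = df.nunique()

print(unique_rows)
print(per_column)
```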

Upvotes: 0
