Reputation: 47
I am trying to find the frequency of unique values in a column of a pandas DataFrame. I know how to get the unique values like this:
data_file.visiting_states.unique()
This returns:
array(['CA', 'VA', 'MT', nan, 'CO', 'CT'], dtype=object)
I want to return the count of those unique values, and I know I can't use .value_counts() because the result is a NumPy array.
Upvotes: 1
Views: 1493
Reputation: 863501
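You can use nunique, which returns the number of distinct values in the column (NaN is excluded unless you pass dropna=False):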
import numpy as np
import pandas as pd
data_file = pd.DataFrame({'visiting_states': ['CA', 'VA', 'MT', np.nan, 'CO', 'CT',
                                              'CA', 'VA', 'MT', np.nan, 'CO', 'CT']})
print (data_file)
   visiting_states
0               CA
1               VA
2               MT
3              NaN
4               CO
5               CT
6               CA
7               VA
8               MT
9              NaN
10              CO
11              CT
print (data_file.visiting_states.nunique())
5
print (data_file.visiting_states.nunique(dropna=False))
6
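If you already have the array returned by unique(), its length is the number of unique values, with NaN counted once:
arr = np.array(['CA', 'VA', 'MT', np.nan, 'CO', 'CT'], dtype=object)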
print (arr)
['CA' 'VA' 'MT' nan 'CO' 'CT']
print (len(arr))
6
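If what you are after is how often each value occurs rather than how many distinct values there are, a minimal sketch along these lines may help. It assumes the same sample data as above; the names counts and arr_counts are only illustrative. value_counts is a Series method, so it can be called on the column directly, or on the unique() array after wrapping it in a Series.

```python
import numpy as np
import pandas as pd

# Same sample data as above: every state (and NaN) appears twice.
data_file = pd.DataFrame({'visiting_states': ['CA', 'VA', 'MT', np.nan, 'CO', 'CT',
                                              'CA', 'VA', 'MT', np.nan, 'CO', 'CT']})

# Per-value frequencies taken straight from the column.
# dropna=False keeps NaN as its own entry; the default drops it.
counts = data_file['visiting_states'].value_counts(dropna=False)
print(counts)

# unique() returns a NumPy array, which has no value_counts method.
# Wrapping it in a Series makes the method available again.
arr = data_file['visiting_states'].unique()
arr_counts = pd.Series(arr).value_counts(dropna=False)
print(arr_counts)
```

Note that counting on the unique() array gives 1 for every entry, since duplicates are already removed there; to get real frequencies, call value_counts on the original column.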
Upvotes: 1