s900n

Reputation: 3375

Python Pandas: Get dataframe.value_counts() result as list

I have a DataFrame and I want to get both the group names and the corresponding group counts as a list or NumPy array. However, when I convert the output to a matrix I get only the group counts, not the names, as in the example below:

  df = pd.DataFrame({'a':[0.5, 0.4, 5 , 0.4, 0.5, 0.6 ]})
  b = df['a'].value_counts()
  print(b)

Output:

0.4    2
0.5    2
0.6    1
5.0    1
Name: a, dtype: int64

What I tried is print([b.as_matrix()]). Output:

[array([2, 2, 1, 1])]

In this case I lose the corresponding group names, which I also need. Thank you.

Upvotes: 10

Views: 22002

Answers (3)

Krissh

Reputation: 357

The simplest way (note that this returns only the counts, not the group names):

list(df['a'].value_counts())
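To get the group names alongside the counts, a minimal sketch reusing the question's data could look like the following. The variable names counts, names and values are only illustrative. The names live in the index of the value_counts() result, so both pieces can be converted separately:

```python
import pandas as pd

df = pd.DataFrame({'a': [0.5, 0.4, 5, 0.4, 0.5, 0.6]})
counts = df['a'].value_counts()

names = counts.index.tolist()   # group names (the unique values of column 'a')
values = counts.tolist()        # group counts, in the same order as names

print(names)
print(values)
```

The two lists are aligned by position, so names[i] occurs values[i] times.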

Upvotes: 4

Arya McCarthy

Reputation: 8829

Convert it to a dict:

bd = dict(b)
print(bd)
# {0.40000000000000002: 2, 0.5: 2, 0.59999999999999998: 1, 5.0: 1}

Don't worry about the long decimals. They're just a result of floating point representation; you still get what you expect from the dict.

bd[0.4]
# 2
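As a possible variation on the same idea, the Series methods to_dict() and items() give the mapping or a list of (name, count) pairs directly. This is a sketch on the question's data, and pairs is just an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({'a': [0.5, 0.4, 5, 0.4, 0.5, 0.6]})
b = df['a'].value_counts()

bd = b.to_dict()          # mapping of group name to count
pairs = list(b.items())   # list of (group name, count) tuples

print(bd[0.4])
print(pairs)
```

The dict is convenient for lookups by group name, while the list of pairs keeps the ordering of the value_counts() result.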

Upvotes: 11

Divakar

Reputation: 221554

One approach with np.unique -

np.c_[np.unique(df.a, return_counts=1)]

Sample run -

In [270]: df
Out[270]: 
     a
0  0.5
1  0.4
2  5.0
3  0.4
4  0.5
5  0.6

In [271]: np.c_[np.unique(df.a, return_counts=1)]
Out[271]: 
array([[ 0.4,  2. ],
       [ 0.5,  2. ],
       [ 0.6,  1. ],
       [ 5. ,  1. ]])

We can zip the outputs from np.unique for list output (on Python 3, wrap the call in list(), since zip returns an iterator there) -

In [283]: zip(*np.unique(df.a, return_counts=1))
Out[283]: [(0.40000000000000002, 2), (0.5, 2), (0.59999999999999998, 1), (5.0, 1)]

Or use zip directly on the value_counts() output (again wrapped in list() on Python 3) -

In [338]: b = df['a'].value_counts()

In [339]: zip(b.index, b.values)
Out[339]: [(0.40000000000000002, 2), (0.5, 2), (0.59999999999999998, 1), (5.0, 1)]
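The session above appears to come from Python 2, where zip returns a list. A rough Python 3 equivalent of the np.unique approach, assuming the same sample data, might look like this (uniques, counts, pairs and table are illustrative names):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [0.5, 0.4, 5, 0.4, 0.5, 0.6]})

# np.unique returns the sorted unique values and, on request, their counts
uniques, counts = np.unique(df['a'].to_numpy(), return_counts=True)

# List of (name, count) tuples; tolist() converts NumPy scalars to Python ones
pairs = list(zip(uniques.tolist(), counts.tolist()))

# Two-column array; the counts are upcast to float to match the names
table = np.column_stack((uniques, counts))

print(pairs)
print(table)
```

Unlike value_counts(), which orders groups by descending count, np.unique orders them by value.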

Upvotes: 2
