Romain Jouin

Reputation: 4848

Pandas - how to slice value_counts?

I would like to slice the result of a pandas value_counts():

>sur_perimetre[col].value_counts()
44341006.0    610
14231009.0    441
12131001.0    382
12222009.0    364
12142001.0    354

But I get an error:

> sur_perimetre[col].value_counts()[:5]
KeyError: 5.0

The same happens with ix:

> sur_perimetre[col].value_counts().ix[:5]
KeyError: 5.0

How would you deal with that?

EDIT

Maybe:

pd.DataFrame(sur_perimetre[col].value_counts()).reset_index()[:5]

Upvotes: 1

Views: 4059

Answers (1)

Gurupad Hegde

Reputation: 2155

Method 1:

Note that value_counts() returns a Series object. You can process it like any other Series to get the values, and you can even construct a new DataFrame out of it.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame([1,2,3,3,4,5], columns=['C1'])

In [3]: vc = df.C1.value_counts()

In [4]: type(vc)
Out[4]: pandas.core.series.Series

In [5]: vc.values
Out[5]: array([2, 1, 1, 1, 1])

In [6]: vc.values[:2]
Out[6]: array([2, 1])

In [7]: vc.index.values
Out[7]: array([3, 5, 4, 2, 1])

In [8]: df2 = pd.DataFrame({'value':vc.index, 'count':vc.values})

In [9]: df2
Out[9]: 
   count  value
0      2      3
1      1      5
2      1      4
3      1      2
4      1      1
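
Along the same lines, the DataFrame can be built and sliced in one go, much like the EDIT in the question. This is only a sketch: the column names 'value' and 'count' are my own choice, and it assumes a pandas version where rename_axis accepts a plain string.

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3, 3, 4, 5], columns=['C1'])
vc = df['C1'].value_counts()

# Name the index 'value', then move it into a column next to the counts.
# 'value' and 'count' are illustrative names, not anything pandas requires.
counts = vc.rename_axis('value').reset_index(name='count')

# The new DataFrame has a plain 0..n-1 index, so positional slicing is unambiguous.
top2 = counts.iloc[:2]
print(top2)
```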

Method 2:

I then tried to reproduce the error you mentioned. With a single-column DataFrame, though, the same notation did not raise any error for me.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame([1,2,3,3,4,5], columns=['C1'])

In [3]: df['C1'].value_counts()[:3]
Out[3]: 
3    2
5    1
4    1
Name: C1, dtype: int64

In [4]: df.C1.value_counts()[:5]
Out[4]: 
3    2
5    1
4    1
2    1
1    1
Name: C1, dtype: int64

In [5]: pd.__version__
Out[5]: u'0.17.1'
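
A possible explanation for the difference, which is a guess on my part since the question does not show the dtype: the index in your output holds floats (44341006.0, ...), whereas my test column holds integers. With a float index, pandas may treat [:5] as a label slice and look for the label 5.0 instead of taking the first five rows. Position-based selection with .iloc or .head() avoids that ambiguity. A minimal sketch with made-up data:

```python
import pandas as pd

# Hypothetical data: float-valued codes resembling the column in the question.
s = pd.Series([44341006.0] * 4 + [14231009.0] * 3 + [12131001.0] * 2 + [12222009.0])

vc = s.value_counts()  # the index of vc holds the float codes

# Both of these select by position, regardless of what the index contains.
top_iloc = vc.iloc[:2]
top_head = vc.head(2)

print(top_iloc)
```

Either form should work for the original case as well, e.g. sur_perimetre[col].value_counts().iloc[:5].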

Hope it helps!

Upvotes: 3
