Michal

Reputation: 2007

Histogram values from Pandas DataFrame

I would like to get histogram values from a DataFrame:

%matplotlib inline
import pandas as pd
import numpy as np

df = pd.DataFrame(60 * np.random.sample((100, 4)),
                  index=pd.date_range('1/1/2014', periods=100, freq='D'),
                  columns=['A', 'B', 'C', 'D'])

Using pd.cut(), I can do this for a single column, for example:

bins=np.linspace(0,60,5)
df.groupby(pd.cut(df.A,bins)).count()

Is it possible to get the histogram values for all columns in a single DataFrame? The desired output would look like this:

            A   B   C   D       
(0, 15]     21  10  1   2
(15, 30]    14  24  21  24
(30, 45]    10  0   22  30
(45, 60]    25  5   25  25

Upvotes: 2

Views: 2283

Answers (1)

Dickster

Reputation: 3009

How about this technique: essentially a list comprehension and pd.concat().

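np.random.seed(1)  # only makes the counts reproducible if set before df is created
bins = np.linspace(0, 60, 5)
# Count each column's values per bin, then join the per-column counts side by side
hist = pd.concat([df[x].groupby(pd.cut(df[x], bins)).count() for x in df.columns], axis=1)
hist.index.names = [None]
print(hist)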

which for me produces:

           A   B   C   D
(0, 15]   26  20  31  23
(15, 30]  23  23  20  18
(30, 45]  24  32  24  29
(45, 60]  27  25  25  30
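
For what it's worth, the same per-column counting can probably be written without groupby, by calling value_counts() on the result of pd.cut(). The sketch below is only one way to do it, not necessarily the best. The names column_histograms and demo are made up for illustration, and the small demo frame stands in for the random data from the question.

```python
import numpy as np
import pandas as pd


def column_histograms(frame, bins):
    """Count the values of every column of `frame` in each (left, right] bin.

    Hypothetical helper, not part of pandas.
    """
    counts = {
        # pd.cut assigns each value to an interval; value_counts(sort=False)
        # keeps the intervals in bin order and reports empty bins as 0.
        col: pd.cut(frame[col], bins).value_counts(sort=False)
        for col in frame.columns
    }
    return pd.DataFrame(counts)


# Small hand-made frame (an assumption, used instead of the random data)
demo = pd.DataFrame({"A": [1, 20, 40, 50, 59], "B": [5, 10, 14, 31, 60]})
hist = column_histograms(demo, np.linspace(0, 60, 5))
print(hist)
```

Note that pd.cut() uses right-closed intervals by default, so a value exactly equal to the lowest edge (0 here) is not counted unless include_lowest=True is passed.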

Upvotes: 2
