Douwe

Reputation: 15

Counts, bars, bins for each pandas DataFrame histogram subplot

I am making separate histograms of travel distance per departure hour. For further calculations, however, I'd like to have the value of each bin in every histogram.

Up until now, I have the following:

    df['Distance'].hist(by=df['Departuretime'], color = 'red', 
            edgecolor = 'black',figsize=(15,15),sharex=True,density=True)

This creates in my case a figure with 21 small histograms.

With a single histogram, I'd put counts, bins, bars = in front of the call, and the variable counts would contain the data I was looking for. In this case, however, that does not work.

Ideally I'd like a dataframe or list of some sort for each histogram, containing the density values of the bins. I hope someone can help me out! Thanks in advance!

Edit:

Data I'm using: about 2500 rows like this. Distance is float64, Departuretime is str.

Histogram output I'm receiving

For all of these histograms I want to know the y-axis value of each bar, preferably in a dataframe with the distance bins as rows and the hours as columns.

Upvotes: 0

Views: 1632

Answers (1)

NonNewtonianTree

Reputation: 26

By using the pandas cut function you can extract the requested data directly from your dataframe instead of from the plot. This is less error-prone.

    df['DistanceBin'] = pd.cut(df['Distance'], bins=10)

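Then you can use pivot_table to obtain the count for each combination of DistanceBin and Departuretime, with the distance bins as rows and the departure hours as columns, as you asked. Passing values='Distance' restricts the count to that one column, so the resulting columns are just the hours.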

    df.pivot_table(index='DistanceBin', columns='Departuretime', values='Distance', aggfunc='count')
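Since the question asks for density values rather than raw counts, here is a minimal sketch of one way to get them, assuming the column names from the question. The helper name density_table and the sample data are made up for illustration. It uses one set of bin edges for all hours, so the numbers can differ from the plotted bars if each subplot picked its bins from its own data range.

```python
import numpy as np
import pandas as pd


def density_table(df, value_col="Distance", group_col="Departuretime", bins=10):
    """Histogram densities with bin intervals as rows and groups as columns."""
    # Bin edges are computed once over all values, so every column shares them.
    edges = np.histogram_bin_edges(df[value_col].dropna(), bins=bins)
    columns = {}
    for key, group in df.groupby(group_col):
        # density=True scales each group so that its bar areas sum to 1.
        dens, _ = np.histogram(group[value_col].dropna(), bins=edges, density=True)
        columns[key] = dens
    index = pd.IntervalIndex.from_breaks(edges, name=value_col + "Bin")
    return pd.DataFrame(columns, index=index)


# Hypothetical sample data standing in for the real dataframe.
rng = np.random.default_rng(0)
sample = pd.DataFrame({
    "Distance": rng.uniform(0, 50, size=200),
    "Departuretime": rng.choice(["07", "08", "09"], size=200),
})
table = density_table(sample)
print(table)
```

Each column of the result is one departure hour and each row is one distance bin. Multiplying a column by the bin widths and summing gives 1, which is a quick way to check the normalisation.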

Upvotes: 1
