J.Agusti

Reputation: 173

Python - Pandas histogram width

I am plotting a histogram of a bunch of data that ranges from 0 to 1. When I plot it, I get this:

Histogram

As you can see, the histogram 'blocks' do not align with the axis ticks. Is there a way to set up my histogram so that the bins have a constant width of 0.1? Or should I try a different package?

My code is quite simple:

import pandas as pd
import numpy as np
from pandas.plotting import scatter_matrix
import matplotlib.pyplot as plt

np.set_printoptions(precision=10,
                       threshold=10000,
                       linewidth=150,suppress=True)

E=pd.read_csv("FQCoherentSeparableBons5.csv")
E = E.iloc[:, 1:]  # drop the first column (.ix was removed in pandas 1.0)

E=np.array(E,float)

P0=E[:,0]

P0=pd.DataFrame(P0,columns=['P0'])

scatter_matrix(P0, alpha=0.2, figsize=(6, 6), diagonal='hist',color="red")
plt.suptitle('Distribucio p0')
plt.ylabel('Frequencia p0')
plt.show()

PS: In case you are wondering about the data, it is just a random distribution from 0 to 1.

Upvotes: 0

Views: 2309

Answers (1)

hoffee

Reputation: 488

You can pass additional arguments to the pandas histogram using the hist_kwds argument of the scatter_matrix function. If you want ten bins of width 0.1, then your scatter_matrix call should look like

scatter_matrix(P0, alpha=0.2, figsize=(6, 6), diagonal='hist', color="red", 
               hist_kwds={'bins':[i*0.1 for i in range(11)]})

Additional arguments for the pandas histogram can be found in the documentation.

Here is a simple example. I've added a grid to the plot so that you can see the bins align correctly.

import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix
import matplotlib.pyplot as plt

x = np.random.uniform(0,1,100)

scatter_matrix(pd.DataFrame(x), diagonal='hist', 
               hist_kwds={'bins':[i*0.1 for i in range(11)]})
plt.xlabel('x')
plt.ylabel('frequency')
plt.grid()
plt.show()

Histogram

By default the histogram uses 10 bins, but those bins are spread evenly between the smallest and largest values in your data, not between 0 and 1. So even though your data is distributed between 0 and 1, the bin edges will not fall on multiples of 0.1 unless the data contains exactly 0 and exactly 1. For example, if you do not actually have a data point equal to 1, you will get a result similar to the one in your question.
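To see where the default edges come from without plotting anything, here is a minimal sketch using np.histogram, which is what matplotlib's hist relies on to compute the bins. The sample data and the variable names are made up for illustration. It also shows the range argument as an alternative to listing the edges by hand; I would expect range to work through hist_kwds as well, since those keywords are forwarded to the histogram call, but treat that as an assumption and check it against your pandas version.

```python
import numpy as np

# Hypothetical sample: uniform on [0, 1), so no value is exactly 0 or 1.
rng = np.random.default_rng(0)
data = rng.uniform(0, 1, 100)

# Default: 10 equal-width bins spanning [data.min(), data.max()].
_, default_edges = np.histogram(data)

# Explicit range: 10 equal-width bins spanning [0, 1], i.e. width 0.1.
_, fixed_edges = np.histogram(data, bins=10, range=(0, 1))

print("default edges:", default_edges)
print("fixed edges:  ", fixed_edges)
```

The default edges start at the smallest sample and end at the largest one, which is why the bars drift away from the 0.1 ticks, while the second set of edges lands on 0, 0.1, ..., 1.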

Upvotes: 3
