Reputation: 127
Suppose we have a function (like the pdf of a normal distribution), and we want to approximate it with a histogram drawn under the function. I want to specify the number of bins and draw the bars under the curve. How can this be done in Python? For example, a graph like the one below, but with all bars under the curve and the number of bins as a parameter.
Upvotes: 0
Views: 270
Reputation: 80329
You can use the pdf to decide the heights of the bars:
from scipy.stats import norm
import numpy as np
N = 20
x = np.linspace(*norm.ppf([0.001, 0.999]), N)  # N bar centers; the * unpacks the two quantiles into start and stop
y = norm.pdf(x)
The center of each bar will be exactly as high as the pdf there, so the bars will cut through the curve. To make the bars only touch the curve, evaluate the pdf at the point within each bar where it is lowest, which is x + width/2 for positive x. As the pdf is symmetric, abs can be used to create a single expression for both positive and negative x-values.
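As a minimal sketch of the idea above, with the number of bins as a parameter: the helper names bars_under_curve and plot_bars_under_curve are made up for illustration, and the code assumes the standard normal pdf and at least 2 bins.

```python
import numpy as np
from scipy.stats import norm


def bars_under_curve(n_bins, lo=0.001, hi=0.999):
    """Return (centers, heights, width) for n_bins >= 2 bars that stay
    under the standard normal pdf. Hypothetical helper, for illustration."""
    x0, x1 = norm.ppf([lo, hi])            # x-range between the two quantiles
    centers = np.linspace(x0, x1, n_bins)  # evenly spaced bar centers
    width = centers[1] - centers[0]
    # The pdf decreases away from 0, so within each bar it is lowest at the
    # edge farthest from 0, i.e. at |center| + width/2.
    heights = norm.pdf(np.abs(centers) + width / 2)
    return centers, heights, width


def plot_bars_under_curve(n_bins):
    """Draw the bars and the pdf. matplotlib is imported here so that
    bars_under_curve can be used without it."""
    import matplotlib.pyplot as plt

    centers, heights, width = bars_under_curve(n_bins)
    x_pdf = np.linspace(centers[0] - width / 2, centers[-1] + width / 2, 1000)
    fig, ax = plt.subplots(figsize=(8, 2))
    ax.bar(centers, heights, width=width, fc='DeepSkyBlue', ec='k')
    ax.plot(x_pdf, norm.pdf(x_pdf), 'r', lw=2)
    ax.set_ylabel('probability density')
    return fig, ax
```

Calling plot_bars_under_curve(20) followed by plt.show() should give a static picture with 20 bars. This relies on the pdf being symmetric and unimodal around 0; other functions would need a different rule for finding the lowest point in each bar.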
Here is an animation created via the celluloid library.
import matplotlib.pyplot as plt
import numpy as np
import scipy
from scipy.stats import norm
from celluloid import Camera
fig, ax = plt.subplots(figsize=(8, 2))
fig.subplots_adjust(bottom=0.15, left=0.1, right=0.97)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
camera = Camera(fig)
x0, x1 = norm.ppf([0.001, 0.999])
x_pdf = np.linspace(x0, x1, 1000)
y_pdf = norm.pdf(x_pdf)
for N in range(10, 80):
    ax.plot(x_pdf, y_pdf, 'r', lw=2)
    x_bar = np.linspace(x0, x1, N)  # bar centers
    width = x_bar[1]-x_bar[0]
    # evaluate the pdf at the bar edge farthest from 0, so each bar stays under the curve
    y_bar = norm.pdf(np.abs(x_bar) + width/2)
    ax.bar(x_bar, y_bar, width=width, fc='DeepSkyBlue', ec='k')
    ax.margins(x=0)
    ax.set_ylabel('probability density')
    camera.snap()  # capture one frame per value of N
animation = camera.animate(interval=600)
animation.save('gaussian_histogram.gif')
plt.show()
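To see numerically how well the bars approximate the curve, one can sum the bar areas for a few values of N. This is only a rough check reusing the same height formula as above; bar_area is a made-up helper name. Because every bar lies under the pdf, the total stays below 1 and should grow as N increases.

```python
import numpy as np
from scipy.stats import norm

x0, x1 = norm.ppf([0.001, 0.999])


def bar_area(n_bins):
    """Total area of n_bins bars placed under the standard normal pdf
    (hypothetical helper, same height rule as in the animation code)."""
    x_bar = np.linspace(x0, x1, n_bins)
    width = x_bar[1] - x_bar[0]
    y_bar = norm.pdf(np.abs(x_bar) + width / 2)
    return float(np.sum(y_bar) * width)


for n in (10, 20, 40, 80):
    print(n, round(bar_area(n), 4))
```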
PS: Here is a list of related questions (collected by @TrentonMcKinney), where you can find additional explanation and ideas:
Upvotes: 2