bigmac42

Reputation: 71

Matplotlib histogram Not Creating Specified Number of Bins

I have data in which the y-values are recorded intensities and the x-values are the wavelengths associated with those intensities. I am trying to plot the distribution of the intensities at a given wavelength, so after filtering my data to a specific wavelength (or 'x' value) it looks something like:

           y0        y1       y2  ...       y47       y48       y49
675  0.005513  0.007296  0.00572  ... -0.000084 -0.004105 -0.001181

Now, I try to create a histogram from that data by using the following code:

plt.hist(wavelength_338.iloc[[2], :-1], bins = 5, ec= 'skyblue')
plt.xlabel("Δy (y\u0305 -y)")
plt.ylabel("Count")
plt.title("Δy Distribution for 338.05 nm")
plt.show()

Note that I calculated the number of bins using the Freedman-Diaconis rule. Here is a link to the plot:

resulting plot

It is clearly making more than 5 bins and I cannot seem to figure out why.

Upvotes: 1

Views: 1357

Answers (1)

JohanC

Reputation: 80509

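You are selecting one row of the dataframe, but because the row index is passed as a list ([2]), the result is still a dataframe: one row and 49 columns. plt.hist treats each column of a 2D input as a separate dataset, so it draws one histogram per column, all sharing the same 5 bins (each histogram contains only one bar of height 1):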

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

wavelength_338 = pd.DataFrame(np.random.randn(5, 50), columns=[f"y{i}" for i in range(50)])
one_row = wavelength_338.iloc[[2], :-1]

The row looks like:

         y0        y1        y2  ...       y46       y47       y48
2  0.111689  0.038995  0.119713  ...  0.427522  0.549125  0.668667

A histogram of that row looks like:

plt.hist(one_row, bins=5)

histogram of one row
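To check this without looking at the plot, you can inspect what plt.hist returns. The following is a minimal sketch, assuming random stand-in data with the same shape as in the question (the seed and the Agg backend are my own choices so that it runs without a display):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: 5 rows, 50 columns y0..y49
rng = np.random.default_rng(0)
wavelength_338 = pd.DataFrame(rng.standard_normal((5, 50)),
                              columns=[f"y{i}" for i in range(50)])

# A list as row indexer keeps the result 2D: a DataFrame of shape (1, 49)
one_row = wavelength_338.iloc[[2], :-1]

counts, edges, _ = plt.hist(one_row, bins=5)
counts = np.asarray(counts)  # one row of counts per dataset (i.e. per column)

print(one_row.shape)  # shape of the selected data
print(counts.shape)   # number of datasets x number of bins
print(len(edges))     # bin edges shared by all datasets
```

There are still only 5 bins (6 edges), but 49 separate datasets are drawn side by side within each bin, which is why the plot appears to have many more bars than requested.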

You could transpose the row to make it one column with 49 elements and then draw a histogram:

plt.hist(one_row.T, bins=5)

histogram of transposed rows
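As an alternative to transposing, you can avoid creating the one-row dataframe in the first place. This is a sketch under the same assumptions as above (random stand-in data, Agg backend chosen only so it runs without a display): indexing the row with a scalar instead of a list gives a 1D Series, and flattening the values to a 1D array has the same effect.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: 5 rows, 50 columns y0..y49
rng = np.random.default_rng(0)
wavelength_338 = pd.DataFrame(rng.standard_normal((5, 50)),
                              columns=[f"y{i}" for i in range(50)])

# Option 1: scalar row index -> 1D Series with 49 values
row_series = wavelength_338.iloc[2, :-1]
counts, edges, _ = plt.hist(row_series, bins=5, ec="skyblue")

# Option 2: keep the list indexer but flatten the 2D values to 1D
plt.figure()
row_flat = wavelength_338.iloc[[2], :-1].to_numpy().ravel()
counts_flat, edges_flat, _ = plt.hist(row_flat, bins=5, ec="skyblue")

print(counts.shape)       # a single dataset: one count per bin
print(int(counts.sum()))  # all 49 values land in the 5 bins
```

Both options hand plt.hist a single 1D dataset, so the result is one histogram with exactly 5 bars.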

Upvotes: 2
