Reputation: 111
I have a table of grades and I want all of the bins to be drawn with the same width.
I want the bin edges to be [0,56,60,65,70,80,85,90,95,100], so the first bin is 0-56, then 56-60, and so on, with every bin drawn at the same width.
sns.set_style('darkgrid')
newBins = [0,56,60,65,70,80,85,90,95,100]
sns.displot(data= scores , bins=newBins)
plt.xlabel('grade')
plt.xlim(0,100)
plt.xticks(newBins);
Expected output
How can I make the bins equal in width?
Upvotes: 1
Views: 5791
Reputation: 1
You can use the bins parameter of histplot, but to get exactly these bins you have to use pd.cut() to create your own bins.
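import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

np.random.seed(101)
# sample data: 175 random grades in 0..99
scores = pd.Series(np.random.randint(100, size=175))
# include_lowest=True keeps a grade of 0 in the first bin
df = pd.DataFrame({'scores': scores,
                   'bins_created': pd.cut(scores, bins=[0,55,60,65,70,75,80,85,90,95,100],
                                          include_lowest=True)})
# sort_index() keeps the bins in ascending order instead of ordering them by count
new_data = df['bins_created'].value_counts().sort_index()
plt.figure(figsize=(10,5),dpi=100)
plots = sns.barplot(x=new_data.index.astype(str),y=new_data.values)
plt.xlabel('grades')
plt.ylabel('counts')
# write the count above each bar
for bar in plots.patches:
    plots.annotate(format(bar.get_height(), '.2f'),
                   (bar.get_x() + bar.get_width() / 2,
                    bar.get_height()), ha='center', va='center',
                   size=10, xytext=(0,5),
                   textcoords='offset points')
plt.show()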
Upvotes: 0
Reputation: 260600
You need to cheat a bit. Define your own bins and label them with a linear range. Here is an example:
s = pd.Series(np.random.randint(100, size=100000))
bins = [-0.1, 50, 75, 95, 101]
s2 = pd.cut(s, bins=bins, labels=range(len(bins)-1))
ax = s2.astype(int).plot.hist(bins=len(bins)-1)
ax.set_xticks(np.linspace(0, len(bins)-2, len(bins)))
ax.set_xticklabels(bins)
Output:
Old answer:
Why don't you let seaborn pick the bins for you:
sns.displot(data=scores, bins='auto')
Or set the number of bins that you want:
sns.displot(data=scores, bins=10)
The bins will then be evenly spaced.
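To illustrate what an integer bin count means, here is a minimal NumPy-only sketch. The `scores` array is made-up sample data, and the fixed 0-100 range is an assumption about the grade scale, not something taken from the question's data.

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.integers(0, 101, size=200)  # made-up grades in 0..100

# 10 bins over a fixed 0-100 range give 11 equally spaced edges
edges = np.histogram_bin_edges(scores, bins=10, range=(0, 100))
widths = np.diff(edges)  # width of each bin

print(edges)
print(widths)
```

In seaborn the analogous call would presumably be sns.displot(data=scores, bins=10, binrange=(0, 100)). Note that this gives equal-width bins but not the custom edges from the question.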
Upvotes: 4
Reputation: 7863
You are assigning a list to the bins argument of sns.displot(). This specifies the edges of the bins. Since these edges are not evenly spaced, the widths of the bins vary.
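As a quick check, the width of each bin can be computed directly from the edge list in the question. This is only an illustration in plain Python; the `widths` name is introduced here.

```python
newBins = [0, 56, 60, 65, 70, 80, 85, 90, 95, 100]

# width of each bin = right edge - left edge
widths = [right - left for left, right in zip(newBins, newBins[1:])]
print(widths)  # [56, 4, 5, 5, 10, 5, 5, 5, 5]
```

The first bin is 56 units wide while most of the others are 5, which is why the histogram bars look so uneven.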
I think that you may want to use a bar plot (sns.barplot()) and not a histogram. You would need to compute how many data points fall in each bin, and then plot the bars without the information about what range of values each bar represents. Something like this:
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style('darkgrid')
import numpy as np
# sample data
data = np.random.randint(0, 100, 200)
newBins = [0,56,60,65,70,80,85,90,95,100]
# compute bar heights
hist, _ = np.histogram(data, bins=newBins)
# plot a bar diagram
sns.barplot(x = list(range(len(hist))), y = hist)
plt.show()
It gives:
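If you still want each bar to show its range, one option is to build a text label per bin and pass those as the x values. The sketch below is a possible variant, not part of the code above: the helper names `bin_labels` and `plot_equal_width` are hypothetical, and the sample data is random.

```python
import numpy as np

def bin_labels(edges):
    """Return one 'left-right' label per bin, e.g. '0-56'."""
    return [f"{lo}-{hi}" for lo, hi in zip(edges[:-1], edges[1:])]

newBins = [0, 56, 60, 65, 70, 80, 85, 90, 95, 100]

# made-up sample grades in 0..100
rng = np.random.default_rng(0)
data = rng.integers(0, 101, size=200)

# bins are half-open [left, right), except the last one, which includes 100
counts, _ = np.histogram(data, bins=newBins)
labels = bin_labels(newBins)

def plot_equal_width(labels, counts):
    """Draw one equal-width bar per bin, labelled with the bin's range."""
    import matplotlib.pyplot as plt
    import seaborn as sns
    ax = sns.barplot(x=labels, y=counts)
    ax.set_xlabel("grade")
    ax.set_ylabel("count")
    plt.show()
```

Calling plot_equal_width(labels, counts) should then draw the bars with tick labels such as 0-56 and 56-60 instead of 0, 1, 2, ...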
Upvotes: 2
Reputation: 531
Just change the list of values that you are using as bins:
newBins = numpy.arange(0, 100, 1)
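One thing to watch with this approach: np.arange excludes its stop value, so the edges above end at 99. A small sketch of evenly spaced edges that do reach 100 follows; the step of 5 is just an assumed example width.

```python
import numpy as np

edges_arange = np.arange(0, 100, 1)   # 0, 1, ..., 99 (stop value is excluded)
edges_5 = np.arange(0, 101, 5)        # 0, 5, ..., 100
edges_lin = np.linspace(0, 100, 21)   # same 21 edges, endpoint included by default

print(edges_arange[-1])
print(edges_5)
```

Either edges_5 or edges_lin can be passed as bins. Both give equal-width bins, at the cost of no longer matching the custom edges from the question.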
Upvotes: 1