Reputation: 21
How do I create a histogram using Seaborn's distplot with two y-axes: one showing the count and the other showing the corresponding density? I tried this code, but the result does not make sense:
ax = sns.distplot( df_flavors.Freq, kde = False )
ax.set_title( 'Distribution of Flavor Purchases\nNumber Purchased', fontsize = font_title )
ax.set( ylabel = 'Count', xlabel = 'Number of Flavors Purchased' )
ax.set_xticks( range( n ))
ax.set_xticklabels( range( n ) )
##
ax2 = plt.twinx()
The DataFrame df_flavors has 2000 records, one per survey respondent (n = 2000), each showing how many different flavors of yogurt that person bought (0 to 7 flavors). The variable Freq is that count for each respondent. sns.distplot puts the count on the left axis, which is fine. ax2 = plt.twinx() adds a second y-axis, but it shows cumulative percents instead of percents, which is not what I want. Any suggestions for getting just the percent or density of the total 2000 on the right?
Upvotes: 1
Views: 4146
Reputation: 80509
You can draw the histogram without the kde on one axis, and the kde without the histogram on a twin axis. The left y-axis then shows the count and the right y-axis shows the density.
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
# generate some random test data
y = np.abs(np.random.normal(np.random.choice([5, 9, 15], 2000, p=[3/9, 5/9, 1/9]), 2, 2000))
ax = sns.distplot(y, kde=False)
ax.set_title('Distribution of Flavor Purchases\nNumber Purchased')
ax.set(ylabel='Count', xlabel='Number of Flavors Purchased')
n = 20
ax.set_xticks(range(n))
ax.set_xticklabels(range(n))
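# second y-axis that shares the x-axis of the histogram
ax2 = ax.twinx()
# kde only (no histogram) on the right axis, so it is scaled as a density
sns.distplot(y, kde=True, hist=False, ax=ax2)
ax2.set_ylabel('Density')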
plt.show()
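If the right axis should show the percent of the 2000 respondents rather than a kde density, one possible approach is to rescale the count axis instead of drawing a second plot. The sketch below uses matplotlib's `Axes.secondary_yaxis`; the helper name `add_percent_axis`, the axis label, and the random stand-in data are assumptions for illustration, not part of the original code. The histogram is drawn with plain matplotlib here only to keep the example self-contained.

```python
import numpy as np
from matplotlib import pyplot as plt


def add_percent_axis(ax, total):
    """Attach a right-hand y-axis showing the left-hand counts as percent of `total`.

    Hypothetical helper: it only rescales the existing count axis,
    so the bars themselves are not redrawn.
    """
    ax2 = ax.secondary_yaxis(
        'right',
        functions=(lambda count: count / total * 100,  # count -> percent
                   lambda pct: pct * total / 100),     # percent -> count
    )
    ax2.set_ylabel('Percent of respondents')
    return ax2


# made-up stand-in for df_flavors.Freq: 2000 integer answers between 0 and 7
rng = np.random.default_rng(0)
freq = rng.integers(0, 8, size=2000)

fig, ax = plt.subplots()
# one bin per integer value, centred on that integer
counts, edges, _ = ax.hist(freq, bins=np.arange(9) - 0.5)
ax.set(ylabel='Count', xlabel='Number of Flavors Purchased')
ax2 = add_percent_axis(ax, total=len(freq))
```

Since seaborn returns an ordinary matplotlib Axes, the same helper should work on the result of `sns.distplot(df_flavors.Freq, kde=False)`, called as `add_percent_axis(ax, len(df_flavors))`, followed by `plt.show()`. In seaborn 0.11 and later, where `distplot` is deprecated, the Axes returned by `sns.histplot` can be passed in the same way.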
Upvotes: 1