Nikhil Mishra

Reputation: 1250

What does a seaborn bar plot compute on the Y-axis?

I am visualizing the Titanic dataset. I created 9 different age categories and was trying to visualize Age_Cats vs Survived using a bar chart. I wrote the following piece of code:

import pandas as pd
import seaborn as sns
# df_train is the Titanic training DataFrame, loaded beforehand
age_cats = [1, 2, 3, 4, 5, 6, 7, 8, 9]
df_train['Age_Cats'] = pd.cut(df_train['Age'], 9, labels=age_cats)
sns.barplot(x='Age_Cats', y='Survived', hue='Sex', data=df_train)

[Figure: seaborn bar plot of Survived (Y-axis) by Age_Cats (X-axis), split by Sex]

I don't understand what the numbers on the Y-axis represent.

My assumption is:

n(Survived = 1) / (n(Survived = 1) + n(Survived = 0)), i.e. the fraction of people in that category who survived. But how does seaborn calculate it? Or do the numbers on the Y-axis represent something else?

Upvotes: 0

Views: 479

Answers (1)

ImportanceOfBeingErnest

Reputation: 339200

The bar plot shows the survival rate, i.e. the fraction of people in each group who survived.

E.g. in age class 1, about 60% of all males survived. In age class 7, less than 15% of all males survived.

This is calculated by taking the mean of the Survived variable for each age class and sex. E.g. if a group had 3 people, 2 of whom survived, the variable could look like [1, 0, 1]; the mean of this array is (1+0+1)/3 ≈ 0.67, so the bar plot would show a bar up to about 0.67.
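To make the calculation concrete, here is a minimal sketch in plain Python that reproduces the per-group mean by hand. The `passengers` records are invented toy values, not rows from the Titanic data, and the names `groups` and `bar_heights` are only illustrative. The sketch mirrors what seaborn's `barplot` does with its default `estimator` (the mean), without the plotting or the error bars.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (age class, sex, survived flag).
# These are made-up values for illustration, not Titanic data.
passengers = [
    (1, "male", 1),
    (1, "male", 0),
    (1, "male", 1),
    (1, "female", 1),
    (7, "male", 0),
    (7, "male", 0),
    (7, "male", 1),
    (7, "male", 0),
]

# Collect the 0/1 survival flags for each (age class, sex) group.
groups = defaultdict(list)
for age_cat, sex, survived in passengers:
    groups[(age_cat, sex)].append(survived)

# The mean of a 0/1 column is the share of ones, i.e. the survival rate.
bar_heights = {key: mean(flags) for key, flags in groups.items()}

for (age_cat, sex), height in sorted(bar_heights.items()):
    print(f"age class {age_cat}, {sex}: {height:.2f}")
```

On the real data, the same numbers should come out of `df_train.groupby(['Age_Cats', 'Sex'])['Survived'].mean()`, which can be a quick way to check the bar heights against the plot.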

Upvotes: 1
