Reputation: 151
train['Name_title'].groupby(pd.qcut(train['Name_Len'],5)).value_counts(normalize = True)
Part of the output is:
Name_Len        Name_title
(11.999, 19.0]  Mr.           0.803922
                Miss.         0.151961
                Mrs.          0.019608
                Master.       0.009804
                Col.          0.004902
                Dr.           0.004902
                Rev.          0.004902
(19.0, 23.0]    Mr.           0.698718
                Miss.         0.237179
                Master.       0.025641
                Mrs.          0.019231
                Dr.           0.006410
                Mlle.         0.006410
                Rev.          0.006410
I want to plot the proportion of each title within each Name_Len group. Is there a straightforward way to do this?
Thank you in advance.
Upvotes: 0
Views: 40
Reputation: 46898
You can use pd.crosstab to tabulate the proportions, then plot the result with pandas using stacked=True:
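import numpy as np
import pandas as pd

# Reproducible toy data standing in for the real train frame
np.random.seed(111)
train = pd.DataFrame({'Name_title': np.random.choice(['Mr', 'Mrs', 'Miss'], 20),
                      'Name_Len': np.random.uniform(1, 50, 20)})
# Rows: Name_Len quintiles; columns: titles; normalize='index' makes each row sum to 1
pd.crosstab(pd.qcut(train['Name_Len'], 5),
            train['Name_title'], normalize='index').plot.bar(stacked=True)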
Unstacking the result of your original groupby works too:
train['Name_title'].groupby(
pd.qcut(train['Name_Len'],5)
).value_counts(normalize = True).unstack().plot.bar(stacked=True)
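If you want to convince yourself that both routes build the same table before plotting, here is a minimal sketch. The toy data and the names `bins`, `via_crosstab` and `via_unstack` are my own, not from your code. One difference to be aware of: crosstab puts 0 where a title never occurs in a bin, while a plain unstack() leaves NaN there, so I pass fill_value=0 to make the tables match entry for entry.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with the same two columns as the question's frame
rng = np.random.default_rng(111)
train = pd.DataFrame({
    "Name_title": rng.choice(["Mr", "Mrs", "Miss"], 20),
    "Name_Len": rng.uniform(1, 50, 20),
})

# Five equal-sized bins of name length
bins = pd.qcut(train["Name_Len"], 5)

# Route 1: crosstab, normalised within each row (each length bin)
via_crosstab = pd.crosstab(bins, train["Name_title"], normalize="index")

# Route 2: groupby + value_counts, then pivot the titles into columns;
# fill_value=0 replaces the NaN left for titles absent from a bin
via_unstack = (
    train["Name_title"]
    .groupby(bins, observed=False)
    .value_counts(normalize=True)
    .unstack(fill_value=0)
)

print(via_crosstab.round(3))
print(via_unstack.round(3))
```

Either table can then be passed to .plot.bar(stacked=True); since each row sums to 1, every bar should reach the same height.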
Upvotes: 2