Reputation: 1
I have a DataFrame:
     loan_status  Principal
244      PAIDOFF       1000
245      PAIDOFF       1000
246      PAIDOFF       1000
247      PAIDOFF       1000
248      PAIDOFF       1000
249      PAIDOFF       1000
250      PAIDOFF        800
252      PAIDOFF       1000
253      PAIDOFF       1000
254      PAIDOFF       1000
255      PAIDOFF       1000
256      PAIDOFF        800
257      PAIDOFF       1000
258      PAIDOFF       1000
259      PAIDOFF       1000
260   COLLECTION       1000
261   COLLECTION       1000
262   COLLECTION        800
263   COLLECTION        800
264   COLLECTION        800
265   COLLECTION       1000
266   COLLECTION       1000
and I want to plot the result as a stacked bar chart.
I hope you can help. Thank you.
Upvotes: 0
Views: 955
Reputation: 3660
With pandas you can create a crosstab of the two variables with pd.crosstab, which gives the counts by default. If one of the variables is numerical, an aggregation function can be applied to it instead. A stacked bar chart can be plotted directly from the resulting table, as in the following example, where the 'Principal' values are summed:
import pandas as pd  # v 1.1.3

# df is the DataFrame from the question.
# If the 'values' and 'aggfunc' arguments are omitted, the table
# contains the counts instead of the sums.
ctab = pd.crosstab(index=df['Principal'], columns=df['loan_status'],
                   values=df['Principal'], aggfunc='sum')
ctab.plot.bar(stacked=True)
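To check what the crosstab holds before plotting, here is a minimal sketch that rebuilds the rows shown in the question and computes both the default counts and the summed table. The names `counts` and `sums` are hypothetical, and the DataFrame is retyped by hand from the question's printout, so treat it as an illustration rather than the asker's real data.

```python
import pandas as pd

# Rows retyped from the question's printout (index 251 is missing there too).
index = [i for i in range(244, 267) if i != 251]
df = pd.DataFrame(
    {
        "loan_status": ["PAIDOFF"] * 15 + ["COLLECTION"] * 7,
        "Principal": [1000] * 6 + [800] + [1000] * 4 + [800] + [1000] * 3
        + [1000, 1000, 800, 800, 800, 1000, 1000],
    },
    index=index,
)

# Default crosstab: number of rows per (Principal, loan_status) pair.
counts = pd.crosstab(index=df["Principal"], columns=df["loan_status"])

# With 'values' and 'aggfunc': sum of Principal per pair.
sums = pd.crosstab(index=df["Principal"], columns=df["loan_status"],
                   values=df["Principal"], aggfunc="sum")

print(counts)
print(sums)
```

Either table can then be passed to `.plot.bar(stacked=True)`: each row of the table becomes one bar, and each column becomes one stacked segment.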
Upvotes: 0
Reputation: 62523
Use pandas.DataFrame.groupby with one of the following aggregations, then unstack the result and plot it.
.count:
import pandas as pd
import matplotlib.pyplot as plt
df.groupby(['Principal', 'loan_status'])['loan_status'].count().unstack().plot.bar(stacked=True)
plt.show()
.sum:
df.groupby(['Principal', 'loan_status'])['Principal'].sum().unstack().plot.bar(stacked=True)
plt.show()
.mean:
df.groupby(['Principal', 'loan_status'])['Principal'].mean().unstack().plot.bar(stacked=True)
plt.show()
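The chained one-liners above do two things before plotting, and it can help to look at each step on its own. The sketch below assumes the same rows as in the question, retyped by hand, and uses the hypothetical names `grouped` and `table` for the intermediate results.

```python
import pandas as pd

# Rows retyped from the question's printout (default integer index used here).
df = pd.DataFrame(
    {
        "loan_status": ["PAIDOFF"] * 15 + ["COLLECTION"] * 7,
        "Principal": [1000] * 6 + [800] + [1000] * 4 + [800] + [1000] * 3
        + [1000, 1000, 800, 800, 800, 1000, 1000],
    }
)

# Step 1: groupby + count gives a Series with a two-level index
# (Principal, loan_status).
grouped = df.groupby(["Principal", "loan_status"])["loan_status"].count()

# Step 2: unstack moves the inner level (loan_status) into the columns,
# leaving one row per Principal value.
table = grouped.unstack()

print(grouped)
print(table)
```

Calling `table.plot.bar(stacked=True)` followed by `plt.show()`, as in the answer, draws one bar per `Principal` value with one segment per `loan_status`.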
Upvotes: 2