Reputation: 129
I have a data frame for customer transactions:
customer_id | tier | transaction_type | year | no_of_purchases
1353455     | 1    | online           | 2012 | 5
1353455     | 1    | retail           | 2012 | 8
1353455     | 1    | retail           | 2014 | 10
1543798     | 2    | retail           | 2012 | 1
The tier is the customer's loyalty program tier. I want to see whether transaction_type and no_of_purchases differ between tiers.
I did a countplot for this:
sns.countplot(x="transaction_type", y="no_of_purchases", hue="tier")
Goal: I want to show, for each year (x-axis), the count of no_of_purchases for each transaction_type, and how this differs for each tier. Here is an example of how I want it to look. Ideally we would have 4 graphs, one for each transaction type. It would be great if we could show the count at each point too.
Upvotes: 1
Views: 7306
Reputation: 682
You can use seaborn catplot.
First, aggregate your data, then do the visualization using seaborn.
data_viz = df.groupby(['year','transaction_type','tier'], as_index=False)['no_of_purchases'].sum()
sns.catplot(data=data_viz, x='year', y='no_of_purchases', hue='tier', col='transaction_type', kind='bar')
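To make the aggregation step concrete, here is a minimal sketch that rebuilds the four sample rows from the question and runs the same groupby. The DataFrame construction is an assumption about how df was loaded; only the column names and values come from the question.

```python
import pandas as pd

# Sample rows copied from the question (how df is actually loaded is assumed)
df = pd.DataFrame({
    "customer_id": [1353455, 1353455, 1353455, 1543798],
    "tier": [1, 1, 1, 2],
    "transaction_type": ["online", "retail", "retail", "retail"],
    "year": [2012, 2012, 2014, 2012],
    "no_of_purchases": [5, 8, 10, 1],
})

# Sum purchases per (year, transaction_type, tier).
# as_index=False keeps the grouping keys as ordinary columns,
# which is the long format seaborn expects.
data_viz = df.groupby(
    ["year", "transaction_type", "tier"], as_index=False
)["no_of_purchases"].sum()

print(data_viz)
```

With real data, several customers fall into the same (year, transaction_type, tier) group, so the sum collapses them into one row per group before plotting.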
Unfortunately, based on the documentation, catplot does not offer a line plot among its kind options.
There is a workaround, though: loop over the transaction types and draw one line plot for each.
import matplotlib.pyplot as plt
import seaborn as sns
# Aggregate purchases per year, transaction type, and tier
data_viz = df.groupby(['year','transaction_type','tier'], as_index=False)['no_of_purchases'].sum()
# Draw one line plot per transaction type, with one line per tier
for i in list(data_viz['transaction_type'].unique()):
    viz = sns.lineplot(data=data_viz[data_viz['transaction_type'] == i], x='year', y='no_of_purchases', hue='tier')
    plt.title(i)
    plt.show()
Upvotes: 2