Reputation: 65
I have data in the format below, and I'm trying to:
1) Loop over each value in Region.
2) For each region, plot a time series of sales aggregated across Category.
Date       | Region | Category  | Sales
01/01/2016 | USA    | Furniture | 1
01/01/2016 | USA    | Clothes   | 0
01/01/2016 | Europe | Furniture | 2
01/01/2016 | Europe | Clothes   | 0
01/02/2016 | USA    | Furniture | 3
01/02/2016 | USA    | Clothes   | 0
01/02/2016 | Europe | Furniture | 4
01/02/2016 | Europe | Clothes   | 0
...
The plot should look like the attached one (done in Excel).
However, if I try to do it in Python using the below, I get multiple charts when I really want all the lines to show up in one figure.
Python code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.read_csv(r'C:\Users\wusm\Desktop\Book7.csv')
plt.legend()
for index, group in df.groupby(["Region"]):
    group.plot(x='Date', y='Sales', title=str(index))
plt.show()
Short of reformatting the data, could anyone advise on how to get the graphs in one figure please?
Upvotes: 3
Views: 741
Reputation: 862431
You can use pivot_table:
df = df.pivot_table(index='Date', columns='Region', values='Sales', aggfunc='sum')
print(df)
Region      Europe  USA
Date
01/01/2016       2    1
01/02/2016       4    3
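Alternatively, starting from the original df, aggregate with groupby and sum, then reshape with unstack:
df = df.groupby(['Date', 'Region'])['Sales'].sum().unstack(fill_value=0)
print(df)
Region      Europe  USA
Date
01/01/2016       2    1
01/02/2016       4    3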
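To make the two reshaping options easier to compare, here is a minimal sketch that rebuilds the eight sample rows from the question in memory and applies both to the same untouched frame. The names `sample`, `wide_pivot` and `wide_group` are illustrative and not from the original post.

```python
import pandas as pd

# Sample rows copied from the question (hypothetical in-memory stand-in for the CSV)
sample = pd.DataFrame({
    "Date": ["01/01/2016"] * 4 + ["01/02/2016"] * 4,
    "Region": ["USA", "USA", "Europe", "Europe"] * 2,
    "Category": ["Furniture", "Clothes"] * 4,
    "Sales": [1, 0, 2, 0, 3, 0, 4, 0],
})

# Option 1: pivot_table sums Sales over Category for each (Date, Region) pair
wide_pivot = sample.pivot_table(index="Date", columns="Region",
                                values="Sales", aggfunc="sum")

# Option 2: groupby + sum, then unstack moves Region into the columns
wide_group = (sample.groupby(["Date", "Region"])["Sales"]
                    .sum()
                    .unstack(fill_value=0))

print(wide_pivot)
print(wide_group)
```

Both calls start from the long-format frame, so they are alternatives rather than consecutive steps. Running the second one on an already pivoted frame would fail because the Date and Region columns no longer exist.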
Then call DataFrame.plot, which draws one line per column on a single figure:
df.plot()
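Putting the steps together, a rough end-to-end sketch might look like the following. It assumes the question's sample rows rebuilt in memory (`sample` is a hypothetical stand-in for the CSV) and selects the non-interactive Agg backend only so that it runs without a display. The second half is not from the answer above: it is a loop-based variant that keeps the long format and passes a shared `ax` to each plot call, which may be closer to what the question asked for.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows copied from the question (hypothetical stand-in for the CSV)
sample = pd.DataFrame({
    "Date": ["01/01/2016"] * 4 + ["01/02/2016"] * 4,
    "Region": ["USA", "USA", "Europe", "Europe"] * 2,
    "Category": ["Furniture", "Clothes"] * 4,
    "Sales": [1, 0, 2, 0, 3, 0, 4, 0],
})

# Approach from the answer: reshape to one column per region, then plot once
wide = sample.pivot_table(index="Date", columns="Region",
                          values="Sales", aggfunc="sum")
ax = wide.plot(title="Sales by region")
ax.set_ylabel("Sales")

# Variant (not from the answer): keep the loop, but draw every region on one Axes
fig, ax2 = plt.subplots()
for region, group in sample.groupby("Region"):
    totals = group.groupby("Date")["Sales"].sum()  # aggregate across Category
    totals.plot(ax=ax2, label=region)
ax2.set_title("Sales by region (loop variant)")
ax2.legend()
```

With an interactive backend, a single plt.show() after the plotting calls displays the figures. The original loop produced one chart per region because each group.plot call without an ax argument creates a new figure.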
Upvotes: 3