Reputation: 72
Let's say the data used is something like this
df = pd.DataFrame({'Order_id': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
'Order_date': ['10/1/2020', '10/1/2020', '11/1/2020', '11/1/2020', '12/1/2020', '12/1/2020', '12/1/2020', '12/1/2020', '13/1/2020', '13/1/2020'],
'Product_nr': [0, 2, 1, 0, 2, 0, 2, 1, 2, 0],
'Quantity': [3, 1, 6, 5, 10, 1, 2, 5, 4, 3]})
#transforming the date column into datetime
df['Order_date'] = pd.to_datetime(df['Order_date'])
and I'm trying to plot the number of ordered products per day per product over the given time span.
My initial idea would be something like
product_groups = df.groupby(['Product_nr'])
products_daily = pd.DataFrame()
for product, total_orders in product_groups:
products_daily[product.day] = total_orders.values
products_daily.plot(subplots=True, legend=False)
pyplot.show()
I know there must be a groupby('Product_nr')
and the date should be splitted into days using Grouper(freq='D')
. They should also be a for loop to combine them and then plotting them all but I really have no clue how to put those pieces together. How can I archieve this? My ultimate goal is actually to plot them per month per product for over 4 years of sales records, but given the example data here I changed it into daily.
Any suggestion or link for guides, tutorials are welcome too. Thank you very much!
Upvotes: 0
Views: 706
Reputation: 150785
You can pivot the table and use pandas' plot
function:
(df.groupby(['Order_date', 'Product_nr'])
['Quantity'].sum()
.unstack('Product_nr')
.plot(subplots=True, layout=(1,3)) # change layout to fit your data
)
Output:
Upvotes: 1