Reputation: 89
I have 15-minute timestep data of a quantity for several years...
Datetime | Quantity |
---|---|
01/07/2018 00:15 | 6.96 |
01/07/2018 00:30 | 6.48 |
01/07/2018 00:45 | 6.96 |
01/07/2018 01:00 | 6.72 |
. | . |
. | . |
I am using Pandas. How do I produce a bar plot with months on the horizontal axis and a series (set of bars) for each year, with the height of each bar being the total quantity for that month and year?
Exactly like this:
Upvotes: 2
Views: 3066
Reputation: 12496
Fake dataframe creation:
df = pd.DataFrame()
df['Datetime'] = pd.date_range(start = '2018-01-07', end = '2021-08-13', freq = '15min')
df['Quantity'] = np.random.rand(len(df))
Starting from this point, you should extract month and year and save them in separate columns:
df['month'] = df['Datetime'].dt.month
df['year'] = df['Datetime'].dt.year
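If the Datetime column was read from a file as text rather than built with pd.date_range, the .dt accessor needs it converted to datetimes first. A minimal sketch, using the four rows shown in the question and assuming the dates are day-first (01/07/2018 meaning 1 July 2018), which the question does not state:

```python
import pandas as pd

# The four sample rows from the question, held as plain strings
raw = pd.DataFrame({
    "Datetime": ["01/07/2018 00:15", "01/07/2018 00:30",
                 "01/07/2018 00:45", "01/07/2018 01:00"],
    "Quantity": [6.96, 6.48, 6.96, 6.72],
})

# Assumption: day comes before month; swap %d and %m if the data is month-first
raw["Datetime"] = pd.to_datetime(raw["Datetime"], format="%d/%m/%Y %H:%M")

# With a real datetime column, the .dt accessor works as in the answer
raw["month"] = raw["Datetime"].dt.month
raw["year"] = raw["Datetime"].dt.year
```

Giving an explicit format avoids pandas guessing between day-first and month-first for ambiguous dates.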
Then you have to compute the sum of 'Quantity' by month for each year:
df = df.groupby(by = ['month', 'year'])['Quantity'].sum().reset_index()
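Just before this aggregation, with the month and year columns added, the dataframe looks like this (the groupby then reduces it to one row per month and year, with columns month, year and Quantity):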
Datetime Quantity month year
0 2018-01-07 00:00:00 0.226113 1 2018
1 2018-01-07 00:15:00 0.222872 1 2018
2 2018-01-07 00:30:00 0.835484 1 2018
3 2018-01-07 00:45:00 0.775771 1 2018
4 2018-01-07 01:00:00 0.972559 1 2018
5 2018-01-07 01:15:00 0.418036 1 2018
6 2018-01-07 01:30:00 0.902843 1 2018
7 2018-01-07 01:45:00 0.012441 1 2018
8 2018-01-07 02:00:00 0.883437 1 2018
9 2018-01-07 02:15:00 0.183561 1 2018
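To see the shape the aggregated frame takes, here is a tiny sketch with made-up values (the four rows and the name `small` are hypothetical, chosen so the sums are easy to check by hand):

```python
import pandas as pd

# Hypothetical data: two readings in Jan 2018, one in Feb 2018, one in Jan 2019
small = pd.DataFrame({
    "Datetime": pd.to_datetime([
        "2018-01-07 00:15", "2018-01-07 00:30",
        "2018-02-01 00:15", "2019-01-01 00:15",
    ]),
    "Quantity": [1.0, 2.0, 4.0, 8.0],
})
small["month"] = small["Datetime"].dt.month
small["year"] = small["Datetime"].dt.year

# Same aggregation as in the answer: one row per (month, year) pair
agg = small.groupby(by=["month", "year"])["Quantity"].sum().reset_index()
# Rows come out sorted by month, then year:
#   month 1, year 2018 -> 3.0
#   month 1, year 2019 -> 8.0
#   month 2, year 2018 -> 4.0
print(agg)
```

This long format (one row per bar) is what seaborn's hue argument expects.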
Now the dataframe is ready to be plotted using seaborn:
fig, ax = plt.subplots()
sns.barplot(ax = ax, data = df, x = 'month', y = 'Quantity', hue = 'year')
plt.show()
Complete code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame()
df['Datetime'] = pd.date_range(start = '2018-01-07', end = '2021-08-13', freq = '15min')
df['Quantity'] = np.random.rand(len(df))
df['month'] = df['Datetime'].dt.month
df['year'] = df['Datetime'].dt.year
df = df.groupby(by = ['month', 'year'])['Quantity'].sum().reset_index()
fig, ax = plt.subplots()
sns.barplot(ax = ax, data = df, x = 'month', y = 'Quantity', hue = 'year')
plt.show()
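Since the question mentions pandas, a possible alternative without seaborn is to reshape the totals into a month-by-year table and let DataFrame.plot.bar draw one group of bars per month. This is only a sketch on the same kind of fake data; the helper name `monthly_totals_by_year` and the seeded random generator are my own choices for illustration:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

def monthly_totals_by_year(df):
    """Return a table with months as rows, years as columns, summed Quantity as values."""
    return (
        df.assign(month=df["Datetime"].dt.month, year=df["Datetime"].dt.year)
          .pivot_table(index="month", columns="year", values="Quantity", aggfunc="sum")
    )

# Fake data, as in the answer above (seeded so the run is repeatable)
rng = np.random.default_rng(0)
df = pd.DataFrame({"Datetime": pd.date_range("2018-01-07", "2021-08-13", freq="15min")})
df["Quantity"] = rng.random(len(df))

totals = monthly_totals_by_year(df)

# Each column (year) becomes one series of bars; months with no data are left empty
ax = totals.plot.bar(rot=0)
ax.set_xlabel("month")
ax.set_ylabel("total Quantity")
```

Call plt.show() afterwards to display the figure. The wide table is also handy on its own if you want to inspect the monthly totals before plotting.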
Upvotes: 5
Reputation: 63
Perhaps you can extract months and years into new columns and draw one set of bars per year, with months on the x axis, all combined in a single plot. Take a look at the example below, and notice the width parameter and the displacement by the same value in plt.bar, so that the bars don't cover each other.
import pandas as pd
import matplotlib.pyplot as plt
import datetime
# create df
d1 = datetime.date(2018, 8, 30)
d2 = datetime.date(2018, 9, 30)
d3 = datetime.date(2019, 8, 30)
d4 = datetime.date(2019, 9, 30)
df = pd.DataFrame({
'date': [d1, d1, d2, d2, d3, d3, d4, d4],
'values':[10, 20, 40, 40, 50, 55, 65, 70]})
df['month'] = df.date.apply(lambda x: x.month)
df['year'] = df.date.apply(lambda x: x.year)
# make plots: one set of bars per year, shifted by the bar width so they sit side by side
width = 0.4
fig, ax = plt.subplots()
for i, year in enumerate([2018, 2019]):
    # total of 'values' per month for this year (index = month number)
    monthly = df[df.year == year].groupby('month')['values'].sum()
    plt.bar(monthly.index + i * width, monthly, width = width, label = str(year))
plt.legend()
plt.show()
Creating new columns as I did may not be very efficient if you have a very large dataframe. To make the plots, I filtered the rows by year, grouped them by month and summed the values, so the index of each grouped result is the month number.
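If the extra columns are a concern, one option is to group directly on the year and month taken from the dates, without storing them in the dataframe. A rough sketch on the same example data (the names `dates`, `totals` and `wide` are just for illustration):

```python
import datetime
import pandas as pd

d1 = datetime.date(2018, 8, 30)
d2 = datetime.date(2018, 9, 30)
d3 = datetime.date(2019, 8, 30)
d4 = datetime.date(2019, 9, 30)
df = pd.DataFrame({
    "date": [d1, d1, d2, d2, d3, d3, d4, d4],
    "values": [10, 20, 40, 40, 50, 55, 65, 70],
})

# Convert the datetime.date objects once, so the vectorised .dt accessor is available
dates = pd.to_datetime(df["date"])

# Group on year and month without adding columns to df;
# the result is a Series whose index entries are (year, month) tuples
totals = df["values"].groupby(
    [dates.dt.year.rename("year"), dates.dt.month.rename("month")]
).sum()

# Months as rows, years as columns: one column per series of bars
wide = totals.unstack("year")
```

From here, wide.plot.bar() should draw the grouped bars in one call, with pandas handling the bar widths and offsets.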
Upvotes: 1