Reputation: 9
I have hourly ozone data over a multi-year period in a pandas DataFrame. I need to create a plot of the ozone data for every day of the year (i.e., 365 plots per year). The time series is in the following format:
time_lt
3 1980-04-24 17:00:00
4 1980-04-24 18:00:00
5 1980-04-24 19:00:00
6 1980-04-24 20:00:00
7 1980-04-24 21:00:00
8 1980-04-24 22:00:00
9 1980-04-24 23:00:00
10 1980-04-25 00:00:00
11 1980-04-25 01:00:00
12 1980-04-25 02:00:00
13 1980-04-25 03:00:00
14 1980-04-25 04:00:00
How would I group the data by day in order to plot each day separately? What is the most efficient way to code this?
Thanks!
Upvotes: 0
Views: 412
Reputation: 2405
You can group on the fly:
import pandas as pd
from io import StringIO
# columns are separated by two or more spaces, so the single space
# inside each timestamp stays part of the time_lt value
df = pd.read_csv(StringIO(
"""id  time_lt
3  1980-04-24 17:00:00
4  1980-04-24 18:00:00
5  1980-04-24 19:00:00
6  1980-04-24 20:00:00
7  1980-04-24 21:00:00
8  1980-04-24 22:00:00
9  1980-04-24 23:00:00
10  1980-04-25 00:00:00
11  1980-04-25 01:00:00
12  1980-04-25 02:00:00
13  1980-04-25 03:00:00
14  1980-04-25 04:00:00"""), sep=r"\s{2,}", engine="python")
df['time_lt'] = pd.to_datetime(df['time_lt'])
>>> df.groupby(df.time_lt.dt.floor('1D')).count()
            id  time_lt
time_lt
1980-04-24   7        7
1980-04-25   5        5
In theory, you can write a plotting function and apply it directly to the groupby result, but that makes the plotting harder to control. Since plotting will still be the slowest operation in this chain, you can safely iterate over the dates with a simple loop.
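The simple iteration over dates could look roughly like the sketch below. It is only an illustration: the `ozone` column name, the demo values, the helper names (`make_demo_frame`, `split_by_day`, `plot_each_day`), the output file names, and the choice of matplotlib with the non-interactive `Agg` backend are all assumptions, not something taken from the question.

```python
import pandas as pd


def make_demo_frame() -> pd.DataFrame:
    """Build a tiny hourly frame; the `ozone` column and its values are made up."""
    start = pd.Timestamp("1980-04-24 17:00")
    times = start + pd.to_timedelta(list(range(12)), unit="h")
    return pd.DataFrame({"time_lt": times, "ozone": [float(i) for i in range(12)]})


def split_by_day(df, time_col="time_lt"):
    """Yield (day, sub-frame) pairs, one per calendar day present in the data."""
    # floor("D") truncates every timestamp to midnight, so all rows
    # belonging to the same calendar day share one group key
    for day, group in df.groupby(df[time_col].dt.floor("D")):
        yield day, group


def plot_each_day(df, out_dir=".", time_col="time_lt", value_col="ozone"):
    """Save one PNG per day and return the list of written paths."""
    # Imported here so the grouping helpers work without matplotlib installed
    import matplotlib
    matplotlib.use("Agg")  # assumption: render to files, no GUI window
    import matplotlib.pyplot as plt
    from pathlib import Path

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for day, group in split_by_day(df, time_col):
        fig, ax = plt.subplots()
        ax.plot(group[time_col], group[value_col], marker="o")
        ax.set_title(f"Ozone on {day:%Y-%m-%d}")
        ax.set_xlabel("Local time")
        ax.set_ylabel(value_col)
        fig.autofmt_xdate()
        path = out / f"ozone_{day:%Y-%m-%d}.png"
        fig.savefig(path)
        plt.close(fig)  # release the figure so hundreds of plots do not pile up in memory
        paths.append(path)
    return paths
```

Calling `plot_each_day(make_demo_frame(), out_dir="plots")` should write one file per day. Closing each figure after saving matters once the loop runs over several years of data.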
Upvotes: 0
Reputation: 9047
See the comments inline:
df['time_lt'] = pd.to_datetime(df['time_lt'])
# you can extract day, month, year
df['day'] = df['time_lt'].dt.day
df['month'] = df['time_lt'].dt.month
df['year'] = df['time_lt'].dt.year
# then use groupby
grouped = df.groupby(['day', 'month', 'year'])
# now you can plot individual groups
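As a rough sketch of what working with the individual groups might look like: iterating over `grouped` yields a key and a sub-frame per day, and `get_group` fetches a single day by its key. The sample data and the `ozone` column are invented for illustration, as are the names `group_sizes` and `april_25`.

```python
import pandas as pd

# Hypothetical sample: 12 hourly timestamps plus a made-up `ozone` column
start = pd.Timestamp("1980-04-24 17:00")
times = start + pd.to_timedelta(list(range(12)), unit="h")
df = pd.DataFrame({"time_lt": times, "ozone": [float(i) for i in range(12)]})

df["day"] = df["time_lt"].dt.day
df["month"] = df["time_lt"].dt.month
df["year"] = df["time_lt"].dt.year
grouped = df.groupby(["day", "month", "year"])

# Each key is a (day, month, year) tuple, in the same order as the groupby columns
group_sizes = {key: len(group) for key, group in grouped}

# Fetch a single day directly by its key
april_25 = grouped.get_group((25, 4, 1980))
```

Note that groups are iterated in sorted key order, so with `['day', 'month', 'year']` the loop visits every 1st of a month before any 2nd. Grouping by `['year', 'month', 'day']` instead gives chronological order.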
Upvotes: 1