Reputation: 400
Is there a shorter or more elegant way to pivot a timeseries by year in pandas? The code below does what I want but I wonder if there is a better way to accomplish this:
import pandas
import numpy
daterange = pandas.date_range(start='2000-01-01', end='2017-12-31', freq='10T')
# generate a fake timeseries of measured wind speeds from 2000 to 2017 in 10min intervals
wind_speed = pandas.Series(data=numpy.random.rand(daterange.size), index=daterange)
# group by year
wind_speed_groups = wind_speed.groupby(wind_speed.index.year).groups
# assemble data frame with columns of wind speed data for every year
wind_speed_pivot = pandas.DataFrame()
for key, group in wind_speed_groups.items():
series = wind_speed[group]
series.name = key
series.index = series.index - pandas.Timestamp(str(key)+'-01-01')
wind_speed_pivot = wind_speed_pivot.join(series, how='outer')
print(wind_speed_pivot)
Upvotes: 0
Views: 108
Reputation: 13437
I'm not sure if this is the fastest method, as I'm adding two columns to your initial dataframe (it's possible to add just one if you want to overwrite it).
import pandas as pd
import numpy as np
import datetime as dt
daterange = pd.date_range(start='2000-01-01', end='2017-12-31', freq='10T')
# generate a fake timeseries of measured wind speeds from 2000 to 2017 in 10min intervals
wind_speed = pd.Series(data=np.random.rand(daterange.size), index=daterange)
df = wind_speed.to_frame("windspeed")
df["year"] = df.index.year
df["pv_index"] = df.index - df["year"].apply(lambda x: dt.datetime(x,1,1))
wind_speed_pivot = df.pivot_table(index=["pv_index"], columns=["year"], values=["windspeed"])
Upvotes: 1