Reputation: 85
I have some price data as timeseries. I am trying to group and average prices by year and month. Here's the catch, I want to have a year go from Jul-Jun e.g. year 2015 would be Jul 2015 - Jun 2016 (defined as Crop Year) Therefore 'Crop Month' also needs to go 1-12 for Jul-Jun. Is there anyway to achieve this is pandas. The logic in my code seems flawed and doesn't give me the result I want. I tried using a loop with some if statements but my logic got lost somewhere. I have posted the resulting dataframe below. As you can see the crop year data is wrong. Any help with how to approach the code would be great.
´´´
pxdata = ek.get_timeseries(['Wc1','BL2c1', 'W13.5-FOBRU=AGRP'], fields='CLOSE', start_date='2009-01-01')
pxdata.loc[:,'Crop Month'] = 0
pxdata.loc[:,'Crop Year'] = 0
pxdata.iloc[0,-1] = 2015
pxdata.iloc[0,-2] = 8
for i in range(1, len(pxdata)):
if pxdata.index[i].month==7:
pxdata.iloc[i,-2]=1
elif pxdata.index[i].month>pxdata.index[i-1].month:
pxdata.iloc[i,-2]=pxdata.iloc[i-1,-2]+1
else:
pxdata.iloc[i,-2]=pxdata.iloc[i-1,-2]
if pxdata.iloc[i-1,-2]==12 and pxdata.iloc[i,-2]==1 :
pxdata.iloc[i,-1]=pxdata.iloc[i-1,-1]+1
else:
pxdata.iloc[i,-1]=pxdata.iloc[i-1,-1]
´´´
Thanks in advance, guys
Upvotes: 0
Views: 69
Reputation: 56
set your Date as index, so you can use dataframe.loc['2015-07-01':'2016-06-01']
to select your data.
Then create a date-range for your time series which you want to select.
start = pd.date_range('2015-06-01', periods=4, freq='12M')
end = pd.date_range('2016-07-01', periods=4, freq='12M')
for start, end in list(zip(start,end)):
print(start, end)
Now you can select the period by loc[start:end]
.
If you want to select average values, you can use dataframe.mean()
.
Good luck.
Upvotes: 3