Moj
Moj

Reputation: 85

Redefining a year to group in pandas

I have some price data as timeseries. I am trying to group and average prices by year and month. Here's the catch, I want to have a year go from Jul-Jun e.g. year 2015 would be Jul 2015 - Jun 2016 (defined as Crop Year) Therefore 'Crop Month' also needs to go 1-12 for Jul-Jun. Is there anyway to achieve this is pandas. The logic in my code seems flawed and doesn't give me the result I want. I tried using a loop with some if statements but my logic got lost somewhere. I have posted the resulting dataframe below. As you can see the crop year data is wrong. Any help with how to approach the code would be great.

´´´

pxdata = ek.get_timeseries(['Wc1','BL2c1', 'W13.5-FOBRU=AGRP'], fields='CLOSE', start_date='2009-01-01')

pxdata.loc[:,'Crop Month'] = 0
pxdata.loc[:,'Crop Year'] = 0
pxdata.iloc[0,-1] = 2015
pxdata.iloc[0,-2] = 8
for i in range(1, len(pxdata)):
    if pxdata.index[i].month==7:
        pxdata.iloc[i,-2]=1
    elif pxdata.index[i].month>pxdata.index[i-1].month:
        pxdata.iloc[i,-2]=pxdata.iloc[i-1,-2]+1
    else:
        pxdata.iloc[i,-2]=pxdata.iloc[i-1,-2]
    if pxdata.iloc[i-1,-2]==12 and pxdata.iloc[i,-2]==1 :
        pxdata.iloc[i,-1]=pxdata.iloc[i-1,-1]+1
    else:
        pxdata.iloc[i,-1]=pxdata.iloc[i-1,-1]

´´´

Thanks in advance, guys

enter image description here

Upvotes: 0

Views: 69

Answers (1)

idonthaveanyplan
idonthaveanyplan

Reputation: 56

set your Date as index, so you can use dataframe.loc['2015-07-01':'2016-06-01'] to select your data.

pandas.DataFrame.loc

Then create a date-range for your time series which you want to select.

start = pd.date_range('2015-06-01', periods=4, freq='12M') end = pd.date_range('2016-07-01', periods=4, freq='12M') for start, end in list(zip(start,end)): print(start, end)

pandas.date_range

Now you can select the period by loc[start:end].

If you want to select average values, you can use dataframe.mean().

pandas.DataFrame.mean

Good luck.

Upvotes: 3

Related Questions