user357269
user357269

Reputation: 1913

Getting the last value for each week, with the matching date

So I start out with a pd.Series called jpm, and I would like to group it into weeks and take the last value from each week. This works with the code below, it does get the last value. But it changes corresponding index to the Sunday of the week, and I would like it to leave it unchaged.

import pandas_datareader.data as web
import pandas as pd

start = pd.datetime(2015, 11, 1)
end = pd.datetime(2015, 11, 17)

raw_jpm = web.DataReader("JPM", 'yahoo', start, end)["Adj Close"]
jpm = raw_jpm.ix[raw_jpm.index[::2]]

jpm is now

Date
2015-11-02    64.125610
2015-11-04    64.428918
2015-11-06    66.982593
2015-11-10    66.219427
2015-11-12    64.575682
2015-11-16    65.074678
Name: Adj Close, dtype: float64

I want to do some operations to it, such as

weekly = jpm.groupby(pd.TimeGrouper('W')).last()

weekly is now

Date
2015-11-08    66.982593
2015-11-15    64.575682
2015-11-22    65.074678
Freq: W-SUN, Name: Adj Close, dtype: float64

which is great, except all my dates got changed. The output I want, is:

Date
2015-11-06    66.982593
2015-11-12    64.575682
2015-11-16    65.074678

Upvotes: 2

Views: 1081

Answers (3)

user357269
user357269

Reputation: 1913

It seems a little tricky to do this in pure pandas, so I used numpy

import numpy as np
weekly = jpm.groupby(pd.TimeGrouper('W-SUN')).last()
weekly.index = jpm.index[np.searchsorted(jpm.index, weekly.index, side="right")-1]

Upvotes: 1

Nickil Maveli
Nickil Maveli

Reputation: 29711

You could provide a DateOffset by specifying the class name Week and indicating the weekly frequency W-FRI, by setting the dayofweek property as 4 [Monday : 0 → Sunday : 6]

jpm.groupby(pd.TimeGrouper(freq=pd.offsets.Week(weekday=4))).last().tail(5)

Date
2016-08-19    65.860001
2016-08-26    66.220001
2016-09-02    67.489998
2016-09-09    66.650002
2016-09-16    65.820000
Freq: W-FRI, Name: Adj Close, dtype: float64

If you want the starting date as the next monday from start date and the previous sunday from the end date, you could do this way:

from datetime import datetime, timedelta

start = datetime(2015, 11, 1)
monday = start + timedelta(days=(7 - start.weekday())) 

end = datetime(2016, 9, 30)
sunday = end - timedelta(days=end.weekday() + 1)

print (monday)
2015-11-02 00:00:00
print (sunday)
2016-09-25 00:00:00

Then, use it as:

jpm = web.DataReader('JPM', 'yahoo', monday, sunday)["Adj Close"]
jpm.groupby(pd.TimeGrouper(freq='7D')).last()

To get it all on a Sunday, as you specified the range Monday → Sunday and Sunday being the last day for the date to be considered, you could do a small hack:

monday_new = monday - timedelta(days=3)

jpm = web.DataReader('JPM', 'yahoo', monday_new, sunday)["Adj Close"]
jpm.groupby(pd.TimeGrouper(freq='W')).last().head()

Date
2015-11-01    62.863448
2015-11-08    66.982593
2015-11-15    64.145175
2015-11-22    66.082449
2015-11-29    65.720431
Freq: W-SUN, Name: Adj Close, dtype: float64

Now that you've posted the desired output, you can arrive at the result using transform method instead of taking the aggregated last, so that it returns an object that is indexed the same size as the one being grouped.

df = jpm.groupby(pd.TimeGrouper(freq='W')).transform('last').reset_index(name='Last')

df

Image

df['counter'] = (df['Last'].shift() != df['Last']).astype(int).cumsum()

Image

df.groupby(['Last','counter'])['Date'].apply(lambda x: np.array(x)[-1])   \
  .reset_index().set_index('Date').sort_index()['Last']

Date
2015-11-06    66.982593
2015-11-12    64.575682
2015-11-16    65.074678
Name: Last, dtype: float64

Note: This is capable of handling repeated entries that occur in two separate dates due to the inclusion of the counter column which bins them separately into two buckets.

Upvotes: 1

MaxU - stand with Ukraine
MaxU - stand with Ukraine

Reputation: 210832

you can do it this way:

In [15]: jpm
Out[15]:
Date
2015-11-02    64.125610
2015-11-04    64.428918
2015-11-06    66.982593
2015-11-10    66.219427
2015-11-12    64.575682
2015-11-16    65.074678
Name: Adj Close, dtype: float64

In [16]: jpm.groupby(jpm.index.week).transform('last').drop_duplicates(keep='last')
Out[16]:
Date
2015-11-06    66.982593
2015-11-12    64.575682
2015-11-16    65.074678
dtype: float64

Explanation:

In [17]: jpm.groupby(jpm.index.week).transform('last')
Out[17]:
Date
2015-11-02    66.982593
2015-11-04    66.982593
2015-11-06    66.982593
2015-11-10    64.575682
2015-11-12    64.575682
2015-11-16    65.074678
dtype: float64

Upvotes: 1

Related Questions