Reputation: 709
I had the following stock daily price data from 2010/12 to 2017/12. How could i select last week's data of each year? I intend to check the performance for the last week of each year.
2017-01-05 52.99 13018070.0 52.370 53.0600 51.4000
2017-01-04 52.86 12556860.0 50.770 53.3400 50.7300
2017-01-03 50.29 15794400.0 48.800 50.3000 48.4700
2016-12-30 46.75 13593420.0 48.365 48.4000 46.3600
2016-12-29 47.77 11728250.0 48.440 48.8600 47.1800
2016-12-28 48.51 14636340.0 50.580 50.7300 48.4700
2016-12-27 50.43 5594876.0 49.690 50.5500 49.6500
2016-12-23 49.59 6966559.0 49.250 49.7200 48.9900
2016-12-22 49.44 10918300.0 50.320 50.5500 49.1711
2016-12-21 50.34 9279635.0 49.820 50.4400 49.6700
2016-12-20 49.53 9533020.0 48.990 49.7900 48.9100
2016-12-19 48.55 10323930.0 47.450 48.6700 47.4300
...
2010-12-20 ...
Upvotes: 1
Views: 1673
Reputation: 18906
You could use groupby passing the datetime year. But first we would need to remove (filter away) data that doesn't meet your criteria. Also make sure your date is datetime.
This code will check that month is equal to December (12) and that the day is larger or equal of 25 (i.e. the last 7 days of each year). If you want the last week of the year you could look at Wen's lambda function.
data = '''\
2017-12-25 52.99 13018070.0 52.370 53.0600 51.4000
2017-01-04 52.86 12556860.0 50.770 53.3400 50.7300
2017-01-03 50.29 15794400.0 48.800 50.3000 48.4700
2016-12-30 46.75 13593420.0 48.365 48.4000 46.3600
2016-12-29 47.77 11728250.0 48.440 48.8600 47.1800
2016-12-28 48.51 14636340.0 50.580 50.7300 48.4700
2016-12-27 50.43 5594876.0 49.690 50.5500 49.6500
2016-12-23 49.59 6966559.0 49.250 49.7200 48.9900
2016-12-22 49.44 10918300.0 50.320 50.5500 49.1711
2016-12-21 50.34 9279635.0 49.820 50.4400 49.6700
2016-12-20 49.53 9533020.0 48.990 49.7900 48.9100
2016-12-19 48.55 10323930.0 47.450 48.6700 47.4300'''
import io
import pandas as pd
df = pd.read_csv(io.StringIO(data), sep='\s+', header=None, parse_dates=[0])
df = df[df[0].dt.month.eq(12) & df[0].dt.day.le(25)] # remove data
# Groupby year according to: https://stackoverflow.com/a/11397052/7386332
for idx, dfx in df.groupby(df[0].map(lambda x: x.year)):
print('Dataframe containing {}\'s last week:'.format(idx))
print(dfx)
print()
Prints
Dataframe containing 2016's last week:
0 1 2 3 4 5
7 2016-12-23 49.59 6966559.0 49.25 49.72 48.9900
8 2016-12-22 49.44 10918300.0 50.32 50.55 49.1711
9 2016-12-21 50.34 9279635.0 49.82 50.44 49.6700
10 2016-12-20 49.53 9533020.0 48.99 49.79 48.9100
11 2016-12-19 48.55 10323930.0 47.45 48.67 47.4300
Dataframe containing 2017's last week:
0 1 2 3 4 5
0 2017-12-25 52.99 13018070.0 52.37 53.06 51.4
Upvotes: 2
Reputation: 323226
The data from Anton :-)
df[df.groupby(df[0].dt.year)[0].apply(lambda x : x.dt.week==x.dt.week.max())]
Out[1471]:
0 1 2 3 4 5
0 2017-12-25 52.99 13018070.0 52.370 53.06 51.40
3 2016-12-30 46.75 13593420.0 48.365 48.40 46.36
4 2016-12-29 47.77 11728250.0 48.440 48.86 47.18
5 2016-12-28 48.51 14636340.0 50.580 50.73 48.47
6 2016-12-27 50.43 5594876.0 49.690 50.55 49.65
Upvotes: 2