Reputation: 473
Currently I have a time series data frame as follows:
dfMain =
Date Portfolio Value
0 2016-07-01 1.000000e+06
1 2016-07-08 1.025168e+06
2 2016-07-15 1.028053e+06
3 2016-07-22 1.024184e+06
4 2016-07-29 1.022491e+06
5 2016-08-05 1.023241e+06
6 2016-08-12 1.030325e+06
7 2016-08-19 1.032742e+06
8 2016-08-26 1.032567e+06
9 2016-09-02 1.028614e+06
10 2016-09-09 9.930876e+05
11 2016-09-16 9.956875e+05
12 2016-09-23 1.010174e+06
13 2016-09-30 1.010388e+06
14 2016-10-07 1.004989e+06
15 2016-10-14 9.924929e+05
16 2016-10-21 9.969708e+05
17 2016-10-28 9.816373e+05
18 2016-11-04 9.563689e+05
19 2016-11-11 9.869579e+05
20 2016-11-18 9.936929e+05
21 2016-11-25 1.009625e+06
The dataframe can differ from this example, so I can't just hard-code specific rows. What would be the best way to pull the rows whose dates are closest to the end of each month? For example, index 4 (2016-07-29) would be pulled because it is the last date in July.
Any tips would be greatly appreciated!
Upvotes: 0
Views: 56
Reputation: 402323
Group on the month number and find the last record:
df.Date = pd.to_datetime(df.Date, errors='coerce')
df.groupby(df.Date.dt.month).last()
Date Portfolio Value
Date
7 2016-07-29 1022491.0
8 2016-08-26 1032567.0
9 2016-09-30 1010388.0
10 2016-10-28 981637.3
11 2016-11-25 1009625.0
If the rows aren't sorted by Date, call sort_values first:
df.sort_values('Date').groupby(df.Date.dt.month).last()
Date Portfolio Value
Date
7 2016-07-29 1022491.0
8 2016-08-26 1032567.0
9 2016-09-30 1010388.0
10 2016-10-28 981637.3
11 2016-11-25 1009625.0
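This works whatever the original row order, because the month key passed to groupby is aligned to the sorted frame by index.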
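To see why the sort matters, here is a minimal sketch using a few rows from the question, deliberately shuffled. The variable names `unsorted_last` and `sorted_last` are just illustrative.

```python
import pandas as pd

# A few rows from the question, deliberately out of order.
df = pd.DataFrame({
    "Date": ["2016-07-29", "2016-07-01", "2016-08-26", "2016-08-05", "2016-07-15"],
    "Portfolio Value": [1.022491e06, 1.000000e06, 1.032567e06, 1.023241e06, 1.028053e06],
})
df["Date"] = pd.to_datetime(df["Date"], errors="coerce")

# Without sorting, last() returns whichever row happens to come last
# positionally within each month, not the latest date.
unsorted_last = df.groupby(df["Date"].dt.month).last()

# Sorting first makes "last" mean "latest date in the month".
sorted_last = df.sort_values("Date").groupby(df["Date"].dt.month).last()

print(unsorted_last)
print(sorted_last)
```

In the unsorted case July resolves to 2016-07-15, while the sorted version picks 2016-07-29 as intended.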
If your dates span multiple years, group on year and month instead; grouping on the month number alone would merge the same month from different years:
df.sort_values('Date').groupby([df.Date.dt.year, df.Date.dt.month]).last()
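As a variation on the year-month grouping, a monthly period can serve as a single key. This is only a sketch: the two-year sample data below is made up for illustration, and `dt.to_period("M")` is swapped in for the `[year, month]` list.

```python
import pandas as pd

# Hypothetical data spanning two years (values are invented for the example).
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2016-07-01", "2016-07-29", "2017-07-07", "2017-07-28", "2017-08-25"]
    ),
    "Portfolio Value": [1.00e6, 1.02e6, 1.10e6, 1.12e6, 1.15e6],
})

# Month number alone lumps July 2016 together with July 2017.
by_month = df.sort_values("Date").groupby(df["Date"].dt.month).last()

# A monthly Period keeps each calendar month separate.
by_year_month = df.sort_values("Date").groupby(df["Date"].dt.to_period("M")).last()

print(by_month)
print(by_year_month)
```

The result of the second call is indexed by a PeriodIndex (2016-07, 2017-07, 2017-08) rather than a two-level (year, month) index, which may or may not be what you want downstream.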
Upvotes: 2
Reputation: 8631
You need to sort the dates and then find the last value for each group.
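df['Date'] = pd.to_datetime(df['Date'])
# Sort chronologically, then group by month number
grp = df.sort_values('Date').groupby(df['Date'].dt.month)
# Take the last (latest) row of each group; the original row index is kept
pd.DataFrame([grp.get_group(x).iloc[-1] for x in grp.groups])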
Output:
Date Portfolio Value
4 2016-07-29 1022491.0
8 2016-08-26 1032567.0
13 2016-09-30 1010388.0
17 2016-10-28 981637.3
21 2016-11-25 1009625.0
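If keeping the original row labels is the goal (the question mentions index 4), one possible alternative is to look up the label of the latest date per month with `idxmax` and select those rows. This is a sketch, not the method used in the answer above; the names `last_labels` and `month_end_rows` are illustrative, and the data is a subset of the question's rows.

```python
import pandas as pd

# A subset of the question's rows, with a default 0..4 index.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2016-07-01", "2016-07-29", "2016-08-05", "2016-08-26", "2016-09-02"]
    ),
    "Portfolio Value": [1.000000e06, 1.022491e06, 1.023241e06, 1.032567e06, 1.028614e06],
})

# For each calendar month, idxmax gives the row label holding the latest Date.
last_labels = df.groupby(df["Date"].dt.to_period("M"))["Date"].idxmax()

# Select those rows from the original frame; their labels are preserved.
month_end_rows = df.loc[last_labels]

print(month_end_rows)
```

Because `idxmax` compares the dates themselves, this does not depend on the rows being sorted beforehand.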
Upvotes: 1