ben121
ben121

Reputation: 897

Select row if within date range

I have the dataframe below.

df = pd.DataFrame({'Date': ['01/12/2017', '01/01/2018',' 01/02/2018', '01/03/2018'], 'Output': [17, 16, 4, 21]})

I want to be able to take the latest 'output' number from the above dataframe, depending on a specific date. e.g

Date           Output
24/12/2017     17
13/01/2018     16
21/01/2018     16
14/02/2018     4
13/03/2018     21

I have looked at slicing by date range but it is almost the reverse of that.

Upvotes: 0

Views: 38

Answers (1)

jezrael
jezrael

Reputation: 862441

Use merge_asof with sort_values if Date columns are not sorted:

print (df)
         Date
0  24/12/2017
1  13/01/2018
2  21/01/2018
3  14/02/2018
4  13/03/2018

df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
df1 = pd.DataFrame({'Date': ['01/12/2017', '01/01/2018',' 01/02/2018', '01/03/2018'], 
                    'Output': [17, 16, 4, 21]})
df1['Date'] = pd.to_datetime(df1['Date'], dayfirst=True)
print (df1)
        Date  Output
0 2017-12-01      17
1 2018-01-01      16
2 2018-02-01       4
3 2018-03-01      21

#if necessary sort DataFrames by Dates
df = df.sort_values('Date')
df1 = df1.sort_values('Date')

df2 = pd.merge_asof(df, df1, on='Date')
print (df2)
        Date  Output
0 2017-12-24      17
1 2018-01-13      16
2 2018-01-21      16
3 2018-02-14       4
4 2018-03-13      21

Upvotes: 1

Related Questions