Reputation: 173
Dataframe A ('df_a') contains location-split temperature values at re-sampled 5-minute intervals:
logtime_round | location | value
2017-05-01 06:05:00 | 0 | 17
2017-05-01 06:05:00 | 1 | 14.5
2017-05-01 06:05:00 | 2 | 14.5
etc...
Dataframe B ('df_b') contains temperature values (re-sampled from hourly to daily):
logtime_round | airtemp
2017-05-01 | 10.33333
2017-05-02 | 10.42083
etc...
I have manipulated df_b so that only airtemp (format: datetime64[ns]) <= 15.5 are included, and now would like to manipulate df_a so that a new dataframe is created featuring only the same days included in df_b (I'm only interested in locations and values when outdoor air temperature was below <= 15.5).
Is this possible?
My first plan was to join the two dataframes and then look to remove any NaN airtemp values to get my desired df, however, the df_b airtemp is only featured for the first row (e.g. for 2017-05-01) with the rest as NaNs. So perhaps the df_b daily airtemp can be duplicated across all rows in the same day?
joindf = df_a.join(df_b)
Thanks!
Upvotes: 1
Views: 32
Reputation: 93181
Use merge_asof
(assuming both frames have been sorted by time):
pd.merge_asof(df_a, df_b, on='logtime_round')
Upvotes: 1