Zanam
Zanam

Reputation: 4807

Selecting minimum from dataframe based on range from another dataframe

I have a datafrme df1 as:

Type    StDt    EnDt
A   1/2/2012    1/4/2012
B   1/6/2012    1/6/2012

I have another dataframe df2 with all dates until 2019 as:

             KBWI
Date             
2012-01-02  45.00
2012-01-03  32.00
2012-01-04  14.00
2012-01-05  26.00
2012-01-06  27.00

For each row in df1, I need to use the date range StDt, EnDt to extract all rows from df2 and take its minimum to get the following:

Type    StDt    EnDt       Minimum
A   1/2/2012    1/4/2012   14.00
B   1/6/2012    1/6/2012   27.00

I am not sure how to do it efficiently as dataframes are large.

Upvotes: 2

Views: 56

Answers (2)

sammywemmy
sammywemmy

Reputation: 28709

An old question; pd.merge_asof should work nicely here, especially as one of the dataframes just has two rows (end goal is a minimum):

(pd.merge_asof(df2, df1, left_index=True, right_on='StDt')
  .query('Date <= EnDt')
  .groupby('Type')
  .min()
)
Out[204]: 
      KBWI       StDt       EnDt
Type                            
A     14.0 2012-01-02 2012-01-04
B     27.0 2012-01-06 2012-01-06

Upvotes: 0

cs95
cs95

Reputation: 402852

Perliminary preparation: conversion of all involved columns and indices to datetime.

df[['StDt', 'EnDt']] = df[['StDt', 'EnDt']].apply(pd.to_datetime, errors='coerce') 
df2.index = pd.to_datetime(df2.index, errors='coerce')

df

  Type       StDt       EnDt
0    A 2012-01-02 2012-01-04
1    B 2012-01-06 2012-01-06

df2
            KBWI
Date            
2012-01-02  45.0
2012-01-03  32.0
2012-01-04  14.0
2012-01-05  26.0
2012-01-06  27.0

One simple method is to use pd.IntervalIndex and groupby to find the minimum:

idx = pd.IntervalIndex.from_arrays(df['StDt'], df['EnDt'], closed='both')
df['Minimum'] = df2['KBWI'].groupby(idx.get_indexer_non_unique(df2.index)).min()
df

  Type       StDt       EnDt  Minimum
0    A 2012-01-02 2012-01-04     14.0
1    B 2012-01-06 2012-01-06     27.0

This works assuming df's index is also a RangeIndex (numeric, monotonically increasing).

Upvotes: 2

Related Questions