Winston
Winston

Reputation: 1428

How to apply ranged lookup on pandas dataframe?

I have the following dataframe

Name   X 
Jack   2
Ann    4
John   18

and the following lookup table

X_Min X_Max Y
      2     2
3     7     4
8     15    9
16          25

Which X_Min and X_Max is the range of lookup value X in the first dataframe. I would like to apply the lookup table to my dataframe to have the result below

Name   X    Y(Targeted)
Jack   2    2
Ann    4    4
John   18   25

Thank you very much

Upvotes: 1

Views: 33

Answers (1)

jezrael
jezrael

Reputation: 863246

Use IntervalIndex with replace first not defined value to -inf in X_Min and last X_Max missing value for np.inf:

print (df2)
   X_Min  X_Max   Y
0    NaN    2.0   2
1    3.0    7.0   4
2    8.0   15.0   9
3   16.0    NaN  25


i = pd.IntervalIndex.from_arrays(df2['X_Min'].fillna(-np.inf), 
                                 df2['X_Max'].fillna(np.inf), 'both')
print (i)
IntervalIndex([[-inf, 2.0], [3.0, 7.0], [8.0, 15.0], [16.0, inf]],
              closed='both',
              dtype='interval[float64]')

So possible filter values with DataFrame.loc:

df1['Y(Targeted)'] = df2.set_index(i).loc[df1['X'], 'Y'].values
print (df1)

   Name   X  Y(Targeted)
0  Jack   2            2
1   Ann   4            4
2  John  18           25

EDIT:

If want working only with X_Max column:

i = pd.IntervalIndex.from_arrays(df2['X_Max'].add(1).shift().fillna(-np.inf), 
                                 df2['X_Max'].fillna(np.inf), 'both')
print (i)
IntervalIndex([[-inf, 2.0], [3.0, 7.0], [8.0, 15.0], [16.0, inf]],
              closed='both',
              dtype='interval[float64]')

Upvotes: 2

Related Questions