How do you lookup in range

Question

I have 2 data frames that I would like to return the values in a range (-1, 0, +1). One of the data frames contains Id's that i would like to look up and the other data frame contains Id's & values. For example, I want to lookup 99, 55, 117 in another data frame and return 100 99 98, 56 55 54, 118 117 116. As you can see it getting the values -1 and +1 of the Id's I would like to lookup. There is a better example below.

df    = pd.DataFrame([[99],[55],[117]],columns = ['Id'])

df2    = pd.DataFrame([[100,1,2,4,5,6,8],
               [87,1,6,20,22,23,34],
               [99,1,12,13,34,45,46],
               [64,1,10,14,29,32,33],
               [55,1,22,13,23,33,35],
               [66,1,6,7,8,9,10],
               [77,1,2,3,5,6,8],
               [811,1,2,5,6,8,10], 
               [118,1,7,8,22,44,56],
               [117,1,66,44,47,87,91]],
               columns = ['Id', 'Num1','Num2','Num3','Num4','Num5','Num6'])

I would like my result to something like this below.

results    = pd.DataFrame([[87,1,6,20,22,23,34],
               [99,1,12,13,34,45,46],
               [64,1,10,14,29,32,33],
               [64,1,10,14,29,32,33],
               [55,1,22,13,23,33,35],
               [66,1,6,7,8,9,10],
               [118,1,7,8,22,44,56],
               [117,1,66,44,47,87,91]],
               columns = ['Id', 'Num1','Num2','Num3','Num4','Num5','Num6'])

inquirer · Accepted Answer

import pandas as pd
import numpy as np


ind = df2[df2['Id'].isin(df['Id'])].index
aaa = np.array([[ind[i]-1,ind[i],ind[i]+1] for i in range(len(ind))]).ravel()
aaa = aaa[(aaa <= df2.index.values[-1]) & (aaa >= 0)]
df_test = df2.loc[aaa, :].reset_index().drop(['index'], axis=1)

print(df_test)

Output

    Id  Num1  Num2  Num3  Num4  Num5  Num6
0   87     1     6    20    22    23    34
1   99     1    12    13    34    45    46
2   64     1    10    14    29    32    33
3   64     1    10    14    29    32    33
4   55     1    22    13    23    33    35
5   66     1     6     7     8     9    10
6  118     1     7     8    22    44    56
7  117     1    66    44    47    87    91

Here, in the ind list, indexes are obtained where there are the required Ids in df2.

The aaa list creates ranges for these indexes, then the lists are wrapped in np.array, ravel() is used to concatenate them. Next, the list aaa is overwritten, the elements that are greater than the maximum index df2 are removed.

Sampling occurs through loc.

Update 17.12.2022

if you need duplicate rows.

df = pd.DataFrame([[99], [55], [117], [117]], columns=['Id'])
lim_ind = df2.index[-1]


def my_func(i):
    a = df2[df2['Id'].isin([i])].index.values
    a = np.array([a - 1, a, a + 1]).ravel()
    a = a[(a >= 0) & (a <= lim_ind)]

    return a


qqq = [my_func(i) for i in df['Id']]
fff = np.array([df2.loc[qqq[i]].values for i in range(len(qqq))], dtype=object)
fff = np.vstack(fff)
result = pd.DataFrame(fff, columns=df2.columns)

print(result)

Output

    Id  Num1  Num2  Num3  Num4  Num5  Num6
0   87     1     6    20    22    23    34
1   99     1    12    13    34    45    46
2   64     1    10    14    29    32    33
3   64     1    10    14    29    32    33
4   55     1    22    13    23    33    35
5   66     1     6     7     8     9    10
6  118     1     7     8    22    44    56
7  117     1    66    44    47    87    91
8  118     1     7     8    22    44    56
9  117     1    66    44    47    87    91

How do you lookup in range

Answers (1)

Related Questions