bmaster69
bmaster69

Reputation: 131

how to get value from the nearest column?

Training dataset

df_table = pd.DataFrame()
df_table["A"] = [0.2, 0.2, 0.2, 0.2, 0.2,
                 0.4, 0.4, 0.4, 0.4, 0.4,
                 0.6, 0.6, 0.6, 0.6, 0.6,
                 1.5, 1.5, 1.5, 1.5, 1.5,
                 2.5, 2.5, 2.5, 2.5, 2.5,
                 3.0, 3.0, 3.0, 3.0, 3.0]

df_table[450] = [1, 5, 20, 30, 40,
                 1, 5, 8, 10, 20,
                 1, 5, 10, 15, 25,
                 2, 7, 15, 20, 30,
                 2, 7, 15, 20, 35,
                 2, 8, 20, 30, 40]

df_table[500] = [3, 15, 25, 60, 80,
                 4, 10, 15, 20, 30,
                 5, 10, 15, 30, 40,
                 5, 10, 20, 30, 45,
                 7, 10, 20, 35, 50,
                 8, 15, 25, 60, 80]

if the input value of is A = 0.2, and temperature = 476, I want to get the value from the column closest to 476, in this case value from the 500 which will be 3.

Similarly, how can I get both values from 450 and 500. For this instance 1 and 3. So that would be value from the closest smaller column and closest higher column.

Upvotes: 1

Views: 67

Answers (1)

Patrick Artner
Patrick Artner

Reputation: 51683

You need to calculate the correct columns to use based on the available columns and your value.

Then localize the first value of your dataframe where A fullfills your searched value and query the columns you just calculated:

import pandas as pd

df_table = pd.DataFrame()
df_table["A"] = [0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.4, 0.4, 0.4, 0.4,
                 0.6, 0.6, 0.6, 0.6, 0.6, 1.5, 1.5, 1.5, 1.5, 1.5,
                 2.5, 2.5, 2.5, 2.5, 2.5, 3.0, 3.0, 3.0, 3.0, 3.0]

df_table[450] = [1, 5, 20, 30, 40, 1, 5, 8, 10, 20,
                  1, 5, 10, 15, 25, 2, 7, 15, 20, 30,
                  2, 7, 15, 20, 35, 2, 8, 20, 30, 40]

df_table[500] = [3, 15, 25, 60, 80, 4, 10, 15, 20, 30,
                  5, 10, 15, 30, 40, 5, 10, 20, 30, 45,
                  7, 10, 20, 35, 50, 8, 15, 25, 60, 80]

# integer columns harvested from the dataframe
T_values = [n for n in df_table.columns if isinstance(n,int)]

# what I want to query
myA = 0.2
myT = 476

# get the column names that are near myT
minColNear_myT = max(a for a in T_values if a <= myT)
maxColNear_myT = min(a for a in T_values if a >= myT)

# localize the first one where myA matches    
first_one_matching_myA = df_table.loc[(df_table['A'] == myA).idxmax()]

# output the values of the columns near myT
print(minColNear_myT, myT, first_one_matching_myA[minColNear_myT])
print(maxColNear_myT, myT, first_one_matching_myA[maxColNear_myT])

Output:

450 476 1.0
500 476 3.0

To only output the closer column compare myT to the both columns that exists and use the one where the difference is lowest:

closest = sorted( (k for k in T_values), key = lambda x:abs(x - myT) )[0]
print(closest,  myT, first_one_matching_myA[closest])

Upvotes: 1

Related Questions