Reputation: 616
I have a dataframe, df1:
1 Amazon 1 x 0.0 1.0 2.0 3.0 4.0
2 Amazon 1 y 0.0 0.4 0.8 1.2 1.6
4 Amazon 2 x 0.0 2.0 4.0 6.0 8.0
5 Amazon 2 y 0.0 1.0 2.0 3.0 4.0
df2:
Amazon 1 1
Amazon 2 2.3
Netflix 1 4.1
Netflix 2 5.5
Given these two dataframes, I am trying to use linear interpolation to find the y values for df2, using the breakpoints in df1.
Expected output:
Amazon 1 1 ...
Amazon 2 2.3 ...
The formula for linear interpolation is: y = y1 + ((x - x1) / (x2 - x1)) * (y2 - y1), where x is the known value, y is the unknown value, (x1, y1) is the breakpoint just below the known x value, and (x2, y2) is the breakpoint just above it.
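To make the formula concrete, here is a minimal sketch that applies it by hand to the Amazon, segment 2 row of df2 (x = 2.3), using the two neighbouring breakpoints from df1. The helper name `lerp` is hypothetical, not part of any library.

```python
def lerp(x, x1, y1, x2, y2):
    """Linearly interpolate between (x1, y1) and (x2, y2) at x."""
    return y1 + ((x - x1) / (x2 - x1)) * (y2 - y1)

# Amazon, segment 2: x = 2.3 lies between the breakpoints x1 = 2.0 and
# x2 = 4.0, whose y values are y1 = 1.0 and y2 = 2.0.
y = lerp(2.3, 2.0, 1.0, 4.0, 2.0)
print(round(y, 2))
```

So for that row I would expect a y of about 1.15.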
Upvotes: 0
Views: 782
Reputation: 2405
The format of df1 seems weird (data points in columns, not rows).
The following is far from the cleanest solution:
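import numpy as np

# Transpose so that each (Name, Segment, Axis) triple becomes a column of breakpoints.
lookup_df = df1.set_index(["Name", "Segment", "Axis"]).T

def find_interp(row):
    key = (row["Name"], row["Segment"])
    try:
        res = np.interp([row["x"]], lookup_df[key + ("x",)], lookup_df[key + ("y",)])
    except KeyError:
        # df1 has no breakpoints for this Name/Segment pair.
        res = [np.nan]
    return res[0]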
>>> df2["y"] = df2.apply(find_interp, axis=1)
>>> df2
Name Segment x y
0 Amazon 1 1.0 0.40
1 Amazon 2 2.3 1.15
2 Netflix 1 4.1 NaN
3 Netflix 2 5.5 NaN
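For anyone who wants to run this end to end, here is a self-contained sketch of a somewhat tidier variant. It collects the breakpoints into a dictionary keyed by (Name, Segment) and then calls np.interp per row. The breakpoint column labels p0 to p4 are an assumption, since the question does not show the headers of df1, and the names `curves` and `interp_row` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Data from the question. The labels p0..p4 are assumed; the original
# column headers of df1 are not shown.
point_cols = ["p0", "p1", "p2", "p3", "p4"]
df1 = pd.DataFrame(
    [
        ["Amazon", 1, "x", 0.0, 1.0, 2.0, 3.0, 4.0],
        ["Amazon", 1, "y", 0.0, 0.4, 0.8, 1.2, 1.6],
        ["Amazon", 2, "x", 0.0, 2.0, 4.0, 6.0, 8.0],
        ["Amazon", 2, "y", 0.0, 1.0, 2.0, 3.0, 4.0],
    ],
    columns=["Name", "Segment", "Axis"] + point_cols,
)
df2 = pd.DataFrame(
    {
        "Name": ["Amazon", "Amazon", "Netflix", "Netflix"],
        "Segment": [1, 2, 1, 2],
        "x": [1.0, 2.3, 4.1, 5.5],
    }
)

# Map each (Name, Segment) pair to its (x breakpoints, y breakpoints) arrays.
curves = {}
for (name, segment), grp in df1.groupby(["Name", "Segment"]):
    pts = grp.set_index("Axis")[point_cols]
    curves[(name, segment)] = (
        pts.loc["x"].to_numpy(dtype=float),
        pts.loc["y"].to_numpy(dtype=float),
    )

def interp_row(row):
    """Interpolate y at row['x'], or return NaN if df1 has no matching curve."""
    curve = curves.get((row["Name"], row["Segment"]))
    if curve is None:
        return np.nan
    xp, fp = curve
    return float(np.interp(row["x"], xp, fp))

df2["y"] = df2.apply(interp_row, axis=1)
print(df2)
```

Note that np.interp expects the x breakpoints in increasing order, and by default it returns the first or last y value for an x outside the breakpoint range rather than extrapolating.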
Upvotes: 1