tj judge

Reputation: 616

Linear interpolation to find y values

I have a dataframe, df1:

1  Amazon        1      x  0.0     1.0     2.0    3.0    4.0
2  Amazon        1      y  0.0     0.4     0.8    1.2    1.6
4  Amazon        2      x  0.0     2.0     4.0    6.0    8.0
5  Amazon        2      y  0.0     1.0     2.0    3.0    4.0

df2:

 Amazon   1       1
 Amazon   2       2.3
 Netflix  1       4.1
 Netflix  2       5.5

Given these two dataframes, I am trying to use linear interpolation to find the y values for df2, using the breakpoints in df1.

Expected output:

   Amazon   1       1    ...
   Amazon   2       2.3  ...

The formula for linear interpolation is: y = y1 + ((x - x1) / (x2 - x1)) * (y2 - y1), where x is the known value, y is the unknown value, (x1, y1) is the breakpoint just below the known x value, and (x2, y2) is the breakpoint just above it.
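To make the formula concrete, here is a minimal sketch that applies it by hand to the Amazon, segment 2 row (x = 2.3) and compares the result with `np.interp`. The helper name `lerp` is just an illustration, not something from my actual code, and the bracketing breakpoints are picked manually here.

```python
import numpy as np

def lerp(x, x1, y1, x2, y2):
    """Linearly interpolate y at x between the points (x1, y1) and (x2, y2)."""
    return y1 + (x - x1) / (x2 - x1) * (y2 - y1)

# Amazon, segment 2: x = 2.3 lies between breakpoints (2.0, 1.0) and (4.0, 2.0)
manual = lerp(2.3, 2.0, 1.0, 4.0, 2.0)

# np.interp locates the bracketing breakpoints itself
xs = [0.0, 2.0, 4.0, 6.0, 8.0]
ys = [0.0, 1.0, 2.0, 3.0, 4.0]
auto = float(np.interp(2.3, xs, ys))

print(round(manual, 6), round(auto, 6))
```

Both values should come out at roughly 1.15, which is the y I would expect for that row.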

Upvotes: 0

Views: 782

Answers (1)

crayxt

Reputation: 2405

The format of df1 is unusual: the data points are stored in columns rather than rows.

The following works, although it is not the cleanest solution:

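import numpy as np

# Transpose so each (Name, Segment, Axis) triple becomes a column of breakpoints
lookup_df = df1.set_index(["Name", "Segment", "Axis"]).T

def find_interp(row):
    key = (row["Name"], row["Segment"])
    try:
        xp = lookup_df[key + ("x",)]
        fp = lookup_df[key + ("y",)]
    except KeyError:
        # df1 has no breakpoints for this Name/Segment (e.g. Netflix)
        return np.nan
    return np.interp(row["x"], xp, fp)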


>>> df2["y"] = df2.apply(find_interp, axis=1)
>>> df2
      Name  Segment    x     y
0   Amazon        1  1.0  0.40
1   Amazon        2  2.3  1.15
2  Netflix        1  4.1   NaN
3  Netflix        2  5.5   NaN
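For anyone who wants to run this end to end, below is a rough self-contained sketch. It is a variation on the approach above, not the same code: it builds a dictionary of breakpoints once instead of transposing. The value-column names `p0` to `p4` are made up, since the question does not show the real header, and the `Name`, `Segment`, `Axis` and `x` column names are assumed from the output above.

```python
import numpy as np
import pandas as pd

# Hypothetical value-column names; the question does not show the real header
value_cols = ["p0", "p1", "p2", "p3", "p4"]
df1 = pd.DataFrame(
    [
        ["Amazon", 1, "x", 0.0, 1.0, 2.0, 3.0, 4.0],
        ["Amazon", 1, "y", 0.0, 0.4, 0.8, 1.2, 1.6],
        ["Amazon", 2, "x", 0.0, 2.0, 4.0, 6.0, 8.0],
        ["Amazon", 2, "y", 0.0, 1.0, 2.0, 3.0, 4.0],
    ],
    columns=["Name", "Segment", "Axis"] + value_cols,
)
df2 = pd.DataFrame(
    {
        "Name": ["Amazon", "Amazon", "Netflix", "Netflix"],
        "Segment": [1, 2, 1, 2],
        "x": [1.0, 2.3, 4.1, 5.5],
    }
)

# Build {(Name, Segment): (x breakpoints, y breakpoints)} once, up front
curves = {}
for (name, segment), group in df1.groupby(["Name", "Segment"]):
    by_axis = group.set_index("Axis")[value_cols]
    curves[(name, segment)] = (
        by_axis.loc["x"].to_numpy(dtype=float),
        by_axis.loc["y"].to_numpy(dtype=float),
    )

def interp_y(name, segment, x):
    """Return the interpolated y, or NaN when df1 has no curve for the pair."""
    if (name, segment) not in curves:
        return np.nan
    xs, ys = curves[(name, segment)]
    return float(np.interp(x, xs, ys))

df2["y"] = [
    interp_y(n, s, x) for n, s, x in zip(df2["Name"], df2["Segment"], df2["x"])
]
print(df2)
```

Note that `np.interp` expects the x breakpoints to be increasing, and by default it returns the first or last y value for an x outside the breakpoint range rather than extrapolating.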

Upvotes: 1
