Reputation: 788
My problem is quite a simple one; however, I can't seem to find a quick solution.
I would like to interpolate the y_model array, which is defined at the x_model coordinates
x_model = np.array([400., 425., 450., 475., 500., 525., 550., 575., 600.,
                    625., 650., 675., 700.])
y_model = np.array([3.30395705e-03, 3.07586379e-03, 2.90207925e-03,
                    2.81385536e-03, 2.78152394e-03, 2.14072505e-03,
                    1.64075861e-03, 9.81255709e-04, 3.61950352e-04,
                    2.15834081e-04, 1.35457654e-04, 8.22104550e-05,
                    5.84126935e-05])
to the new x_data coordinates:
x_data = np.array([412., 443., 490., 510., 555., 670.])
So I should get a simple one-dimensional y_data array with 6 elements.
I thought of something like this; what's your opinion?
from scipy import interpolate
f = interpolate.interp1d(x_model, y_model, kind='linear')
y_data = f(x_data)
Edit: A solution was initially sought for the problem as originally posed, which limited the question to linear interpolation. However, in the course of the discussion some answerers rightly pointed out that a linear fit may not suffice for such a problem.
Would a non-linear curve fit be more suitable?
Upvotes: 0
Views: 1540
Reputation: 4236
The simplest method would be to fit a linear function using least squares.
A simple implementation in numpy is numpy.linalg.lstsq.
You can then use the function lambda x: m*x + c with the computed slope m and offset c to get the y value on the line for any x.
The numpy documentation contains an example that looks almost like the example in your question.
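A minimal sketch of that approach with the arrays from the question (the design-matrix construction mirrors the example in the numpy docs):

import numpy as np

x_model = np.array([400., 425., 450., 475., 500., 525., 550., 575., 600.,
                    625., 650., 675., 700.])
y_model = np.array([3.30395705e-03, 3.07586379e-03, 2.90207925e-03,
                    2.81385536e-03, 2.78152394e-03, 2.14072505e-03,
                    1.64075861e-03, 9.81255709e-04, 3.61950352e-04,
                    2.15834081e-04, 1.35457654e-04, 8.22104550e-05,
                    5.84126935e-05])
x_data = np.array([412., 443., 490., 510., 555., 670.])

# Stack x with a column of ones so lstsq solves y = m*x + c.
A = np.vstack([x_model, np.ones_like(x_model)]).T
m, c = np.linalg.lstsq(A, y_model, rcond=None)[0]

# Evaluate the fitted line at the new coordinates.
y_data = m * x_data + c
print(y_data)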
My answer above is for the case of linear interpolation.
Regarding your edit from linear to general interpolation: you need to know what function you want to fit. Do you assume your data to be linear?
You can fit a linear function to every point set, but the result may be bad, either because there is no linear relationship or because the data is noisy. Without an idea of what function may be appropriate, or how large the amount of noise is, you cannot easily fit a function.
The same is true for higher-order polynomials, except that you may trade the error of underfitting (linear regression for a function that is not linear) for that of overfitting (a function complicated enough to match the given points, but unsuitable for interpolation). A small sketch of that trade-off follows.
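Here is a hedged illustration comparing a degree-1 and a degree-4 polynomial fit with np.polyfit; the degrees are arbitrary choices for demonstration, and the arrays are assumed to be those from the question:

import numpy as np

# x_model, y_model, x_data as defined in the question.
# Centre and scale x so the higher-order fit stays well conditioned.
xs = (x_model - 550.0) / 150.0
xd = (x_data - 550.0) / 150.0

# Degree 1 may underfit non-linear data; a higher degree follows the
# points more closely but risks overfitting between them.
lin = np.polyfit(xs, y_model, deg=1)
quart = np.polyfit(xs, y_model, deg=4)

print(np.polyval(lin, xd))
print(np.polyval(quart, xd))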
Upvotes: 2
Reputation: 84
The "scientific" way to do this, is to fit a linear function to the existing datapoints, and calculate the function then on the new x_values. This has multiple advantages, including the possible usage of uncertainties for each datapoint, and also the possibility of error propagation for the interpolated values. (f.e in combination with the very nice package 'uncertainties') You can also easily change your model function, and it does not has to be linear, but can be any function...
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

x_model = np.array([400., 425., 450., 475., 500., 525., 550., 575., 600.,
                    625., 650., 675., 700.])
y_model = np.array([3.30395705e-03, 3.07586379e-03, 2.90207925e-03,
                    2.81385536e-03, 2.78152394e-03, 2.14072505e-03,
                    1.64075861e-03, 9.81255709e-04, 3.61950352e-04,
                    2.15834081e-04, 1.35457654e-04, 8.22104550e-05,
                    5.84126935e-05])
x_data = np.array([412., 443., 490., 510., 555., 670.])

# Linear model: slope k, offset d.
def linFunc(x, k, d):
    return k * x + d

# Fit the model parameters to the data.
cop, cov = curve_fit(linFunc, x_model, y_model)

xplot = np.linspace(min(x_model), max(x_model), 10**4)
plt.figure()
plt.plot(x_model, y_model, 'ro', label='original data')
plt.plot(xplot, linFunc(xplot, *cop), 'r--', label='fitted function')
plt.plot(x_data, linFunc(x_data, *cop), 'bo', label='interpolated values')
plt.legend()

print(linFunc(x_data, *cop))
plt.show()
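And since the model function is not restricted to a line, a non-linear shape can be swapped in without changing anything else. The exponential decay below, continuing the snippet above, is only an illustrative guess with rough hand-picked start values, not a claim about the correct model for this data:

# Illustrative non-linear model: exponential decay, shifted by the
# first x value to avoid huge exponents.
def expFunc(x, a, b, c):
    return a * np.exp(-b * (x - 400.0)) + c

# p0 is a rough initial guess to help the optimiser converge.
cop2, cov2 = curve_fit(expFunc, x_model, y_model, p0=(3e-3, 1e-2, 1e-4))
print(expFunc(x_data, *cop2))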
Upvotes: 1
Reputation: 2136
Numpy has the interp function, which can do that for you.
You can simply call:
y_data = np.interp(x_data,x_model,y_model)
You can find the documentation for numpy.interp() here: https://docs.scipy.org/doc/numpy/reference/generated/numpy.interp.html
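For completeness, a small sketch with the arrays from the question; note that by default np.interp clamps points outside the x_model range to the edge values, and the optional left and right arguments override this:

import numpy as np

# x_model, y_model, x_data as defined in the question.
y_data = np.interp(x_data, x_model, y_model)
print(y_data)

# Out-of-range points return the edge values unless left/right are set.
print(np.interp([350., 750.], x_model, y_model, left=np.nan, right=np.nan))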
Upvotes: 4
Reputation: 2316
If you want to be able to inspect your linear regression and tweak it a little bit, I would recommend using scikit-learn's LinearRegression. The code would be the following:
import numpy as np
from sklearn.linear_model import LinearRegression

x_model = np.array([400., 425., 450., 475., 500., 525., 550., 575., 600.,
                    625., 650., 675., 700.])
y_model = np.array([3.30395705e-03, 3.07586379e-03, 2.90207925e-03,
                    2.81385536e-03, 2.78152394e-03, 2.14072505e-03,
                    1.64075861e-03, 9.81255709e-04, 3.61950352e-04,
                    2.15834081e-04, 1.35457654e-04, 8.22104550e-05,
                    5.84126935e-05])

# Fit the regression; sklearn expects a 2-D feature array.
lr = LinearRegression()
lr.fit(x_model[:, None], y_model)

x_data = np.array([412., 443., 490., 510., 555., 670.])
y_data = lr.predict(x_data[:, None])
print(y_data)
Note that I have to add a dimension to x when fitting or predicting, because the linear regression expects a 2-D array.
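As a small follow-up for the "inspect" part: the fitted estimator exposes the slope and intercept directly, and score gives the R² on the training points.

# Slope and intercept of the fitted line:
print(lr.coef_, lr.intercept_)

# Coefficient of determination (R^2) on the model points:
print(lr.score(x_model[:, None], y_model))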
Upvotes: 1