metersk

Reputation: 12509

Is this the most efficient and accurate way to extrapolate using scipy?

I have a set of data points over time, but there is some missing data and the data is not at regular intervals. In order to get a full data set over time at regular intervals I did the following:

import pandas as pd
import numpy as np
from scipy import interpolate

x = data['time']
y = data['shares']
f = interpolate.interp1d(x, y, fill_value='extrapolate')

time = np.arange(0, 3780060, 600)

new_data = []
for interval in time:
    new_data.append(f(interval))

test = pd.DataFrame({'time': time, 'shares': new_data})
test = test.astype(float)

When both the original and the extrapolated data sets are plotted, they seem to line up almost perfectly, but I still wonder if there is a more efficient and/or accurate way to accomplish the above.

Upvotes: 0

Views: 104

Answers (1)

Sergey

Reputation: 487

You should apply the interpolation function only once, to the whole array, instead of calling it in a loop:

new_data = f(time)
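As a minimal sketch of why the loop is unnecessary, the snippet below compares the two approaches on a few made-up samples (the arrays stand in for data['time'] and data['shares'], which are not shown in the question). It only illustrates that the single call returns the same values.

```python
import numpy as np
from scipy import interpolate

# Made-up, irregularly spaced samples; stand-ins for data['time'] and data['shares'].
x = np.array([0.0, 500.0, 1300.0, 2400.0, 3000.0])
y = np.array([10.0, 12.0, 11.0, 15.0, 14.0])
f = interpolate.interp1d(x, y)  # linear interpolation by default

# Regular 600-second grid inside the sampled range.
time = np.arange(0, 3001, 600)

# One Python-level call per point, as in the question.
looped = np.array([float(f(t)) for t in time])

# One call on the whole array, as suggested in this answer.
vectorized = f(time)

print(np.allclose(looped, vectorized))
```

The vectorized call also returns a plain float array, so it can be placed in a DataFrame column directly.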

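If your regular grid stays within the range of the original time values, fill_value='extrapolate' is redundant, because every new point is obtained by interpolation. You only need 'extrapolate' when the new grid is wider than the original range, but extrapolating is generally bad practice: the function simply continues the trend of the nearest data, and the result is not backed by any measurement.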
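The difference can be sketched with the same kind of made-up samples (again hypothetical stand-ins for the question's data). Without fill_value='extrapolate', interp1d refuses to evaluate outside the sampled range, which makes accidental extrapolation visible; with it, a linear interpolant just extends the last segment.

```python
import numpy as np
from scipy import interpolate

# Made-up samples; stand-ins for data['time'] and data['shares'].
x = np.array([0.0, 500.0, 1300.0, 2400.0, 3000.0])
y = np.array([10.0, 12.0, 11.0, 15.0, 14.0])

# Build the regular grid from the observed range so no point falls outside it.
inside = np.arange(x.min(), x.max() + 1, 600)

strict = interpolate.interp1d(x, y)  # no extrapolation allowed
values = strict(inside)

# Asking for a point past the last sample raises ValueError by default.
try:
    strict(3600.0)
    raised = False
except ValueError:
    raised = True

# With 'extrapolate', the slope of the last segment is continued past x.max().
loose = interpolate.interp1d(x, y, fill_value='extrapolate')
beyond = float(loose(3600.0))
```

For plain linear interpolation inside the range, np.interp(inside, x, y) gives the same values as strict(inside).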

Upvotes: 1
