Reputation: 12338
I want to do interpolation for a Pandas series of the following structure
X
22.88 3.047
45.75 3.215
68.63 3.328
91.50 3.423
114.38 3.516
137.25 3.578
163.40 3.676
196.08 3.756
228.76 3.861
261.44 3.942
294.12 4.012
326.80 4.084
359.48 4.147
392.16 4.197
Name: Y, dtype: float64
I want to interpolate the data so that I have a new series to cover X=[23:392:1]
. I looked up the document but didn't find where I could input the new x-axis. Did I miss something? How can I do interpolation with the new x-axis?
Upvotes: 3
Views: 2053
Reputation: 28936
This can be done with pandas
's reindex
and interpolate
:
In [27]: s
Out[27]:
1
0
22.88 3.047
45.75 3.215
68.63 3.328
91.50 3.423
114.38 3.516
137.25 3.578
163.40 3.676
196.08 3.756
228.76 3.861
261.44 3.942
294.12 4.012
326.80 4.084
359.48 4.147
392.16 4.197
[14 rows x 1 columns]
In [28]: idx = pd.Index(np.arange(23, 392))
In [29]: s.reindex(s.index + idx).interpolate(method='values')
Out[29]:
1
22.88 3.047000
23.00 3.047882
24.00 3.055227
25.00 3.062573
26.00 3.069919
27.00 3.077265
28.00 3.084611
29.00 3.091957
30.00 3.099303
31.00 3.106648
32.00 3.113994
33.00 3.121340
34.00 3.128686
35.00 3.136032
36.00 3.143378
37.00 3.150724
38.00 3.158070
39.00 3.165415
40.00 3.172761
41.00 3.180107
42.00 3.187453
43.00 3.194799
44.00 3.202145
45.00 3.209491
45.75 3.215000
46.00 3.216235
47.00 3.221174
48.00 3.226112
The idea is the create the index you want (s.index + idx
), which is sorted automatically, reindex an that (which makes a bunch of NaN
s at the new points, and the interpolate to fill the NaN
s, using the values
method, which interpolates at the index points.
Upvotes: 4
Reputation: 97261
You can call numpy.interp()
directly:
import numpy as np
import pandas as pd
import io
data = """x y
22.88 3.047
45.75 3.215
68.63 3.328
91.50 3.423
114.38 3.516
137.25 3.578
163.40 3.676
196.08 3.756
228.76 3.861
261.44 3.942
294.12 4.012
326.80 4.084
359.48 4.147
392.16 4.197"""
s = pd.read_csv(io.BytesIO(data), delim_whitespace=True, index_col=0, squeeze=True)
new_idx = np.arange(23,393)
new_val = np.interp(new_idx, s.index.values.astype(float), s.values)
s2 = pd.Series(new_val, new_idx)
Upvotes: 2