Pythonleaner

Reputation: 1

Pandas DataFrame upsampling with interpolation on non-time-series data

I have a data set with inconsistent step sizes between the x values. I want to upsample the DataFrame so that x increases in steps of 0.5, and interpolate the y values.

I have the following df:

     x        y
    10183.2 -40.1
    10187.1 -41.0
    10191.0 -41.2
    10195.0 -41.5
    10198.9 -42.0
    10202.8 -42.4
    10206.8 -42.9
    10210.7 -43.4
    10214.6 -43.8
    10218.6 -44.2
    10222.5 -44.4
    10226.4 -44.6
    10230.4 -44.8
    10234.3 -44.9
    10238.2 -45.0
    10242.2 -45.1
    10246.1 -45.2
    10250.0 -45.2
    10253.9 -45.3
    10257.9 -45.4
    10261.8 -45.5
    10265.7 -45.5
    10269.7 -45.6

What I want to achieve is:

     x        y
    10185   NaN
    10186.5 NaN
    10187   -40.00
    10187.5 NaN
    10188   NaN
    10188.5 NaN
    10189   NaN
    10189.5 NaN
    10190   NaN
    10190.5 NaN
    10191   -41.2
    10191.5 NaN
    10192   NaN
    10192.5 NaN
    10193   NaN
    10193.5 NaN
    10194   NaN
    10194.5 NaN
    10195   NaN
    10195.5 NaN
    10196   NaN
    10196.5 NaN
    10197   NaN

where the NaN values will be interpolated from the existing points in the original df.

Is there a way to create a new df where the x points are spaced by 0.5 based on the original x data?

I have been looking into resample, but that only seems to work for time series.

Could anyone point me in the right direction?

Upvotes: 0

Views: 172

Answers (1)

mozway

Reputation: 260360

You can create the new index, reindex on the combination of the old and new x values, interpolate, and keep only the new rows:

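import numpy as np

# regular grid: 10185 up to (but not including) 10270, in steps of 0.5
new_index = np.arange(10185, 10270, 0.5)

(df.set_index('x')
   # combine the original x values with the new grid
   .reindex(sorted(list(df['x'])+list(new_index)))
   # note: the default method='linear' treats rows as equally spaced
   .interpolate()
   # keep only the rows on the new grid
   .loc[new_index]
   .reset_index()
)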

output:

         x      y
0  10185.0 -40.25
1  10185.5 -40.40
2  10186.0 -40.55
3  10186.5 -40.70
4  10187.0 -40.85
...
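
One caveat: with no arguments, `interpolate()` uses `method='linear'`, which treats consecutive rows as equally spaced and ignores the actual x values. Since the x spacing here is irregular, `method='index'` may be closer to what is wanted. Below is a minimal sketch under that assumption. The helper name `upsample` and the grid bounds are only illustrative, and the data is just the first five rows from the question:

```python
import numpy as np
import pandas as pd

# First rows of the question's data, used here only for illustration.
df = pd.DataFrame({
    "x": [10183.2, 10187.1, 10191.0, 10195.0, 10198.9],
    "y": [-40.1, -41.0, -41.2, -41.5, -42.0],
})

def upsample(df, start, stop, step=0.5):
    """Interpolate y onto a regular x grid (hypothetical helper)."""
    new_index = pd.Index(np.arange(start, stop, step), name="x")
    s = df.set_index("x")["y"]
    return (
        s.reindex(s.index.union(new_index))  # sorted union, no duplicate labels
         .interpolate(method="index")        # weight by distance along x
         .reindex(new_index)                 # keep only the regular grid
         .reset_index()
    )

out = upsample(df, 10185, 10198.5)
print(out.head())
```

Using `Index.union` instead of concatenating two lists avoids duplicate labels when a grid point (10191.0, for example) already exists in the original x values. If you do not need the intermediate frame, `np.interp(new_x, df['x'], df['y'])` should give the same values directly, as long as x is sorted in increasing order.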

Upvotes: 1
