Maxime

Reputation: 634

How to interpolate values in between points based on a step value

I have this vector:

pd.Series([19.280, 48.380, 51.240, 58.603, 60.380, 203.300, ...])

I want to insert equally spaced intermediate values between each pair of consecutive values, so that the spacing within each gap is as close as possible to an increment step of 4.

This gives, for the beginning of the vector:

pd.Series([19.280, 23.437, 27.594, 31.751, 35.909, 40.066, 44.223, 48.380, 51.240, 54.921, 58.603, 60.380, ...])

Upvotes: 2

Views: 887

Answers (2)

MagnusO_O

Reputation: 1283

Using Series.interpolate, since your data is a pandas Series:

  • Series.interpolate fills NaN values by interpolating between the adjacent numeric values
    • the distance between the adjacent values and the number of NaNs in between set the interpolation step
  • To get the requested interpolation step individually between each pair of given values, a new series is populated with the required number of NaNs (no_inserts in the code)
    • the number of NaNs differs for each pair of given values, according to their distance
  • round is used to get 'close to the increment step' when computing the required number of NaNs
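
As a small illustration of the first bullet (a sketch only, separate from the solution code): two known values with six NaN placeholders between them are split into seven equal intervals. The names demo, filled and actual_step exist only in this example.

```python
import numpy as np
import pandas as pd

# 19.280 and 48.380 with six NaN placeholders in between -> seven intervals
demo = pd.Series([19.280] + [np.nan] * 6 + [48.380])

# linear interpolation (the default) spreads the NaNs evenly between the ends
filled = demo.interpolate()

# resulting step: (48.380 - 19.280) / 7, roughly 4.157
actual_step = filled.diff().dropna().mean()
print(filled.round(3).tolist())
print(round(actual_step, 3))
```

Fewer or more NaNs in the same gap would give a larger or smaller step, which is why the number of placeholders has to be chosen per gap.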

Code:

import pandas as pd
import numpy as np

pds = pd.Series([19.280, 48.380, 51.240, 58.603, 60.380, 203.300], dtype='float64')
pds_filled = pd.Series(dtype='float64')

step_value = 4


for i in range(pds.size):
    pds_filled = pd.concat([pds_filled, pd.Series(pds[i], dtype='float64')],
                           ignore_index=True)
        # Note: Series.append was deprecated in pandas 1.4 and removed in 2.0

    if i == len(pds) - 1:
        break  # break after concat of the last element
    
    # number of intervals closest to step_value, minus one, gives the NaNs to insert
    no_inserts = int(round((pds[i + 1] - pds[i]) / step_value)) - 1
    # print(f"i= {i}, no_inserts= {no_inserts}")

    for j in range(no_inserts):  # not executed when no_inserts <= 0
        pds_filled = pd.concat([pds_filled, pd.Series(np.nan, dtype='float64')],
                               ignore_index=True)
    # print(pds_filled)


# print(pds_filled)  # check the filled NaNs
pds_filled.interpolate(inplace=True)  
    # Series.interpolate() replaces NaNs with interpolated values
print(pds_filled)  # final pd.series!


## print options 
# print(pds_filled.tolist())
# print([f'{item:.3f}' for item in pds_filled.tolist()])

Result list:

['19.280', '23.437', '27.594', '31.751', '35.909', '40.066', '44.223', '48.380', '51.240', '54.922', '58.603', '60.380', '64.350', '68.320', '72.290', '76.260', '80.230', '84.200', '88.170', '92.140', '96.110', '100.080', '104.050', '108.020', '111.990', '115.960', '119.930', '123.900', '127.870', '131.840', '135.810', '139.780', '143.750', '147.720', '151.690', '155.660', '159.630', '163.600', '167.570', '171.540', '175.510', '179.480', '183.450', '187.420', '191.390', '195.360', '199.330', '203.300']

Notes:

  • The no_inserts calculation sets the step; you can fine-tune it to adapt the step.
  • Uncomment the print statements in the code to cross-check the interim steps and to get list output instead of a pandas Series
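
One way such fine-tuning could look, as a sketch: swap round for math.floor or math.ceil when counting the intervals. The helper count_inserts is hypothetical and only illustrates the idea; it also clamps to at least one interval, so the count never goes negative.

```python
import math

def count_inserts(gap, step, rounding=round):
    """Hypothetical helper: how many NaN placeholders to put into one gap."""
    # at least one interval, so the insert count never goes negative
    intervals = max(int(rounding(gap / step)), 1)
    return intervals - 1

gap = 58.603 - 51.240  # 7.363, i.e. about 1.84 steps of 4
for rounding in (round, math.floor, math.ceil):
    n = count_inserts(gap, 4, rounding)
    # print the rounding rule, the number of inserts and the resulting step
    print(rounding.__name__, n, gap / (n + 1))
```

With math.floor the resulting step is never smaller than the requested step (as long as the gap is at least one step wide), with math.ceil it is never larger, and round picks whichever interval count is nearest.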

Upvotes: 1

BeRT2me

Reputation: 13242

Given:

import numpy as np
import pandas as pd

s = pd.Series([19.280, 48.380, 51.240, 58.603, 60.380, 203.300])

0     19.280
1     48.380
2     51.240
3     58.603
4     60.380
5    203.300
dtype: float64

Doing:

s.name = 'Value'
df = s.to_frame()

# Mark how many 4-length spaces could fit between the values.
# We'll round here, other methods are possible as well.
df['space'] = df.Value.diff().fillna(0).div(4).round().astype(int)

# Make these into lists of NaN of each length.
df['space'] = df['space'].apply(lambda x: [np.nan]*x)

# Explode these lists.
df = df.explode('space')

# Drop the helper column.
df = df.drop('space', axis=1)

# Make the duplicate values NaN, keeping only the last copy of each.
# Note: this relies on the original values being unique.
df.loc[df.duplicated(keep='last'), 'Value'] = np.nan

# Reset the index and interpolate the values (linear is default)
df = df.reset_index(drop=True).interpolate('linear')

# Squeeze it back to a Series.
s = df.squeeze()
print(s)

Output:

0      19.280000
1      23.437143
2      27.594286
3      31.751429
4      35.908571
5      40.065714
6      44.222857
7      48.380000
8      51.240000
9      54.921500
10     58.603000
11     60.380000
12     64.350000
13     68.320000
14     72.290000
15     76.260000
16     80.230000
17     84.200000
18     88.170000
19     92.140000
20     96.110000
21    100.080000
22    104.050000
23    108.020000
24    111.990000
25    115.960000
26    119.930000
27    123.900000
28    127.870000
29    131.840000
30    135.810000
31    139.780000
32    143.750000
33    147.720000
34    151.690000
35    155.660000
36    159.630000
37    163.600000
38    167.570000
39    171.540000
40    175.510000
41    179.480000
42    183.450000
43    187.420000
44    191.390000
45    195.360000
46    199.330000
47    203.300000
Name: Value, dtype: float64
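
For comparison, the same result can probably be reached without NaN placeholders by building each gap with np.linspace. This is an alternative sketch, not the approach shown above; the function name densify is made up for this example, and using abs() so that decreasing pairs are also handled is an assumption about the desired behaviour.

```python
import numpy as np
import pandas as pd

def densify(values, step):
    """Hypothetical helper: insert evenly spaced points, spacing close to `step`."""
    values = np.asarray(values, dtype=float)
    pieces = []
    for a, b in zip(values[:-1], values[1:]):
        # number of intervals closest to the requested step, at least one
        n = max(int(round(abs(b - a) / step)), 1)
        # n points starting at a, spaced (b - a) / n apart, excluding b itself
        pieces.append(np.linspace(a, b, n, endpoint=False))
    pieces.append(values[-1:])  # the final value closes the last gap
    return pd.Series(np.concatenate(pieces))

result = densify([19.280, 48.380, 51.240, 58.603, 60.380, 203.300], 4)
print(result)
```

Since every gap is generated directly from its two end points, repeated values in the input do not need special handling here.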

Upvotes: 2
