Prevent negative values in df.interpolate()

Question

I'm having trouble avoiding negative values when interpolating. I have the following data in a DataFrame:

current_country = 

idx  Country      Region              Rank  Score    GDP capita  Family    Life Expect.  Freedom    Trust Gov.  Generosity  Residual  Year
289  South Sudan  Sub-Saharan Africa  143   3.83200  0.393940    0.185190  0.157810      0.196620   0.130150    0.258990    2.509300  2016
449  South Sudan  Sub-Saharan Africa  147   3.59100  0.397249    0.601323  0.163486      0.147062   0.116794    0.285671    1.879416  2017
610  South Sudan  Sub-Saharan Africa  154   3.25400  0.337000    0.608000  0.177000      0.112000   0.106000    0.224000    1.690000  2018
765  South Sudan  Sub-Saharan Africa  156   2.85300  0.306000    0.575000  0.295000      0.010000   0.091000    0.202000    1.374000  2019

I want to fill in the row shown below (Year 2015) using pandas' df.interpolate():

new_row =

idx  Country      Region              Rank  Score    GDP capita  Family    Life Expect.  Freedom    Trust Gov.  Generosity  Residual  Year
593  South Sudan  Sub-Saharan Africa  0     np.nan   np.nan      np.nan    np.nan        np.nan     np.nan      np.nan      np.nan    2015

I create a DataFrame with NaN in every column to be interpolated (as above), append it to the original DataFrame, and then interpolate to fill in the NaN cells.

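import pandas as pd
interpol_subset = pd.concat([current_country, new_row])  # DataFrame.append was removed in pandas 2.0
interpol_subset = interpol_subset.interpolate(method="pchip", order=2)  # order only matters for "polynomial"/"spline"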

This produces the following df

idx  Country      Region              Rank  Score    GDP capita  Family    Life Expect.  Freedom    Trust Gov.  Generosity  Residual  Year
289  South Sudan  Sub-Saharan Africa  143   3.83200  0.393940    0.185190  0.157810      0.196620   0.130150    0.258990    2.509300  2016
449  South Sudan  Sub-Saharan Africa  147   3.59100  0.397249    0.601323  0.163486      0.147062   0.116794    0.285671    1.879416  2017
610  South Sudan  Sub-Saharan Africa  154   3.25400  0.337000    0.608000  0.177000      0.112000   0.106000    0.224000    1.690000  2018
765  South Sudan  Sub-Saharan Africa  156   2.85300  0.306000    0.575000  0.295000      0.010000   0.091000    0.202000    1.374000  2019
4    South Sudan  Sub-Saharan Africa  0     2.39355  0.313624    0.528646  0.434473      -0.126247  0.072480    0.238480    0.963119  2015

The issue: in the last row, the value in "Freedom" is negative. Is there a way to parameterize df.interpolate() so that it doesn't produce negative values? I can't find anything in the documentation. Apart from that negative value, I'm fine with the estimates (although they're a bit skewed).
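For what it's worth, the negative value can apparently be reproduced outside pandas. The sketch below assumes that the rows are treated as equally spaced positions and that the appended NaN row sits one step after the 2019 row, since it is the last row of the frame. Under that assumption the estimate is an extrapolation past the known data, not an interpolation between known points, so nothing keeps it at or above zero. It calls SciPy's PchipInterpolator directly; the variable names are only illustrative.

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

# "Freedom" values for the four known rows, copied from the data above.
freedom = np.array([0.196620, 0.147062, 0.112000, 0.010000])

# Assumption: rows are equally spaced positions 0..3 and the appended
# NaN row is position 4, i.e. one step beyond the last known row.
positions = np.arange(len(freedom))

# extrapolate=True extends the last cubic piece beyond the data range.
pchip = PchipInterpolator(positions, freedom, extrapolate=True)
estimate = float(pchip(4))

print(round(estimate, 6))  # close to the -0.126247 shown in the output above
```

If that assumption holds, the steep drop between the last two known values (0.112 to 0.010) is simply being continued for one more step.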

I considered simply flipping the negative to a positive, but the "Score" value is a sum of all the other continuous features and I would like to keep it that way. What can I do here?
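One possible workaround, as a sketch only: clip the negative estimates to zero and then recompute "Score" from the components, so that it stays equal to their sum. The helper name clip_and_rescore and the COMPONENTS list are hypothetical, and the sketch assumes that "Score" is exactly the sum of the seven columns listed. Recomputing changes the interpolated "Score", so it is probably best applied only to the estimated row.

```python
import pandas as pd

# Columns assumed to add up to "Score" (as described in the question).
COMPONENTS = ["GDP capita", "Family", "Life Expect.", "Freedom",
              "Trust Gov.", "Generosity", "Residual"]

def clip_and_rescore(df, components=COMPONENTS, score_col="Score"):
    """Clip negative component estimates to 0 and recompute the score as their sum."""
    out = df.copy()
    out[components] = out[components].clip(lower=0)
    out[score_col] = out[components].sum(axis=1)
    return out

# The estimated row from the output above (numeric columns only).
estimated = pd.DataFrame([{
    "Score": 2.39355, "GDP capita": 0.313624, "Family": 0.528646,
    "Life Expect.": 0.434473, "Freedom": -0.126247, "Trust Gov.": 0.072480,
    "Generosity": 0.238480, "Residual": 0.963119,
}])

fixed = clip_and_rescore(estimated)
print(fixed[["Score", "Freedom"]])
```

Clipping is a blunt fix: it only removes the symptom, and the recomputed "Score" will differ from the value that was interpolated independently for that column.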

Here's a link to the actual code snippet. Thanks for reading.
