Nocas

Reputation: 397

Pandas Interpolate returning ValueErrors for some methods and some sizes of dataframes

I am having some issues with interpolation of a Pandas dataframe.

Basically, I have a dataframe of 295339 rows in which I have artificially generated NaNs to study different sampling rates and completion methods.

The issue is that some combinations of sampling rate and completion method work, while others raise the following error:

ValueError: The number of derivatives at boundaries does not match: expected 1, got 0+0

The exact ValueError message depends on the combination of sampling rate and completion method I'm using.

For example, if I create one NaN per hour per customer and then interpolate using either the linear or the cubic method, it works. But if I sample once every four hours per customer, the linear method works and the cubic method does not (code for the interpolation below):

latitude = my_frame.filter(['Customer_id', 'Lat'], axis=1)
latitude = latitude.groupby('Customer_id').apply(lambda group: group.interpolate(method='cubic'))

The weird thing is that during my tests I limited the data to 3 customers (8500 rows) for speed, and no errors were raised.

So, my question is: why does this happen, and is there a workaround?

Upvotes: 0

Views: 1777

Answers (1)

Nocas

Reputation: 397

I found the cause: the cubic method needs at least 4 known points, and customers with fewer records did not have that many, so interpolation failed for those groups.
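One possible workaround, sketched here under assumptions rather than taken from my original code, is to count the known values per customer and fall back to linear interpolation when a group has fewer than 4. The helper names (pick_method, sparse_customers, interpolate_lat) and the MIN_CUBIC_POINTS constant are hypothetical; the column names come from the question, and 'Lat' is assumed to be a float column. The cubic method also requires SciPy to be installed.

```python
import numpy as np
import pandas as pd

# Assumption: 4 is the minimum number of known values for method='cubic',
# as found in the answer above.
MIN_CUBIC_POINTS = 4


def pick_method(series, preferred="cubic"):
    """Return `preferred` if the series has enough known values, else 'linear'."""
    known = series.notna().sum()
    return preferred if known >= MIN_CUBIC_POINTS else "linear"


def sparse_customers(my_frame):
    """List the customers whose 'Lat' column has too few known values for cubic."""
    counts = my_frame.groupby("Customer_id")["Lat"].count()  # count() skips NaN
    return counts[counts < MIN_CUBIC_POINTS].index.tolist()


def interpolate_lat(my_frame):
    """Interpolate 'Lat' per customer, using linear for sparse customers."""
    return my_frame.groupby("Customer_id")["Lat"].transform(
        lambda s: s.interpolate(method=pick_method(s))
    )
```

Calling sparse_customers(my_frame) first shows which groups would have raised the ValueError; interpolate_lat(my_frame) then returns a Series aligned with the original index. Whether a linear fallback is acceptable depends on the study, since it mixes two completion methods in one result; dropping the sparse customers instead is the other obvious option.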

Upvotes: 2
