CentauriAurelius

Reputation: 504

Resample or normalize trajectory data so points are evenly spaced

I have a DataFrame which contains X & Y data for many trajectories (not GPS data).

I am trying to figure out how to resample/time-normalize them so that consecutive points are evenly spaced along each trajectory.

As they are right now, there are regions of the trajectories with higher density of points.

In the scatterplots below, I show one of the overall trajectories, and then a zoomed-in portion of that trajectory to show how the density of points changes (i.e., the spacing between points is irregular).

[Image: overall trajectory]

[Image: zoomed-in portion with irregular spacing between points]

My dataframes look like this:

     (0, 1, 1)_mean_X  (0, 1, 1)_mean_Z  ...  (2, 2, 3)_mean_X  (2, 2, 3)_mean_Z
0          -15.856713          5.002617  ...        -15.874083         -5.000582
1          -15.831320          5.003529  ...        -15.848551         -5.000925
2          -15.805927          5.004441  ...        -15.823020         -5.001268
3          -15.780534          5.005353  ...        -15.797489         -5.001611
4          -15.755141          5.006265  ...        -15.771958         -5.001955
..                ...               ...  ...               ...               ...
995         15.547392         11.280298  ...         15.257689        -12.455845
996         15.548967         11.278968  ...         15.258225        -12.457202
997         15.550542         11.277638  ...         15.258761        -12.458560
998         15.552116         11.276309  ...         15.259296        -12.459917
999         15.553691         11.274979  ...         15.259832        -12.461275

Upvotes: 0

Views: 928

Answers (1)

anon01

Reputation: 11171

Pandas has an interpolate method (DataFrame.interpolate), but for processing like this I would prefer numpy/scipy, whose vectorized functions are often faster than going through pandas. Example:

import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

# sample data: x is log-spaced, so the points are unevenly distributed
x = np.logspace(0,2,300)
y = x**2
df = pd.DataFrame(np.array([x, y]).T, columns=list("xy"))

# define interpolation function (linear by default):
f = interp1d(x, y)

# create new df with desired x vals, generate y with interp function:
x_new = np.linspace(x.min(),x.max(),1000)
y_new = f(x_new)
df_new = pd.DataFrame(np.array([x_new, y_new]).T, columns=["x_new", "y_new"])

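Note that by default this raises a ValueError if x_new extends outside the original domain. That makes sense, as it's just linear interpolation between known points. If you do need values outside the range, interp1d accepts fill_value="extrapolate".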
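
The example above resamples y on an evenly spaced x grid, which is not quite the same as even spacing along the path itself. If the goal is equal distance between consecutive points of a 2D trajectory, one possible approach (a minimal sketch, not something taken from the question's data) is to interpolate both coordinates against cumulative arc length. The function name resample_by_arc_length and the quarter-circle test data are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd


def resample_by_arc_length(x, y, n_points):
    """Return n_points (x, y) samples evenly spaced along the polyline.

    Hypothetical helper for illustration; assumes x and y are 1D
    sequences of equal length describing one trajectory in order.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Length of each segment, then cumulative distance starting at 0.
    seg = np.hypot(np.diff(x), np.diff(y))
    s = np.concatenate(([0.0], np.cumsum(seg)))
    # Evenly spaced target distances; interpolate x and y against s.
    s_new = np.linspace(0.0, s[-1], n_points)
    return np.interp(s_new, s, x), np.interp(s_new, s, y)


# Made-up trajectory: a quarter circle sampled with uneven point density.
t = np.linspace(0.0, 1.0, 200) ** 3 * (np.pi / 2)
df = pd.DataFrame({"x": np.cos(t), "y": np.sin(t)})

x_new, y_new = resample_by_arc_length(df["x"], df["y"], 50)
df_new = pd.DataFrame({"x": x_new, "y": y_new})
```

For a DataFrame like the one in the question, this would presumably be applied once per trajectory, passing a pair of columns such as "(0, 1, 1)_mean_X" and "(0, 1, 1)_mean_Z". The spacing is even along the original polyline, so straight-line distances between the new points can still differ slightly where the path curves sharply.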

Upvotes: 3
