Reputation: 157
I have a set of points representing the movement of a particle in 3D. I am trying to fit a polynomial to these points so that I have a single curve representing the track the particle has taken. Moreover, it is clear that the track is described by a polynomial of degree 2 or 3.
Following a suggestion here, I have written the following code to fit a polynomial to my data:
%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
# Point cloud
data = np.array([[ 41. , 57. , 92. ],[ 39. , 57. , 92.4],[ 43. , 57. , 91.2], [ 23., 47. , 119.6],
[ 27. , 47. , 115.2], [ 25. , 45. , 122. ], [ 25. , 49. , 114. ],[ 29., 49. , 109.6],
[ 29. , 47. , 114.4], [ 27. , 49. , 111.2], [ 23. , 45. , 125.6], [ 31., 49., 106.8],
[ 25. , 47. , 117.6], [ 39. , 55. , 95.6],[ 37. , 53. , 98.4], [ 35. , 55. , 96.8],
[ 33. , 53. , 116.8], [ 23. , 43. , 132.8], [ 25. , 41. , 145.2],[ 25. , 43., 133.6],
[ 29. , 51. , 106.4],[ 31. , 53. , 121.2],[ 31., 51. , 104.8],[ 41., 55., 93.6],
[ 33. , 51. , 103.6],[ 35. , 53. , 99.6],[ 37. , 55. , 96.4]])
x = data[:,0]
y = data[:,1]
z = data[:,2]
# sort data to avoid plotting problems
x, y, z = zip(*sorted(zip(x, y, z)))
x = np.array(x)
y = np.array(y)
z = np.array(z)
# stack x and y as the two regression features
data_xy = np.array([x, y])
# quadratic polynomial features in x and y
poly = PolynomialFeatures(degree=2)
X_t = poly.fit_transform(data_xy.transpose())
clf = LinearRegression()
clf.fit(X_t, z)  # fit z as a function of (x, y)
z_pred = clf.predict(X_t)
print(clf.coef_)
print(clf.intercept_)
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.plot(x, y, z_pred, 'r')  # fitted curve
ax.scatter(x, y, z)         # original points
fig.set_dpi(150)
The problem is that I am getting a very strange result:
Any idea what's going on?
EDIT: I would expect a curve that fits the data. For example, I applied the exact same method to a different data set, and this is what I got:
Suggestions for other methods are also welcome. Thank you!
Upvotes: 2
Views: 2998
Reputation: 157
I was able to solve the problem. The issue was my choice of dependent and independent variables. If I treat X as the independent variable and Y and Z as the dependent variables, I get better results:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
def polynomial_regression3d(x, y, z, degree):
    # sort data by x to avoid plotting problems
    x, y, z = zip(*sorted(zip(x, y, z)))
    x = np.array(x)
    y = np.array(y)
    z = np.array(z)
    # stack y and z as the two dependent variables
    data_yz = np.array([y, z]).transpose()
    # polynomial features in the independent variable x
    polynomial_features = PolynomialFeatures(degree=degree)
    x_poly = polynomial_features.fit_transform(x[:, np.newaxis])
    model = LinearRegression()
    model.fit(x_poly, data_yz)
    y_poly_pred = model.predict(x_poly)
    rmse = np.sqrt(mean_squared_error(data_yz, y_poly_pred))
    r2 = r2_score(data_yz, y_poly_pred)
    print("RMSE:", rmse)
    print("R-squared:", r2)
    # plot the points and the fitted curve
    fig = plt.figure()
    ax = plt.axes(projection='3d')
    ax.scatter(x, data_yz[:, 0], data_yz[:, 1])
    ax.plot(x, y_poly_pred[:, 0], y_poly_pred[:, 1], color='r')
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_zlabel('Z')
    fig.set_dpi(150)  # set dpi before showing the figure
    plt.show()
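For instance, the function can be called on the data array from the question like this (pass degree=3 for a cubic fit):

x, y, z = data[:, 0], data[:, 1], data[:, 2]
polynomial_regression3d(x, y, z, degree=2)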
However, the R-squared value is about 0.8, which is not bad but could perhaps be improved. Maybe weighted least squares would help.
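For reference, here is a minimal sketch of that idea, assuming the weights are derived from the residuals of an initial unweighted fit; the weighting scheme and the helper name weighted_polynomial_regression3d are purely illustrative:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

def weighted_polynomial_regression3d(x, y, z, degree):
    # sort by the independent variable x
    x, y, z = zip(*sorted(zip(x, y, z)))
    x = np.array(x)
    data_yz = np.array([y, z]).transpose()
    x_poly = PolynomialFeatures(degree=degree).fit_transform(x[:, np.newaxis])
    model = LinearRegression()
    model.fit(x_poly, data_yz)  # initial unweighted fit
    # down-weight points with large residuals (illustrative scheme)
    residuals = np.linalg.norm(data_yz - model.predict(x_poly), axis=1)
    weights = 1.0 / (1.0 + residuals**2)
    model.fit(x_poly, data_yz, sample_weight=weights)  # weighted refit
    return model.predict(x_poly)

LinearRegression.fit accepts a sample_weight argument, so no extra library is needed; whether this actually raises the R-squared depends on how the noise in the data is distributed.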
Upvotes: 2