Reputation: 1063
I am attempting a simple linear regression in TensorFlow with only one independent variable. A plot of my data shows the coefficient should be near 1, and indeed if I run it through sklearn.linear_model.LinearRegression I get a sensible result of about 0.90.
However, running it in TensorFlow using this tutorial produces a coefficient of very nearly zero. I was able to get a reasonable result from the TensorFlow code using randomized numbers, and I have tried adjusting the learning rate and the number of epochs without any meaningful effect.
The MRE below includes my actual data and should produce a coefficient of 0.8975 from sklearn but 0.00045 from TensorFlow. I have considered that it is getting caught in a local minimum, but none of the examples of that problem I can find match my issue.
import numpy as np
import tensorflow as tf
from sklearn import linear_model
learning_rate = 0.1
epochs = 100
x_train = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348,
0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182,
-0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268,
-0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212,
-0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534,
0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158,
0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231,
0.00159, -0.00463, 0.00174, 0, -0.0029,
-0.00349, 0.01372, -0.00302])
y_train = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441,
0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416,
-0.00191, -0.00607, 0.00161, 0.00289, -0.00416,
0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032,
-0.00387, -0.00162, -0.00292, -0.01367, 0.00198,
0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164,
0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
-0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196,
-0.00065, -0.00391, -0.0108, 0.01291, -0.00098])
regr = linear_model.LinearRegression()
regr.fit(x_train.reshape(-1, 1), y_train.reshape(-1, 1))
print('Coefficients: ', regr.coef_)
weight = tf.Variable(0.)
bias = tf.Variable(0.)

# plain gradient descent on the mean squared error
for e in range(epochs):
    with tf.GradientTape() as tape:
        y_pred = weight*x_train + bias
        loss = tf.reduce_mean(tf.square(y_pred - y_train))
    gradients = tape.gradient(loss, [weight, bias])
    weight.assign_sub(gradients[0]*learning_rate)
    bias.assign_sub(gradients[1]*learning_rate)
print(weight.numpy(), 'weight', bias.numpy(), 'bias')
Upvotes: 1
Views: 164
Reputation: 18339
In the posted example the training-dataset x and y values are very small, which makes the gradients very small as well. The model is training correctly on the data, but it would take a few million iterations to converge.
The scikit-learn LinearRegression model, by contrast, solves the least-squares problem in closed form, so it fits the dataset in a single step regardless of the scale of the data.
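To put a rough number on the gradient problem (a back-of-envelope sketch, assuming x_train holds the unscaled values from the question):
import numpy as np

# For 1-D least squares, each gradient-descent step multiplies the remaining
# error in the weight by roughly (1 - learning_rate * 2 * mean(x**2)).
curvature = 2 * np.mean(x_train ** 2)   # on the order of 5e-5 for this data
print(1 - 0.1 * curvature)              # about 0.999995 per step
# so convergence takes on the order of 1 / (0.1 * curvature) steps, i.e. 10^5-10^6
With a per-step contraction factor that close to 1, 100 epochs barely moves the weight away from its initial value of zero, which is consistent with the ~0.00045 coefficient the question reports.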
One suggestion for bringing the training down to a manageable 1000 iterations is to apply MinMaxScaler so that the x and y datasets both lie between 0 and 1, which improves the gradients and lets the model actually reach a trained state. You should then inverse-transform the results back after training, as shown in the modified code below.
import numpy as np
import tensorflow as tf
from sklearn import linear_model
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt
learning_rate = 0.1
epochs = 1000
x_train0 = np.array([-0.00055, 0.00509, -0.0046, -0.01687, -0.0047, 0.00348,
0.00042, -0.00208, -0.01207, -0.0007, 0.00408, -0.00182,
-0.00294, -0.00113, 0.0038, -0.00645, 0.00113, 0.00268,
-0.0045, -0.00381, 0.00298, 0, -0.00184, -0.00212,
-0.00213, -0.01224, 0.00072, 0, -0.00331, 0.00534,
0.00675, -0.00285, -0.00429, 0.00489, -0.00286, 0.00158,
0.00129, 0.00472, 0.00555, -0.00467, -0.00231, -0.00231,
0.00159, -0.00463, 0.00174, 0, -0.0029,
-0.00349, 0.01372, -0.00302])
scaler1 = MinMaxScaler()
x_train = scaler1.fit_transform(x_train0.reshape(-1,1))
y_train0 = np.array([0.00125, 0.00218, -0.00373, -0.00999, -0.00441,
0.00412, 0.00158, -0.00094, -0.01513, -0.00064, 0.00416,
-0.00191, -0.00607, 0.00161, 0.00289, -0.00416,
0.00096, 0.00321, -0.00672, -0.0029, 0.00129, -0.00032,
-0.00387, -0.00162, -0.00292, -0.01367, 0.00198,
0.00099, -0.00329, 0.00693, 0.00459, -0.00294, -0.00164,
0.00328, -0.00425, 0.00131, 0.00131, 0.00524, 0.00358,
-0.00422, -0.00065, -0.00359, 0.00229, 0, 0.00196,
-0.00065, -0.00391, -0.0108, 0.01291, -0.00098])
scaler2 = MinMaxScaler()
y_train = scaler2.fit_transform(y_train0.reshape(-1,1))
regr = linear_model.LinearRegression()
regr.fit(x_train.reshape(-1, 1), y_train.reshape(-1, 1))
print('Coefficients: ', regr.coef_, ' intercept ', regr.intercept_)
# same gradient-descent loop as before, but now on the scaled data
weight = tf.Variable(0.)
bias = tf.Variable(0.)
for e in range(epochs):
    with tf.GradientTape() as tape:
        y_pred = weight*x_train + bias
        loss = tf.reduce_mean(tf.square(y_pred - y_train))
    gradients = tape.gradient(loss, [weight, bias])
    weight.assign_sub(gradients[0]*learning_rate)
    bias.assign_sub(gradients[1]*learning_rate)
print(weight.numpy(), 'weight', bias.numpy(), 'bias')
# plot the fit in the original units by inverse-transforming the scaled predictions
plt.plot(x_train0, scaler2.inverse_transform(y_pred.numpy()).flatten(), 'r', label='model output')
plt.scatter(x_train0, y_train0, label='training dataset')
plt.legend()
plt.show()
Running this prints:
Coefficients: [[0.97913471]] intercept [-0.00420121]
0.96772194 weight 0.0018798028 bias
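If you also need the slope in the original units, you can undo the min-max scaling analytically instead of refitting (a minimal sketch using the scaler1 and scaler2 objects from the code above):
# MinMaxScaler maps v to (v - v_min) / (v_max - v_min), so a slope w_s fitted
# in scaled space corresponds to w_s * (y_max - y_min) / (x_max - x_min)
# in the original units.
x_range = scaler1.data_max_[0] - scaler1.data_min_[0]
y_range = scaler2.data_max_[0] - scaler2.data_min_[0]
print('weight in original units:', weight.numpy() * y_range / x_range)
For this run that works out to roughly 0.89, in line with the 0.8975 sklearn reports on the unscaled data.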
Upvotes: 1