Reputation: 575
I ran this program: https://github.com/backstopmedia/tensorflowbook/blob/master/chapters/04_machine_learning_basics/linear_regression.py
I added "print("w=", W.eval(), "b=", b.eval())" after line 55 of that program. The result I got is:
w= [[ 3.5245235 ] [ 1.50171268]] b= 1.14499
So y = 3.5245235*x1 + 1.50171268*x2 + 1.14499.
I then ran the Spark example https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/ml/JavaLinearRegressionWithElasticNetExample.java on the same data as the program above (the file format is shown below). The result is:
Coefficients: [0.3827266230806965,5.1690760222564425] Intercept: 82.22008153614573 numIterations: 6 objectiveHistory: [0.5,0.41583549697777683,0.15548328325638935,0.15439025905767773,0.15432368309706285,0.15432368309449543]
So y = 0.3827266230806965*x1 + 5.1690760222564425*x2 + 82.22008153614573.
Why are the results so different for the same problem? The data I used for the Spark program is in this format:
354 1:84 2:46
190 1:73 2:20
405 1:65 2:52
263 1:70 2:30
451 1:76 2:57
302 1:69 2:25
288 1:63 2:28
385 1:72 2:36
402 1:79 2:57
365 1:75 2:44
209 1:27 2:24
290 1:89 2:31
346 1:65 2:52
254 1:57 2:23
395 1:59 2:60
434 1:69 2:48
220 1:60 2:34
374 1:79 2:51
308 1:75 2:50
220 1:82 2:34
311 1:59 2:46
181 1:67 2:23
274 1:85 2:37
303 1:55 2:40
244 1:63 2:30
Upvotes: 1
Views: 318
Reputation: 575
See Tensorflow on simple linear regression. The code has the same issue, and the fix is the same as in the answer there. Also, the learning_rate is too small (set it to .001), and the number of steps needs to be 100000. Choosing these initial values is very technical (an expert explained this to me).
Upvotes: 0
Reputation: 66835
The answer is quite simple: the Spark model is not linear regression.
Linear regression minimizes || y - Wx ||_2^2
The Spark model is an elastic net, which minimizes || y - Wx ||_2^2 + a1 || W ||_2^2 + a2 || W ||_1
If you want this Spark code to be a linear regression, remove the regularization settings:
.setRegParam(0.3)
.setElasticNetParam(0.8);
and increase the number of iterations to make sure it converges.
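To see how a penalty term changes the fitted coefficients, here is a minimal sketch in plain Python using the data from the question. It is not Spark's implementation: it only covers the squared-error term plus an L2 penalty, solved in closed form, and the names fit, l2 and ROWS are my own. Spark also standardizes features by default and scales its penalty differently, so the numbers will not match Spark's output exactly. With l2=0 this is ordinary least squares; with l2 > 0 the weights shrink.

```python
# Data from the question, one (y, x1, x2) tuple per row.
ROWS = [
    (354, 84, 46), (190, 73, 20), (405, 65, 52), (263, 70, 30), (451, 76, 57),
    (302, 69, 25), (288, 63, 28), (385, 72, 36), (402, 79, 57), (365, 75, 44),
    (209, 27, 24), (290, 89, 31), (346, 65, 52), (254, 57, 23), (395, 59, 60),
    (434, 69, 48), (220, 60, 34), (374, 79, 51), (308, 75, 50), (220, 82, 34),
    (311, 59, 46), (181, 67, 23), (274, 85, 37), (303, 55, 40), (244, 63, 30),
]


def fit(rows, l2=0.0):
    """Fit y = w1*x1 + w2*x2 + b by minimizing
    (1/2n) * sum((y - w1*x1 - w2*x2 - b)^2) + (l2/2) * (w1^2 + w2^2).
    The intercept b is not penalized. l2=0 gives ordinary least squares."""
    n = len(rows)
    y_bar = sum(r[0] for r in rows) / n
    x1_bar = sum(r[1] for r in rows) / n
    x2_bar = sum(r[2] for r in rows) / n

    # Centered second moments (covariances), each divided by n.
    s11 = sum((r[1] - x1_bar) ** 2 for r in rows) / n
    s22 = sum((r[2] - x2_bar) ** 2 for r in rows) / n
    s12 = sum((r[1] - x1_bar) * (r[2] - x2_bar) for r in rows) / n
    s1y = sum((r[1] - x1_bar) * (r[0] - y_bar) for r in rows) / n
    s2y = sum((r[2] - x2_bar) * (r[0] - y_bar) for r in rows) / n

    # Solve the 2x2 system (S + l2*I) w = s_xy with Cramer's rule.
    a11, a22 = s11 + l2, s22 + l2
    det = a11 * a22 - s12 * s12
    w1 = (s1y * a22 - s12 * s2y) / det
    w2 = (a11 * s2y - s12 * s1y) / det

    # The unpenalized intercept makes the fit pass through the means.
    b = y_bar - w1 * x1_bar - w2 * x2_bar
    return w1, w2, b


print("no penalty:", fit(ROWS))
print("l2 = 10:   ", fit(ROWS, l2=10.0))
```

Comparing the two printed lines shows how much of a difference a penalty of a given size makes on this data, which is a quick way to check whether regularization alone can explain a gap between two sets of coefficients.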
Upvotes: 1