Reputation: 273
I would like some insight into the method used to parallelize logistic regression in the ML library. I have already looked at the source code, but I didn't understand the process.
Upvotes: 0
Views: 785
Reputation: 5210
Spark uses so-called mini-batch gradient descent for regression:
http://ruder.io/optimizing-gradient-descent/index.html#minibatchgradientdescent
In a nutshell, it works like this:
The actual optimisation code in Spark starts at this line: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/optimization/GradientDescent.scala#L234
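To make the idea more concrete, here is a minimal sketch in plain Python (not Spark code) of how mini-batch gradient descent for logistic regression can be parallelized. The data is split into partitions, each partition sums the gradients of a random sample of its own rows, and the partial sums are combined in one place to update the weights. The function names (`partition_gradient`, `minibatch_gd`), the default parameter values, and the step size that shrinks with the square root of the iteration number are assumptions made for illustration, so treat it as a rough picture rather than what Spark actually does internally.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def partition_gradient(partition, weights, fraction, rng):
    """Work done for one partition: sum the gradients of a random sample of its rows."""
    grad = [0.0] * len(weights)
    count = 0
    for features, label in partition:
        if rng.random() >= fraction:
            continue  # row not selected for this mini-batch
        z = sum(w * x for w, x in zip(weights, features))
        error = sigmoid(z) - label  # derivative of the log loss w.r.t. z
        for j, x in enumerate(features):
            grad[j] += error * x
        count += 1
    return grad, count


def minibatch_gd(partitions, num_features, step_size=1.0,
                 fraction=0.5, iterations=200, seed=42):
    """Hypothetical driver loop: gather partial gradients, then update the weights."""
    rng = random.Random(seed)
    weights = [0.0] * num_features
    for i in range(1, iterations + 1):
        # "Map" step: every partition computes its partial gradient independently.
        # Here this is a plain loop; on a cluster it would run in parallel.
        partials = [partition_gradient(p, weights, fraction, rng)
                    for p in partitions]
        # "Reduce" step: add up the partial gradients and the sample counts.
        total = [sum(g[j] for g, _ in partials) for j in range(num_features)]
        n = sum(c for _, c in partials)
        if n == 0:
            continue  # nothing was sampled in this iteration
        # Assumed schedule: the step size decays with the iteration number.
        lr = step_size / math.sqrt(i)
        weights = [w - lr * g / n for w, g in zip(weights, total)]
    return weights


# Toy data: each row is ([bias, x], label), with label 1 when x > 0.
partitions = [
    [([1.0, -3.0], 0), ([1.0, -1.0], 0), ([1.0, 2.0], 1)],
    [([1.0, -2.0], 0), ([1.0, 1.0], 1), ([1.0, 3.0], 1)],
]
weights = minibatch_gd(partitions, num_features=2)
print(weights)
```

The point of the split is that only the small gradient vector and the sample count travel between the workers and the driver in each iteration, while the rows themselves stay in their partitions.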
Upvotes: 3