org.apache.spark.ml.regression.LinearRegression: fit, train, and predict

Question

Consider the following Spark MLlib code taken from the documentation:

import org.apache.spark.ml.regression.LinearRegression

// Load training data
val training = spark.read.format("libsvm")
  .load("data/mllib/sample_linear_regression_data.txt")

val lr = new LinearRegression()
  .setMaxIter(10)
  .setRegParam(0.3)
  .setElasticNetParam(0.8)

// Fit the model
val lrModel = lr.fit(training)

// Print the coefficients and intercept for linear regression
println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")

// Summarize the model over the training set and print out some metrics
val trainingSummary = lrModel.summary
println(s"numIterations: ${trainingSummary.totalIterations}")
println(s"objectiveHistory: [${trainingSummary.objectiveHistory.mkString(",")}]")
trainingSummary.residuals.show()
println(s"RMSE: ${trainingSummary.rootMeanSquaredError}")
println(s"r2: ${trainingSummary.r2}")

I see there is a fit method, which works similarly to train. But I can find no predict method in the API docs.

Is there supposed to be no predict function? I know I can make a prediction by taking the dot product of the model's coefficients and the point I want a prediction for, then adding the model's intercept.

But is that what the library writers expect people to do?

Alberto Bonsanto · Accepted Answer

The method you are looking for is transform, which is part of almost all ML models. It takes a DataFrame containing a features column (named features by default) and returns a new DataFrame with a prediction column appended.
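
As a rough sketch of how this looks with the model from the question: the code below calls transform on the training DataFrame and, for one row, compares the result with the manual dot product plus intercept that the question describes. The local SparkSession setup and the app name are assumptions for running it as a standalone script (in spark-shell, spark already exists), and the column names rely on the defaults features and prediction.

```scala
import org.apache.spark.ml.linalg.Vector
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.sql.SparkSession

// Assumption: a local session for a standalone run; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .appName("LinearRegressionTransformSketch") // hypothetical app name
  .master("local[*]")
  .getOrCreate()

// Same data and settings as in the question.
val training = spark.read.format("libsvm")
  .load("data/mllib/sample_linear_regression_data.txt")

val lrModel = new LinearRegression()
  .setMaxIter(10)
  .setRegParam(0.3)
  .setElasticNetParam(0.8)
  .fit(training)

// transform returns a new DataFrame with a "prediction" column appended.
val predictions = lrModel.transform(training)
predictions.select("features", "label", "prediction").show(5)

// Manual check on the first row: dot(coefficients, features) + intercept.
val first = predictions.select("features", "prediction").head()
val x = first.getAs[Vector]("features")
val manual = lrModel.coefficients.toArray
  .zip(x.toArray)
  .map { case (w, xi) => w * xi }
  .sum + lrModel.intercept

println(s"transform: ${first.getAs[Double]("prediction")} manual: $manual")
```

The two printed values should agree up to floating-point rounding, so the manual computation is valid, but transform is the intended way to score a whole DataFrame at once.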
