Reputation: 647
This question is similar to this one. I would like to print the best model params after doing a TrainValidationSplit in pyspark. I cannot find the piece of text the other user uses to answer the question because I'm working on jupyter and the log dissapears from the terminal...
Part of the code is:
pca = PCA(inputCol = 'features')
dt = DecisionTreeRegressor(featuresCol=pca.getOutputCol(),
labelCol="energy")
pipe = Pipeline(stages=[pca,dt])
paramgrid = ParamGridBuilder().addGrid(pca.k, range(1,50,2)).addGrid(dt.maxDepth, range(1,10,1)).build()
tvs = TrainValidationSplit(estimator = pipe, evaluator = RegressionEvaluator(
labelCol="energy", predictionCol="prediction", metricName="mae"), estimatorParamMaps = paramgrid, trainRatio = 0.66)
model = tvs.fit(wind_tr_va);
Thanks in advance.
Upvotes: 2
Views: 3882
Reputation: 1517
Even simpler (1-line), just refer to the JVM object of your model
cvModel.bestModel.stages[-1]._java_obj.getMaxDepth()
Here you take your bestModel after cross-validation, call the JVM object of this model and extract maxDepth parameter using getMaxDepth()-method from the JVM object.
The list of all original JVM get-parameters can be found here https://spark.apache.org/docs/latest/api/java/org/apache/spark/ml/classification/RandomForestClassificationModel.html
Also, you can browse other get-parameters for other models and extract them referring to the original JVM object of any model
<yourModel>.stages[<yourModelStage>]._java_obj.<getParameter>()
Hope it helps.
Upvotes: 4
Reputation: 40370
It follows indeed the same reasoning described in the answer about How to get the maxDepth from a Spark RandomForestRegressionModel given by @user6910411.
You'll need to patch the TrainValidationSplitModel
, PCAModel
and DecisionTreeRegressionModel
as followed :
TrainValidationSplitModel.bestModel = (
lambda self: self._java_obj.bestModel
)
PCAModel.getK = (
lambda self: self._java_obj.getK()
)
DecisionTreeRegressionModel.getMaxDepth = (
lambda self: self._java_obj.getMaxDepth()
)
Now you can use it to get the best model and extract k
and maxDepth
bestModel = model.bestModel
bestModelK = bestModel.stages[0].getK()
bestModelMaxDepth = bestModel.stages[1].getMaxDepth()
PS: You can patch models to get specific parameters the same way described above.
Upvotes: 4