AttributeError: 'str' object has no attribute 'fit' - Pyspark

Question

I am trying to run the script below in PySpark3 and I receive the error message that follows. I am guessing this has something to do with formatting, but I am not sure how to fix it. Any help would be much appreciated.

train,test = df.randomSplit([0.7,0.3])

models = ["LinearRegression()","DecisionTreeRegressor()","RandomForestRegressor()","GBTRegressor()"]

for model in models:

    # Fit our model
    M = model
    fitModel = M.fit(train)

    # Load the Summary
    trainingSummary = fitModel.summary

    # trainingSummary.residuals.show()
    print("Training RMSE: %f" % trainingSummary.rootMeanSquaredError)
    print("Training r2: %f" % trainingSummary.r2)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
 in ()
      8     # Fit our model
      9     M = model
---> 10     fitModel = M.fit(train)
     11 
     12     # Load the Summary

AttributeError: 'str' object has no attribute 'fit'

Statmonger · Accepted Answer

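The error occurs because the list holds strings such as "LinearRegression()" rather than estimator objects, and a str has no fit method. Put the instantiated estimators in the list and move the fit-and-report step into a function. I think this way is actually more efficient.

This way you can still iterate through a list.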

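from pyspark.ml.evaluation import RegressionEvaluator
from pyspark.ml.regression import (LinearRegression, DecisionTreeRegressor,
                                   RandomForestRegressor, GBTRegressor)

def ClassTrainEval(model):
    # Fit the estimator on the training split
    fitModel = model.fit(train)

    # Tree-based regression models do not expose .summary, so score the
    # training predictions with an evaluator instead.
    # Assumes the default column names "features", "label" and "prediction".
    predictions = fitModel.transform(train)
    evaluator = RegressionEvaluator()

    print("Training RMSE: %f" % evaluator.evaluate(predictions, {evaluator.metricName: "rmse"}))
    print("Training r2: %f" % evaluator.evaluate(predictions, {evaluator.metricName: "r2"}))

# Estimator instances, not strings
models = [LinearRegression(), DecisionTreeRegressor(), RandomForestRegressor(), GBTRegressor()]

for model in models:
    ClassTrainEval(model)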
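
The root cause can be reproduced without Spark. The sketch below is only an illustration: DummyRegressor, try_fit and registry are hypothetical names invented here, not part of PySpark. It shows that a string which looks like a constructor call has no fit method, that an actual instance does, and one possible way to keep model names as strings (for example when they come from a config file) by mapping them to classes explicitly.

```python
class DummyRegressor:
    """Hypothetical stand-in for a Spark estimator; not part of PySpark."""

    def fit(self, data):
        # A real estimator would return a fitted model; this returns the mean.
        return sum(data) / len(data)


def try_fit(candidate, data):
    """Return the fit result, or the error message if candidate cannot fit."""
    try:
        return candidate.fit(data)
    except AttributeError as exc:
        return str(exc)


data = [1.0, 2.0, 3.0]

# A string that merely looks like a constructor call has no .fit method.
as_string = try_fit("DummyRegressor()", data)

# An actual instance does.
as_instance = try_fit(DummyRegressor(), data)

# If names must stay as strings, map them to classes and instantiate on lookup.
registry = {"DummyRegressor": DummyRegressor}
from_name = try_fit(registry["DummyRegressor"](), data)

print(as_string)
print(as_instance, from_name)
```

An explicit dictionary is used here rather than eval on the string, since eval would run whatever text happens to be in the list.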
