PySpark on Windows: Hive issues

Question

I'm trying to run LogisticRegressionWithLBFGS from MLlib, and I get several Hive-related errors:

py4j.protocol.Py4JJavaError: An error occurred while calling o337.trainLogisticRegressionModelWithLBFGS.
: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;

I never installed Hive, so why does this function depend on it? The documentation doesn't mention it anywhere. Is installing Hive a prerequisite for running any MLlib function?
