Reputation: 389
I get the following exception in a Spark application submitted with spark-submit (2.4.0):
User class threw exception: org.apache.spark.sql.AnalysisException: Multiple sources found for parquet (org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat, org.apache.spark.sql.execution.datasources.parquet.DefaultSource), please specify the fully qualified class name.;
My application is:
val sparkSession = SparkSession.builder()
  .config("spark.sql.warehouse.dir", warehouseLocation)
I can't figure out where this duplicate source for parquet is coming from.
Here is my spark-submit:
spark-submit-2.4.0 --master yarn-cluster \
  --files="/etc/hive/hive-site.xml" \
  --driver-class-path="/etc/hadoop/:/usr/lib/spark-packages/spark2.4.0/jars/:/usr/lib/spark-packages/spark2.4.0/lib/spark-assembly.jar:/usr/lib/hive/lib/"
Any suggestions?
Upvotes: 1
Views: 1609
Reputation: 389
The problem was a version mix-up: I was running spark-submit 2.4, but the default SPARK_HOME pointed to an older Spark installation. Posting this in case anyone else runs into the same issue.
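For anyone debugging something similar, here is a rough sketch of how one might spot mixed Spark versions on the class path. It rests on an assumption I have not verified for this cluster: that the second parquet source (the DefaultSource class in the error) comes from an older Spark jar such as a spark-assembly jar, which Spark 2.x no longer ships. The find_spark_jars helper is hypothetical, not part of Spark. It only lists matching jar files, so you still have to read the versions off the file names yourself.

```shell
#!/usr/bin/env bash
# Sketch only: list Spark SQL / assembly jars found in a colon-separated
# class path, so that jars from two different Spark installs stand out.
# find_spark_jars is a hypothetical helper, not something Spark provides.

find_spark_jars() {
  # $1: colon-separated list of directories and/or jar files
  local entries entry
  IFS=':' read -r -a entries <<< "$1"
  for entry in "${entries[@]}"; do
    [ -n "$entry" ] || continue
    if [ -d "$entry" ]; then
      # Directory entry: look at its top level only
      find "$entry" -maxdepth 1 -type f \
        \( -name 'spark-sql_*.jar' -o -name 'spark-assembly*.jar' \)
    elif [ -f "$entry" ]; then
      # File entry: report it only if it looks like a Spark jar
      case "$(basename "$entry")" in
        spark-sql_*.jar|spark-assembly*.jar) echo "$entry" ;;
      esac
    fi
  done | sort -u
}

# Example call (paths are illustrative):
#   find_spark_jars "$SPARK_HOME/jars:$SPARK_HOME/lib"
```

Running it on both the --driver-class-path value and the directories under $SPARK_HOME, and comparing the result with what spark-submit --version reports, should show whether two installations are being combined.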
Upvotes: 2