Reputation: 8967
How to read a Parquet file using the Spark Core API?
I know Spark SQL has methods for reading Parquet files, but we cannot use Spark SQL in our project.
Do we have to use the newAPIHadoopFile method on JavaSparkContext to do this?
I am using Java to implement the Spark job.
Upvotes: 7
Views: 6010
Reputation: 41
Use the code below:
SparkSession spark = SparkSession.builder().master("yarn").appName("Application").enableHiveSupport().getOrCreate();
Dataset<Row> ds = spark.read().parquet(filename);
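Note that SparkSession and Dataset<Row> belong to Spark SQL, so the snippet above does not meet the "no Spark SQL" constraint in the question. If only the core API is allowed, newAPIHadoopFile on JavaSparkContext can be used with Parquet's Hadoop input format. Below is a minimal sketch of that route, not a definitive implementation. It assumes the parquet-hadoop and parquet-avro libraries are on the classpath and that reading rows as Avro GenericRecord is acceptable. The class name ReadParquetWithCoreApi, the helper readAsStrings, and the path argument are hypothetical names used only for illustration.

```java
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.mapreduce.Job;
import org.apache.parquet.avro.AvroReadSupport;
import org.apache.parquet.hadoop.ParquetInputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name; assumes parquet-hadoop and parquet-avro are available.
public class ReadParquetWithCoreApi {

    // Reads a Parquet file through the Hadoop input format and returns each row as a String.
    public static JavaRDD<String> readAsStrings(JavaSparkContext sc, String path) throws Exception {
        // Copy the Hadoop configuration and tell Parquet to materialize rows as Avro records.
        Job job = Job.getInstance(sc.hadoopConfiguration());
        ParquetInputFormat.setReadSupportClass(job, AvroReadSupport.class);

        // ParquetInputFormat is generic, so the class literal needs an unchecked cast
        // to satisfy the type bounds of newAPIHadoopFile.
        @SuppressWarnings("unchecked")
        Class<ParquetInputFormat<GenericRecord>> inputFormat =
                (Class<ParquetInputFormat<GenericRecord>>) (Class<?>) ParquetInputFormat.class;

        // Parquet's input format always produces a null (Void) key; the row is the value.
        JavaPairRDD<Void, GenericRecord> rows = sc.newAPIHadoopFile(
                path, inputFormat, Void.class, GenericRecord.class, job.getConfiguration());

        // GenericRecord is not java.io.Serializable, so convert to a serializable
        // type on the executors before collecting or shuffling.
        return rows.values().map(record -> record.toString());
    }

    public static void main(String[] args) throws Exception {
        // The Parquet path is taken from the command line; the master is expected
        // to come from spark-submit.
        SparkConf conf = new SparkConf().setAppName("ReadParquetWithCoreApi");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            for (String row : readAsStrings(sc, args[0]).take(10)) {
                System.out.println(row);
            }
        }
    }
}
```

Mapping each record to a String (or to your own serializable class) right after reading sidesteps serialization errors that can appear when GenericRecord objects are collected to the driver or shuffled.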
Upvotes: 2