Whenever I read a CSV file, Spark always runs three jobs, each with a single stage, regardless of whether the file is small, large, or contains only a header row. My application performs no transformations or actions; it only loads the CSV.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class WordCount {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("Java Spark Application")
                .master("local")
                .getOrCreate();

        Dataset<Row> df = spark.read()
                .format("com.databricks.spark.csv")
                .option("inferSchema", "true")
                .option("header", "true")
                .load("/home/ist/OtherCsv/EmptyCSV.csv");

        spark.close();
    }
}
Spark UI images: (screenshots of the Jobs page, not shown, listing three jobs with one stage each)
Questions: why are three jobs (with one stage each) created when the application does nothing but load a CSV?
Answer:
By default, reading a csv, json, or parquet file creates two jobs, but if you enable inferSchema for a csv file it creates three: the extra job is a pass over the file's data to infer the column types.
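If the extra schema-inference job is the concern, one way to avoid it is to supply an explicit schema instead of setting inferSchema. A minimal sketch, assuming a hypothetical two-column file (the column names `id` and `name` and the path are illustrative, not from the original question):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class ReadCsvWithSchema {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("Java Spark Application")
                .master("local")
                .getOrCreate();

        // Declare the column names and types up front so Spark does not
        // need an extra pass over the data to infer them.
        StructType schema = new StructType()
                .add("id", DataTypes.IntegerType)
                .add("name", DataTypes.StringType);

        Dataset<Row> df = spark.read()
                .schema(schema)            // skips the inferSchema job
                .option("header", "true")
                .csv("/path/to/your.csv"); // adjust to your file

        df.printSchema();
        spark.close();
    }
}
```

With an explicit schema, `spark.read()` no longer needs to scan the file to determine types, so the inference job disappears from the Spark UI.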