Reputation: 499
I had a DataFrame which I wrote to CSV using the code below:
df.write.format("csv").save(base_path+"avg.csv")
As I am running Spark in client mode, the snippet above created a folder named avg.csv on my worker node. The folder contains part-*.csv files, either directly or inside a nested folder.
Now when I try to read avg.csv, I get a "path does not exist" error.
df.read.format("com.databricks.spark.csv").load(base_path+"avg.csv")
Can anybody tell me what I am doing wrong?
Upvotes: 1
Views: 197
Reputation: 1588
part-00* files are the output of distributed computation frameworks (like MapReduce or Spark). When you save, a folder of part files is always created, because each task writes its own partition of the data. Keep this in mind: avg.csv is a directory, not a single file.
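So, try reading with a glob that matches the part files inside the folder. Note that read is available on the SparkSession (or SQLContext), not on a DataFrame, so spark below stands for your own session object:
spark.read.format("com.databricks.spark.csv").load(base_path+"avg.csv/*")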
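To make the folder layout concrete, here is a minimal plain-Python sketch (deliberately not Spark, so it runs anywhere) that mimics what a distributed writer leaves behind and then reads it back with a glob. The helper names write_part_files and read_part_files, the temporary base_path, and the sample rows are all made up for illustration; a real Spark job names its part files differently and may write them on another machine.

```python
import csv
import glob
import os
import tempfile


def write_part_files(out_dir, partitions):
    """Mimic a distributed writer: one part file per partition, plus a _SUCCESS marker."""
    os.makedirs(out_dir)
    for i, rows in enumerate(partitions):
        part_path = os.path.join(out_dir, "part-%05d.csv" % i)
        with open(part_path, "w", newline="") as f:
            csv.writer(f).writerows(rows)
    # Empty marker file, similar to the one Hadoop-style writers leave behind.
    open(os.path.join(out_dir, "_SUCCESS"), "w").close()


def read_part_files(out_dir):
    """Read every part file under out_dir, in name order, and return all rows."""
    rows = []
    for part_path in sorted(glob.glob(os.path.join(out_dir, "part-*"))):
        with open(part_path, newline="") as f:
            rows.extend(csv.reader(f))
    return rows


# Hypothetical stand-in for base_path: a fresh temporary directory.
base_path = tempfile.mkdtemp() + os.sep
out_dir = base_path + "avg.csv"  # a directory, despite the .csv suffix

# Two "partitions" of sample data, written as two part files.
write_part_files(out_dir, [[["a", "1"], ["b", "2"]], [["c", "3"]]])

print(os.path.isdir(out_dir))       # the saved "file" is really a folder
print(sorted(os.listdir(out_dir)))  # marker file plus one file per partition
print(read_part_files(out_dir))     # rows gathered from all part files
```

The point is only that the saved path is a directory and that the data lives in the part files under it. If the glob still reports a missing path in Spark, it is worth checking whether the folder exists on the machine (or file system) the read actually runs against.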
Upvotes: 2