Reputation: 2821
I am trying to read a Delta / Parquet table in Databricks using the following code:
df3 = spark.read.format("delta").load('/mnt/lake/CUR/CURATED/origination/company/opportunities_final/curorigination.presentation.parquet')
However, I'm getting the following error:
A partition path fragment should be the form like `part1=foo/part2=bar`. The partition path: curorigination.presentation.parquet
This seemed very straightforward, so I'm not sure why I'm getting the error.
Any thoughts?
The file structure looks like the following
Upvotes: 1
Views: 10052
Reputation: 2043
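The error shows that Delta Lake is treating the last part of your path as a partition directory, and curorigination.presentation.parquet is not a valid partition name. The delta reader expects the root directory of the table, not an individual parquet file inside it.
If your Delta table has partition columns, for example year, month and day, the data files sit under paths like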
/mnt/lake/CUR/CURATED/origination/company/opportunities_final/year=yyyy/month=mm/day=dd/curorigination.presentation.parquet
and you only need to load the table root:
df = spark.read.format("delta").load("/mnt/lake/CUR/CURATED/origination/company/opportunities_final")
If you only want to read it as plain parquet, you can do
df = spark.read.parquet("/mnt/lake/CUR/CURATED/origination/company/opportunities_final")
because Spark reads every parquet file under that directory, so you don't need to give the full path to an individual parquet file.
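As a rough sketch of the idea above, the directory to pass to load() can be derived from a file path by dropping the file name and any key=value partition folders. The helper name table_root is hypothetical, and it assumes the last path component is a data file ending in .parquet rather than a directory; check your own layout before relying on it.

```python
import posixpath


def table_root(path):
    """Return the directory to hand to load() for a path inside a table.

    Hypothetical helper: assumes a trailing *.parquet component is a data
    file and that partition folders are named key=value.
    """
    path = path.rstrip("/")
    # Drop the data file name, if the path points at one
    if posixpath.basename(path).endswith(".parquet"):
        path = posixpath.dirname(path)
    # Drop partition folders such as year=2023/month=01/day=05
    while "=" in posixpath.basename(path):
        path = posixpath.dirname(path)
    return path


root = table_root(
    "/mnt/lake/CUR/CURATED/origination/company/opportunities_final/"
    "curorigination.presentation.parquet"
)
# In a Databricks notebook, where `spark` is predefined, you could then run:
# df = spark.read.format("delta").load(root)
```

Here root ends up as /mnt/lake/CUR/CURATED/origination/company/opportunities_final, which is the same path used in the load() call above.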
Upvotes: 3
Reputation: 2764
The above error mainly happens because the path ends in curorigination.presentation.parquet, which is not a valid Delta table location. Please check your Delta location and whether the Delta table has actually been created there:
%fs ls /mnt/lake/CUR/CURATED/origination/company/opportunities_final/
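What to look for in that listing is a _delta_log folder, which a Delta table keeps in its root directory. A minimal sketch of the same check in Python follows. The function name has_delta_log is made up for illustration, and it assumes the directory is reachable as a local path (on Databricks, mounts are usually visible under /dbfs); the demonstration uses a throwaway temporary directory instead of the real mount.

```python
import os
import tempfile


def has_delta_log(local_dir):
    """Return True if local_dir contains a _delta_log folder."""
    return os.path.isdir(os.path.join(local_dir, "_delta_log"))


# Demonstration on a temporary directory, not on the real mount
with tempfile.TemporaryDirectory() as tmp:
    before = has_delta_log(tmp)                 # plain folder, no transaction log
    os.mkdir(os.path.join(tmp, "_delta_log"))
    after = has_delta_log(tmp)                  # now it looks like a Delta table root
```

If the folder is missing at the location you are loading, the data there is probably plain parquet and should be read with spark.read.parquet instead.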
I reproduced the same thing in my environment. First, I created a DataFrame from a parquet file.
df1 = spark.read.format("parquet").load("/FileStore/tables/")
display(df1)
After that, I wrote the DataFrame out in Delta format to /mnt/lake/CUR/CURATED/origination/company/opportunities_final/demo_delta1:
df1.coalesce(1).write.format('delta').mode("overwrite").save("/mnt/lake/CUR/CURATED/origination/company/opportunities_final/demo_delta1")
# Read the Delta table back from the same directory it was saved to
df3 = spark.read.format("delta").load("/mnt/lake/CUR/CURATED/origination/company/opportunities_final/demo_delta1")
display(df3)
Upvotes: 0