Patterson

Reputation: 2821

Unable to read Databricks Delta / Parquet File with Delta Format

I am trying to read a Delta/Parquet file in Databricks using the following code:

df3 = spark.read.format("delta").load('/mnt/lake/CUR/CURATED/origination/company/opportunities_final/curorigination.presentation.parquet')

However, I'm getting the following error:

A partition path fragment should be the form like `part1=foo/part2=bar`. The partition path: curorigination.presentation.parquet

This seemed very straightforward, so I'm not sure why I'm getting the error.

Any thoughts?

The file structure looks like the following: [screenshot of the mounted directory listing]

Upvotes: 1

Views: 10052

Answers (2)

Jonathan

Reputation: 2043

The error shows that Delta Lake thinks you have an invalid partition path: when the Delta reader is pointed below the table root, it tries to parse the extra path segments as partition fragments of the form part1=foo/part2=bar, and curorigination.presentation.parquet doesn't match that form.

If your Delta table has partition columns, for example year, month, and day, the path of an individual file will look like

/mnt/lake/CUR/CURATED/origination/company/opportunities_final/year=yyyy/month=mm/day=dd/curorigination.presentation.parquet

and you just need to load the table root:

df = spark.read.format("delta").load("/mnt/lake/CUR/CURATED/origination/company/opportunities_final")
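For context, a layout like year=yyyy/month=mm/day=dd is produced when the table is written with partitionBy. A minimal sketch of such a write (the partition column names here are hypothetical, and df stands for whatever dataframe is being written; substitute the columns your table is actually partitioned by):

# Hypothetical example: writing a Delta table partitioned by year/month/day,
# which creates the year=yyyy/month=mm/day=dd directory layout shown above
(df.write
   .format("delta")
   .partitionBy("year", "month", "day")
   .mode("overwrite")
   .save("/mnt/lake/CUR/CURATED/origination/company/opportunities_final"))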

If you want to read it as plain parquet instead, you can do

df = spark.read.parquet("/mnt/lake/CUR/CURATED/origination/company/opportunities_final")

because you don't need to pass the path of an individual parquet file; pointing at the directory is enough.
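If you're not sure whether the directory actually holds a Delta table, one way to check is DeltaTable.isDeltaTable from the delta-spark API (a minimal sketch; it assumes the Delta Lake library is available on the cluster, which it is on Databricks):

from delta.tables import DeltaTable

# True only if the path contains a valid Delta table (i.e. a _delta_log)
path = "/mnt/lake/CUR/CURATED/origination/company/opportunities_final"
print(DeltaTable.isDeltaTable(spark, path))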

Upvotes: 3

Vamsi Bitra

Reputation: 2764

The above error mainly happens because of the incorrect path fragment curorigination.presentation.parquet. Please check your Delta location, and also check whether the Delta files were actually created:

%fs ls /mnt/lake/CUR/CURATED/origination/company/opportunities_final/  
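You can do the same check from Python; a Delta table directory will contain a _delta_log folder (a minimal sketch using dbutils, which is available in Databricks notebooks):

# List the table directory and check for the _delta_log folder that every
# Delta table contains; the path is the one from the question
files = dbutils.fs.ls("/mnt/lake/CUR/CURATED/origination/company/opportunities_final/")
print(any(f.name.rstrip("/") == "_delta_log" for f in files))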

I reproduced the same thing in my environment. First, I created a dataframe from a parquet file:

df1 = spark.read.format("parquet").load("/FileStore/tables/")
display(df1)


After that, I converted the parquet data to Delta format and saved it to the location /mnt/lake/CUR/CURATED/origination/company/opportunities_final/demo_delta1:

df1.coalesce(1).write.format('delta').mode("overwrite").save("/mnt/lake/CUR/CURATED/origination/company/opportunities_final/demo_delta1")


# Reading the delta file
df3 = spark.read.format("delta").load("/mnt/lake/CUR/CURATED/origination/company/opportunities_final/demo_delta1")
display(df3)


Upvotes: 0
