Marcin U

Reputation: 1

Databricks AutoLoader - how to handle Spark's transactional write (_SUCCESS file) on Azure Data Lake Storage?

The Databricks Spark write method (df.write.parquet) for Parquet files is transactional. After a successful write to Azure Data Lake Storage, a _SUCCESS file is created in the path where the Parquet files were written.

Example of the folder on ADLS including the _SUCCESS file: [image: ADLS folder listing showing the Parquet part files and the _SUCCESS marker]
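
For context, a minimal sketch of the kind of batch write that produces this marker; the DataFrame and the abfss path below are just placeholders:

# Placeholder example: a successful batch write creates the _SUCCESS file in the target folder
df = spark.range(10)  # any DataFrame
df.write.mode("overwrite").parquet("<abfss://container@account.dfs.core.windows.net/path/to/parq_files>")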

Is it possible to configure AutoLoader to load Parquet files only when the write finished successfully (a _SUCCESS file appeared in the folder)? In other words, if the folders listed by AutoLoader don't contain a _SUCCESS file, the Parquet files from those folders shouldn't be processed by AutoLoader.
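
For reference, a minimal sketch of the kind of AutoLoader read I mean; the paths and schema location are placeholders:

# Placeholder AutoLoader (cloudFiles) read of Parquet files from the landing folder
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "parquet")
      .option("cloudFiles.schemaLocation", "<path/to/schema_location>")
      .load("<path/to/landing_folder>"))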

I was looking for the right option in the documentation, but it seems like none of the available options can help me.

Upvotes: 0

Views: 305

Answers (1)

I agree with @JayashankarGS. The AutoLoader feature in Databricks allows you to automatically load data from a path into a Delta table as new files are added to that path. However, there is no built-in option in AutoLoader to conditionally load only Parquet files that have a corresponding _SUCCESS file in the folder.

If you want to ensure that only Parquet files from a successful write (one accompanied by a _SUCCESS file) are loaded, wrap the AutoLoader logic in a conditional check: if the _SUCCESS file is found, load the Parquet files; if it is not found, indicating an incomplete write, skip the loading step.

You can try the following:

parquet_path = "<Path/to/parq_files>"
fs = spark._jvm.org.apache.hadoop.fs.FileSystem.get(spark._jsc.hadoopConfiguration())  # Hadoop FileSystem bound to the cluster's configuration
success_file_exists = fs.exists(spark._jvm.org.apache.hadoop.fs.Path(parquet_path + "/_SUCCESS"))  # True only if the _SUCCESS marker is present

Reference: apache spark - check if file exists
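
For illustration, a minimal sketch of how that check could gate the AutoLoader load itself; the schema location, checkpoint location, and target table name are placeholders, and trigger(availableNow=True) is just one way to run the stream as a batch-style job:

# Start the AutoLoader stream only if the _SUCCESS marker was found above
if success_file_exists:
    (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .option("cloudFiles.schemaLocation", "<Path/to/schema_location>")
        .load(parquet_path)
        .writeStream
        .option("checkpointLocation", "<Path/to/checkpoint>")
        .trigger(availableNow=True)  # process the files currently in the folder, then stop
        .toTable("<target_delta_table>"))
else:
    print(f"No _SUCCESS file in {parquet_path} - skipping this load")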

Upvotes: 0
