Deepak

Reputation: 598

Unable to read Parquet files (created from Databricks job/notebook) from Azure Data Factory

I am getting an error while reading Parquet files created by Databricks on ADLS. When I read these files using Databricks it works perfectly fine, and I am able to read and write data in these files from Databricks. However, Data Factory gives the error below.

Error: Parquet file contained column 'txn', which is of a non-primitive, unsupported type.

However, I did not create any txn column from Databricks.

Upvotes: 0

Views: 3731

Answers (3)

Hick

Reputation: 459

You are probably reading all files recursively. Databricks Delta creates a `_delta_log` folder alongside the data Parquet files, and within that folder there are additional Parquet files (checkpoints) with extra columns, hence the txn.

So, remove the "Recursively" option from the source of the Copy Data activity.

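The folder layout that causes this can be illustrated with plain Python. This is a minimal sketch, assuming the standard Delta Lake convention of a `_delta_log` subfolder holding checkpoint Parquet files (with columns such as `txn`, `add`, `remove`); the helper name is my own, not an ADF or Databricks API:

```python
from pathlib import Path

def data_parquet_files(table_root: str) -> list[str]:
    """Return only the data-level Parquet files of a Delta table,
    skipping the _delta_log folder whose checkpoint Parquet files
    carry extra columns such as 'txn'."""
    root = Path(table_root)
    return sorted(
        str(p)
        for p in root.rglob("*.parquet")
        if "_delta_log" not in p.parts
    )
```

If you must read recursively for some other reason, a path filter like this (or a wildcard path in the ADF dataset that excludes `_delta_log`) achieves the same effect as turning the option off.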

Upvotes: 0

Mark Kromer MSFT

Reputation: 3838

When you have complex data types in your Parquet source files in ADF, you need to use a data flow without a dataset schema. Then you can work with structs, maps, arrays, etc.: https://www.youtube.com/watch?v=Wk0C76wnSDE

Upvotes: 0

Vamsi Bitra

Reputation: 2764

This error mainly happens because of an unsupported data type. When you map columns from the Parquet file, make sure they use supported data types.

For the supported data type mapping for Parquet files, refer to this Microsoft document.
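As a hedged sketch of that check (the set of supported primitive Parquet types below is illustrative, drawn from the idea of the mapping document rather than an authoritative copy of it, and the helper is hypothetical), you could validate a schema before the copy runs:

```python
# Primitive Parquet physical types that the Copy activity can handle;
# nested/complex types (struct, map, list) are what trigger the error.
# NOTE: illustrative list only; consult the Microsoft mapping document
# for the authoritative set.
SUPPORTED_PRIMITIVES = {
    "boolean", "int32", "int64", "int96",
    "float", "double", "binary", "fixed_len_byte_array",
}

def unsupported_columns(schema: dict[str, str]) -> list[str]:
    """Return names of columns whose declared Parquet type is not a
    supported primitive (e.g. a Delta checkpoint's 'txn' struct)."""
    return [
        name
        for name, parquet_type in schema.items()
        if parquet_type.lower() not in SUPPORTED_PRIMITIVES
    ]
```

Running such a check against the file ADF is actually pointed at would immediately surface a column like `txn` as the culprit.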

Upvotes: 1
