harin pagidimuntala

Reputation: 1

Azure Synapse dedicated SQL pool exports to an ADLS storage account via PolyBase have .parq extensions instead of .parquet

When we export data from an Azure Synapse dedicated SQL pool to an ADLS storage account via PolyBase, the files are written with a .parq extension instead of .parquet.


CREATE EXTERNAL DATA SOURCE [SomeExternalDataSourcename] WITH(TYPE=HADOOP, LOCATION=N'abfss://[email protected]/foldername') ;

CREATE EXTERNAL FILE FORMAT [ff_Parquet] WITH (FORMAT_TYPE = PARQUET);

CREATE EXTERNAL TABLE [staging_schema].[table] WITH(LOCATION='folder/schema_table', DATA_SOURCE=[SomeExternalDataSourcename], FILE_FORMAT=[ff_Parquet]) AS SELECT * FROM [schema].[table];

The exported files end in .parq, but we were expecting a .parquet extension. Is there any way to generate the exports directly with .parquet extensions?

Upvotes: 0

Views: 224

Answers (1)

Aswin

Reputation: 7156

A dedicated SQL pool only creates .parq files when an external table is used to write the file. I tried it and also got only .parq files. Searching for similar issues, I found a thread on the Microsoft Q&A platform (see the reference below). Unfortunately, the file extension cannot be changed to the standard .parquet when writing to ADLS Gen2 from a dedicated SQL pool. The only workaround is to rename the files afterwards. You can use an ADF pipeline to do the renaming:

  • Get the list of files using a Get Metadata activity.
  • Add a ForEach activity over that list, with a Copy activity inside it. As the source, take the files from the Get Metadata output whose names end in .parq. In the sink dataset, split the source file name on the dot (.) and use <filename>.parquet as the file name.
  • Delete the source files using a Delete activity. The Delete activity must run sequentially after the Copy activity.
  • Execute the pipeline.
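
If a script is easier than a pipeline, the same rename can be sketched in Python. This is not the ADF approach described above but a swapped-in alternative, and it rests on assumptions: `parquet_name` and `rename_parq_files` are hypothetical helper names, and `file_system_client` is assumed to be a `FileSystemClient` from the `azure-storage-file-datalake` package pointing at the container that holds the export folder. Treat it as a rough sketch, not a tested production script.

```python
from pathlib import PurePosixPath


def parquet_name(path: str) -> str:
    """Return `path` with a final .parq extension replaced by .parquet.

    Paths that do not end in .parq are returned unchanged.
    """
    p = PurePosixPath(path)
    if p.suffix != ".parq":
        return path
    return str(p.with_suffix(".parquet"))


def rename_parq_files(file_system_client, folder: str) -> list:
    """Rename every .parq file under `folder` and return (old, new) pairs.

    Assumption: `file_system_client` behaves like
    azure.storage.filedatalake.FileSystemClient (get_paths, get_file_client,
    file_system_name). It is passed in so this block needs no SDK import.
    """
    renamed = []
    # Materialise the listing first so renames do not disturb the iteration.
    for item in list(file_system_client.get_paths(path=folder)):
        if item.is_directory:
            continue
        target = parquet_name(item.name)
        if target == item.name:
            continue  # not a .parq file, leave it alone
        file_client = file_system_client.get_file_client(item.name)
        # rename_file expects the new name as "<filesystem>/<path>".
        file_client.rename_file(f"{file_system_client.file_system_name}/{target}")
        renamed.append((item.name, target))
    return renamed
```

With the question's layout, calling `rename_parq_files(file_system_client, "folder/schema_table")` after the CETAS statement finishes would be the intended use. Unlike the Copy plus Delete steps, a rename does not rewrite the file contents, which should make it cheaper on large exports.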

You can also submit feedback about this behavior on the Microsoft Azure feedback platform.

Reference: https://learn.microsoft.com/en-us/answers/questions/1200090/possible-bug-or-issue-in-synapse-dedicated-sql-poo

Upvotes: 0
