Andy Jones
Andy Jones

Reputation: 1482

Azure SQL Data Warehouse Polybase Query to Azure Data Lake Gen 2 returns zero rows

Why does an Azure SQL Data Warehouse Polybase Query to Azure Data Lake Gen 2 return many rows for a single file source, but zero rows for the parent folder source?

I created:

Everything works fine when my external table points to a specific file, i.e.

CREATE EXTERNAL TABLE [ext].[Time]
(
    [TimeID] int NOT NULL,
    [HourNumber] tinyint NOT NULL,
    [MinuteNumber] tinyint NOT NULL,
    [SecondNumber] tinyint NOT NULL,
    [TimeInSecond] int NOT NULL,
    [HourlyBucket] varchar(15) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
)
WITH
(
    LOCATION = '/Time/time001.txt',
    DATA_SOURCE = ADLSDataSource,
    FILE_FORMAT = uncompressedcsv,
    REJECT_TYPE = value,
    REJECT_VALUE = 2147483647
);
SELECT * FROM [ext].[Time];

Many rows returned, therefore I am confident all items mentioned above are configured correctly.

The Time folder in Azure Data Lake Gen 2 contains many files, not just time001.txt. When I change my external table to point at a folder, and not an individual file, the query returns zero rows, i.e.

CREATE EXTERNAL TABLE [ext].[Time]
(
    [TimeID] int NOT NULL,
    [HourNumber] tinyint NOT NULL,
    [MinuteNumber] tinyint NOT NULL,
    [SecondNumber] tinyint NOT NULL,
    [TimeInSecond] int NOT NULL,
    [HourlyBucket] varchar(15) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
)
WITH
(
    LOCATION = '/Time/',
    DATA_SOURCE = ADLSDataSource,
    FILE_FORMAT = uncompressedcsv,
    REJECT_TYPE = value,
    REJECT_VALUE = 2147483647
);
SELECT * FROM [ext].[Time];

Zero rows returned

I tried:

But always zero rows. I also followed the instructions at https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-load-from-azure-data-lake-store

I tested all the files within the folder and individually each returns many rows of data.

I queried all the files from Blob storage and not ADLS Gen2 and the "Folder" query returns all rows as expected.

How do I query all files in a folder "as one" from Azure Data Lake Gen2 storage using Azure SQL Data Warehouse and Polybase?

Upvotes: 2

Views: 451

Answers (1)

Renan Carvalho
Renan Carvalho

Reputation: 11

I was facing the exactly same issue: the problem was on the Data Source protocol.

Script with error:

   CREATE EXTERNAL DATA SOURCE datasourcename
   WITH (
       TYPE = HADOOP,
       LOCATION = 'abfss://[email protected]',
       CREDENTIAL = credential_name

Script that solves issue:

   CREATE EXTERNAL DATA SOURCE datasourcename
   WITH (
       TYPE = HADOOP,
       LOCATION = 'abfss://[email protected]',
       CREDENTIAL = credential_name

The only change needed was the LOCATION.

Thanks to the Microsoft Support Team for helping me on this.

Upvotes: 1

Related Questions