Shahzad Badar
Shahzad Badar

Reputation: 83

Create External Table in Azure SQL Data warehouse to a wild card based file or folder path

I know we can create an External table in Azure SQL Data warehouse pointing to a LOCATION that is either a file path or a folder path. Can this file or folder path be based on a wild card pattern instead of an explicit path.

Here my file path is a location in Azure Data Lake Store.

-- Syntax for SQL Server

-- Create a new external table  
CREATE EXTERNAL TABLE [ database_name . [ schema_name ] . | schema_name. ] table_name   
    ( <column_definition> [ ,...n ] )  
    WITH (   
        **LOCATION = 'folder_or_filepath'**,  
        DATA_SOURCE = external_data_source_name,  
        FILE_FORMAT = external_file_format_name  
        [ , <reject_options> [ ,...n ] ]  
    )  
[;]  

Upvotes: 3

Views: 3555

Answers (1)

wBob
wBob

Reputation: 14389

Polybase / External Tables do not support wildcards at this time. Simply have one folder per external table you require. If you feel this is an important missing feature you can create a request and vote for it here:

https://feedback.azure.com/forums/307516-sql-data-warehouse

Bear in mind Polybase (in Azure SQL Data Warehouse) can now read files either in blob storage or in Azure Data Lake Storage (ADLS). Therefore as another workaround, Azure Data Lake Analytics (ADLA) and U-SQL support Polybase, so you could use U-SQL to move the files you want from blob store into your lake, eg

// Move data from blob store to data lake
// add filename and structure as one file
DECLARE @inputFilepath string = "wasb://[email protected]/someFilter/{filepath}.csv";
DECLARE @outputFilepath string = "output/special folder/output.csv";


@input =
    EXTRACT 
            ...     // your column list
            filepath string
    FROM @inputFilepath
    USING Extractors.Csv()


@input =
    SELECT * FROM @input
    WHERE filename.Contains("yourFilter");


// Export as csv
OUTPUT @input
TO @outputFilepath
USING Outputters.Csv(quoting:false);


// Now the data is in Data Lake which Polybase can also use as a source

Upvotes: 3

Related Questions