Stomik
Stomik

Reputation: 1

How do I get wildcard partition pruning in mssql 2022 , polybase and a s3-storage to work

I have problems with partition pruning when using wildcards in an OPENROWSET query. I have tried to use the pattern shown in this documentation. https://learn.microsoft.com/en-us/sql/relational-databases/polybase/polybase-performance?view=sql-server-ver16.

Executing an OPENROWSET query with wildcards in the bulk path and the arguments in the WHERE clause seems not to work.

I have an on-premises SQL Server 2022 (16.0.4120.1) with PolyBase enabled. The external source is a S3-compatible object storage (Minio) running on a Linux Red Hat server.

My version of the query. The s3 storage bucket is partitioned in source/year/month/date/file.txt

For the query below the path is: testbucket/files_txt/year=2024/month=05/date=06/*.txt.

SELECT 
    r.filepath() AS filepath,
    r.filepath(1) AS [year],
    r.filepath(2) AS [month],
    COUNT_BIG(*) AS [rows]
FROM OPENROWSET( 
        BULK '/testbucket/files_txt/year=*/month=*/date=*/*.txt',
        DATA_SOURCE = 's3_ds', 
        FORMAT = 'CSV' 
    ) 
WITH ( 
    col1 varchar(800) 
) AS [r] 
WHERE 
    r.filepath(1) COLLATE Latin1_General_BIN = '2024'
    AND r.filepath(2) COLLATE Latin1_General_BIN = '05'
    AND r.filepath(3) COLLATE Latin1_General_BIN = '06'
GROUP BY 
    r.filepath(),
    r.filepath(1),
    r.filepath(2)
ORDER BY filepath;

When execute the query as above. It takes several minutes to get back a result. However. When I run the query and replace the wildcards with the where year, month and date figures in the bulk path it brings back a result in seconds. SQL Server gave the message that adding collation would enable push down. But not sure if that is posible in this case. Howver. Adding of the collation did not seem to have any effect.

I know that I can create dynamic SQL to fix this. However I thought this would work better since there are several examples in the documentation. Or have I missed something?

Upvotes: 0

Views: 40

Answers (0)

Related Questions