Reputation: 43
I'm designing a Dataware house in Azure Synapse using SQL Pool, but I'm facing some design questions.
Context: My plan is to load Partitioned Parquet files using Azure Data Lake Storage (ADLS), then, with SQL pool create External Tables to query those files.
My questions are:
Thanks for your time. Josh
Upvotes: 0
Views: 1448
Reputation: 89091
Is it better in terms of performance to provide the solution just with the external tables?
No. Internal Tables are distributed columnstores, with multiple levels of caching, and typically out-perform external parquet tables. Internal tables additionally support batch-mode scanning, columnstore ordering, segment elimination, partition elimination, materialized views, and resultset caching.
Is it possible to perform partitioning in external tables?
This is not currently possible in Dedicated SQL Pools, see Folder Partition Elimination
How does affect the user concurrency to the external tables and the internal tables?
Concurrency is a matter of query performance. The faster your queries perform, the faster sessions give up their concurrency slot. So anything that improves query performance improves the effective concurrency (the number of concurrent users you can support with reasonable query runtime).
Serverless SQL Pools currently have more advanced capabilities for working with data stored as Parquet or Delta in the Data Lake.
Upvotes: 2