m1nkeh

Reputation: 1397

Azure Data Warehouse Insert to Huge Table

The agreed pattern to insert new data into an already existing table in Azure Data Warehouse seems to be...

create table dbo.MyTable_New
with (distribution = round_robin)
as

select
    Col1
    ,Col2
from dbo.MyTable

union all

select
    Col1
    ,Col2
from dbo.MyNewStuff;
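(The CTAS above only builds dbo.MyTable_New; the pattern is normally completed by swapping names so readers see the new table. A sketch of that final step, assuming no other sessions hold references to the old table:

rename object dbo.MyTable to MyTable_Old;
rename object dbo.MyTable_New to MyTable;
drop table dbo.MyTable_Old;
)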

Now, what we are seeing is that on really large tables this degrades over time, and it's obvious why: you are re-reading everything you already have and re-inserting it. This seems sub-optimal to me.

Does anyone have any alternative approaches that they have seen work for them? I am thinking of things like partition switching, for example.

Upvotes: 1

Views: 766

Answers (1)

wBob

Reputation: 14379

Agreed with whom? CTAS can be an effective option, but it really depends on your data volumes, SLAs, business processes etc. If you are inserting small volumes into an already large table, for example, I'm not sure CTAS makes sense; why not INSERT? You can use partition switching in Azure SQL Data Warehouse (eg here). Other options include Azure Data Factory, SSIS (which now supports Azure SQL Data Warehouse), PolyBase, plain INSERT etc.
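To illustrate the partition-switching route: load only the new rows into a staging table, then switch the populated partition into the large table as a metadata-only operation. A sketch, assuming a hypothetical DateKey partition column, illustrative boundary values, and that the staging table's schema, distribution and partition scheme exactly match dbo.MyTable (and that the target partition is empty):

-- stage only the new rows, partitioned identically to the target table
create table dbo.MyNewStuff_Staged
with
(
    distribution = round_robin,
    partition (DateKey range right for values (20180101, 20180201))
)
as
select Col1, Col2, DateKey
from dbo.MyNewStuff;

-- switch the populated partition into the big table (metadata-only, no data movement)
alter table dbo.MyNewStuff_Staged switch partition 2 to dbo.MyTable partition 2;

This avoids re-reading the existing table entirely; the cost is that your loads have to line up with the partition boundaries.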

Please review the following article which offers a number of strategies for loading data into Azure SQL Data Warehouse:

https://blogs.msdn.microsoft.com/sqlcat/2017/05/17/azure-sql-data-warehouse-loading-patterns-and-strategies/

Upvotes: 2
