Reputation: 53
For ingesting data from an external storage location into Snowflake when de-duping is necessary, I came across two ways:
Option 1:
Option 2:
I believe if the merge statement wasn't necessary to enforce primary key and remove duplicates, Snowpipe was the clear winner because it copies changed data directly into a table in one step. However, since staging and merging the data is necessary, which option is better?
Thank you!
Upvotes: 2
Views: 1179
Reputation: 1
We're using the external table approach. Our costs are significantly higher than anticipated though this may be because we chose to use serverless tasks given the volume of tables we're refreshing from ADLS.
Our snowflake rep suggested going the snowpipe route. The answer is always test and see what works best for you.
One thought is that you avoid the merge task entirely. Insert all data and build a view/dynamic table to show the latest version of each record.
Upvotes: 0