SamSnow
SamSnow

Reputation: 1

How to Model and Load Type 2 Dimension Table in Snowflake Cloud Database

Would like to know how can I create and load Type 2 (Slowly Changing) dimension table in snowflake cloud database. Most of the cases, the loading of this table involves updates and inserts of records up to 1-50 per day. If we follow the traditional approach then will the snowflake storage will grow significantly in shot time as every small change it will create a separate storage per load.

Upvotes: 0

Views: 693

Answers (2)

Chris
Chris

Reputation: 690

You could bring your data into a permanent table with INSERT/UPDATE/DELETE/MERGE statements. This table would represent the "current state" of your dimension. On this table you could create a STREAM object which tracks the changes, then setup a stored procedure to gather the data from the stream and write those changes to an SCD2 table. The procedure can be invoked by a task running on a schedule or invoked from another task that successfully updates the "current state" table. Finally, the SCD2 table could be transient to avoid added costs of fail-safe and added time-travel (as Mike mentioned in his answer).

Flow:
Data into "current state" table --> Task running proc reads from stream --> Updates SCD2

Documentation:
https://docs.snowflake.net/manuals/user-guide/streams.html

Storage is cheap, but it is good to be cost aware.

Upvotes: 1

Mike Walton
Mike Walton

Reputation: 7339

On inserts and updates, Snowflake creates new micropartitions, but based on what your time-travel settings for the table, then the stale micropartitions will be dropped. So, I wouldn't worry too much about it (space is cheap, too). If you are concerned with the time-travel or fail-safe creating an extra cost, you can use a Transient table, instead of permanent tables. These will not have fail-safe costs.

https://docs.snowflake.net/manuals/user-guide/data-time-travel.html

https://docs.snowflake.net/manuals/user-guide/tables-temp-transient.html#comparison-of-table-types

Hope that helps. Otherwise, there isn't any difference on how you should handle Type 2 dimensions.

Upvotes: 1

Related Questions