ajcoder
ajcoder

Reputation: 237

Historical data migration from Teradata to Snowflake

What are the steps to be taken to migrate historical data load from Teradata to Snowflake? Imagine there is 200TB+ of historical data combined from all tables.

I am thinking of two approaches. But I don't have enough expertise and experience on how to execute them. So looking for someone to fill in the gaps and throw some suggestions

Approach 1- Using TPT/FEXP scripts

  1. I know that TPT/FEXP scripts can be written to generate files for a table. How can I create a single script that can generate files for all the tables in the database. (Because imagine creating 500 odd scripts for all the tables is impractical).
  2. Once you have this script ready, how is this executed in real-time? Do we create a shell script and schedule it through some Enterprise scheduler like Autosys/Tidal?
  3. Once these files are generated , how do you split them in Linux machine if each file is huge in size (because the recommended size is between 100-250MB for data loading in Snowflake)
  4. How to move these files to Azure Data Lake?
  5. Use COPY INTO / Snowpipe to load into Snowflake Tables.

Approach 2

  1. Using ADF copy activity to extract data from Teradata and create files in ADLS.
  2. Use COPY INTO/ Snowpipe to load into Snowflake Tables.

Which of these two is the best suggested approach ? In general, what are the challenges faced in each of these approaches.

Upvotes: 1

Views: 988

Answers (1)

Michael Golos
Michael Golos

Reputation: 2069

Using ADF will be a much better solution. This also allows you to design DataLake as part of your solution. You can design a generic solution that will import all the tables provided in the configuration. For this you can choose the recommended file format (parquet) and the size of these files and parallel loading.

The challenges you will encounter are probably a poorly working ADF connector to Snowflake, here you will find my recommendations on how to bypass the connector problem and how to use DataLake Gen2: Trouble loading data into Snowflake using Azure Data Factory

More about the recommendation on how to build Azure Data Lake Storage Gen2 structures can be found here: Best practices for using Azure Data Lake Storage Gen2

Upvotes: 0

Related Questions