Reputation: 349
We are trying to take our data from an AWS S3 bucket (external stage) and load it into a Snowflake internal stage. Snowflake should act as our data lake, which would reduce the amount of storage we use in AWS. Is there any built-in functionality that can transfer data from an external stage to an internal stage?
The goal is to load the data into the internal Snowflake stage and subsequently delete the data from AWS. We want Snowflake to be the data lake.
Upvotes: 0
Views: 1743
Reputation: 59
You've got to stop thinking that a "data lake" means a bunch of raw data files stored in a cloud bucket somewhere. That's the 2010 version of a data lake. In Snowflake, you can load the raw data into tables that mirror those files (either structured column-by-column, or semi-structured: JSON, XML, Parquet, ...). Think of these tables as your "raw" zone. With Streams and Tasks, you can automate the curation of the data in the raw zone into a second set of tables - the "curated" zone. Another set of Streams/Tasks might go another step and pre-aggregate the curated data into an "aggregated" zone. The design of the workflows is up to you. The cloud storage just becomes a "landing area" for raw extracted data, and the files can be deleted after ingestion into Snowflake. You now have a single platform for your raw data, curated data, and aggregated data. Hook up a data governance tool like Alation or Collibra to maintain the lineage of the data through its journey.
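As a rough illustration of the raw-to-curated step, here is a minimal sketch using a stream and a task. All names (raw_events, curated_events, raw_events_stream, curate_events, my_wh) and the JSON fields id and ts are hypothetical, and it assumes raw_events has a single VARIANT column called payload; adapt it to your own schema.

```sql
-- Hypothetical "curated" zone table.
CREATE TABLE curated_events (
  event_id   STRING,
  event_time TIMESTAMP_NTZ
);

-- The stream tracks rows added to the (assumed) raw table since it was last consumed.
CREATE STREAM raw_events_stream ON TABLE raw_events;

-- The task runs every 5 minutes, but only does work when the stream has new rows.
CREATE TASK curate_events
  WAREHOUSE = my_wh
  SCHEDULE = '5 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('RAW_EVENTS_STREAM')
AS
  INSERT INTO curated_events (event_id, event_time)
  SELECT payload:id::STRING, payload:ts::TIMESTAMP_NTZ
  FROM raw_events_stream
  WHERE METADATA$ACTION = 'INSERT';

-- Tasks are created suspended, so resume it to start the schedule.
ALTER TASK curate_events RESUME;
```

A second stream and task on curated_events could feed an "aggregated" table in the same way.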
-Paul-
Upvotes: 0
Reputation: 1170
An internal stage would just be a different S3 bucket utilized by Snowflake. So it's not really "reducing" the amount of storage, just changing its location. If you still wanted to do this, note that GET only works on internal stages, so you would have to download the files from S3 with AWS tooling and then PUT them to the internal stage. Or you could just load from the external stage to your tables in Snowflake via any of the available methods.
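For the second option, a minimal sketch of a bulk load from the external stage might look like the following. The stage, table, bucket path, and storage integration names are hypothetical, and JSON is only an assumed file format.

```sql
-- Hypothetical external stage pointing at the existing S3 location.
CREATE STAGE my_s3_stage
  URL = 's3://my-bucket/path/'
  STORAGE_INTEGRATION = my_s3_integration
  FILE_FORMAT = (TYPE = JSON);

-- Hypothetical raw table: one VARIANT column holding each JSON document.
CREATE TABLE raw_events (payload VARIANT);

-- Load the staged files. PURGE = TRUE asks Snowflake to remove files from
-- the stage after a successful load, which needs delete permission on the bucket.
COPY INTO raw_events
  FROM @my_s3_stage
  FILE_FORMAT = (TYPE = JSON)
  PURGE = TRUE;
```

If you would rather delete the S3 files yourself after verifying the load, leave PURGE out (it defaults to FALSE).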
Upvotes: 1
Reputation: 3455
What do you mean by internal stage?
If you are planning to load into Snowflake tables, your scenario is a perfect use case for Snowpipe. For more info, see "Automating Snowpipe for Amazon S3".
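A minimal sketch of what that could look like, assuming an external stage named my_s3_stage and a table raw_events with a single VARIANT column already exist (both names, the pipe name, and the JSON format are hypothetical):

```sql
-- The pipe wraps a COPY statement; AUTO_INGEST = TRUE lets S3 event
-- notifications trigger the load as new files arrive.
CREATE PIPE raw_events_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO raw_events
    FROM @my_s3_stage
    FILE_FORMAT = (TYPE = JSON);

-- The notification_channel column in the output holds the SQS queue ARN
-- to use when configuring the S3 bucket's event notification.
SHOW PIPES;
```

The S3 event notification still has to be set up on the AWS side, as the linked guide describes.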
Upvotes: 1