Reputation: 237
The concept of Internal Stage is misleading or I am interpreting this incorrectly. Please correct my understanding. According to the documentation
Upvotes: 6
Views: 2477
Reputation: 6229
An external stage is managed by you - the customer - and you can arrange files / secure the files in them however you like. Then, when you want to load data from an external stage into Snowflake, you just reference those external stages.
An internal stage is managed by Snowflake: you can PUT files into it, and everything else about it is handled by Snowflake. The storage behind Snowflake internal stages is abstracted away from you. When I say PUT, this is a command that you can run from the Snowflake CLI (SnowSQL) that takes a local file and uploads it into an internal stage.
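To make the difference concrete, here is a minimal Python sketch. It only builds the SQL statements each path would need and hands them to any cursor-like object with an execute() method (a cursor from the Snowflake Python connector would be the real thing; the recording cursor below is just a stand-in). The stage, table, bucket and storage integration names are made up for illustration, and the CSV file format is an assumption.

```python
from typing import List


def external_stage_load(table: str, stage: str, url: str, integration: str) -> List[str]:
    """Statements for loading from storage that you, the customer, manage."""
    return [
        # The stage is only a pointer to your bucket; nothing is uploaded.
        f"CREATE STAGE IF NOT EXISTS {stage} URL = '{url}' "
        f"STORAGE_INTEGRATION = {integration}",
        f"COPY INTO {table} FROM @{stage} FILE_FORMAT = (TYPE = CSV)",
    ]


def internal_stage_load(table: str, stage: str, local_path: str) -> List[str]:
    """Statements for loading through storage that Snowflake manages."""
    return [
        # No URL: Snowflake provides the storage behind this stage.
        f"CREATE STAGE IF NOT EXISTS {stage}",
        # PUT uploads the local file into the internal stage.
        f"PUT file://{local_path} @{stage}",
        f"COPY INTO {table} FROM @{stage} FILE_FORMAT = (TYPE = CSV)",
    ]


class RecordingCursor:
    """Stand-in for a real cursor: remembers statements instead of running them."""

    def __init__(self) -> None:
        self.statements: List[str] = []

    def execute(self, sql: str) -> None:
        self.statements.append(sql)


def run(cursor, statements: List[str]) -> None:
    """Send each statement to the cursor in order."""
    for sql in statements:
        cursor.execute(sql)


if __name__ == "__main__":
    cur = RecordingCursor()
    run(cur, internal_stage_load("orders", "my_int_stage", "/tmp/orders.csv"))
    for sql in cur.statements:
        print(sql)
```

The external path never uploads anything, it just points Snowflake at storage you already manage. The internal path needs the PUT step first.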
As to why internal stages exist, I suppose something along the lines of:
For flexibility, you can use Snowflake's internal blob storage (whatever that may be) or you can use your own storage to stage your data.
You can use Snowflake and load data into tables quickly without having blob storage of your own.
It makes it easier for non-administrator users. End users of Snowflake can load data into their own tables without having to know how to use S3, Azure Blob Storage, GCS, etc. Each user gets their own little internal stage area at @~, like a home directory. Also, each table gets its own internal stage that you can PUT into.
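As a rough illustration of how those internal stages are addressed, the sketch below maps each kind of internal stage to its reference syntax (@~ for the user stage, @%table for a table stage, @name for a named stage) and builds a PUT statement from it. The helper names, the table name and the file path are hypothetical.

```python
def stage_ref(kind: str, name: str = "") -> str:
    """Return the SQL reference for one of the three kinds of internal stage."""
    if kind == "user":
        # Every user has one; nothing to create.
        return "@~"
    if kind in ("table", "named"):
        if not name:
            raise ValueError(f"a {kind} stage needs a name")
        # A table stage is the table name prefixed with %.
        return f"@%{name}" if kind == "table" else f"@{name}"
    raise ValueError(f"unknown stage kind: {kind!r}")


def put_statement(local_path: str, stage: str) -> str:
    """Build the PUT statement that uploads a local file into a stage."""
    return f"PUT file://{local_path} {stage}"


if __name__ == "__main__":
    for kind, name in [("user", ""), ("table", "orders"), ("named", "my_stage")]:
        print(put_statement("/tmp/orders.csv", stage_ref(kind, name)))
```

The user stage and the table stage exist automatically, so only a named stage has to be created before you can PUT into it.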
Upvotes: 4
Reputation: 453
Unique to Snowflake is the concept of a stage: it is the last place data sits before it is loaded into a target table.
All content hosted as files, externally or internally, must be copied into a Snowflake table (with the COPY command) to take advantage of Snowflake's proprietary micro-partition storage mechanism and features like zero-copy cloning. Alternatively, you could keep the files in an S3 bucket behind an external stage, register them with Snowflake as an external table, and still run SQL on them. The supported file formats for this are CSV, Parquet, Avro, ORC and JSON. Of course, you then don't get the benefits listed above.
Basically, everything is a file before it is loaded into Snowflake tables (which, by the way, also achieve better compression thanks to improved compression algorithms).
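The two options above (copy the staged files into a table, or leave them where they are and register an external table) could look roughly like this. It is only a sketch that builds the SQL text: the table and stage names are invented, and the format check simply mirrors the list of formats mentioned above.

```python
# File formats named in the answer above.
SUPPORTED_FORMATS = {"CSV", "PARQUET", "AVRO", "ORC", "JSON"}


def _check_format(file_type: str) -> str:
    """Normalise the format name and reject anything not in the list."""
    normalised = file_type.upper()
    if normalised not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported file format: {file_type!r}")
    return normalised


def copy_into_statement(table: str, stage: str, file_type: str = "CSV") -> str:
    """Option 1: load staged files into a regular Snowflake table."""
    fmt = _check_format(file_type)
    return f"COPY INTO {table} FROM @{stage} FILE_FORMAT = (TYPE = {fmt})"


def external_table_statement(table: str, stage: str, file_type: str = "CSV") -> str:
    """Option 2: leave the files in the external stage and query them in place."""
    fmt = _check_format(file_type)
    return (
        f"CREATE EXTERNAL TABLE {table} "
        f"WITH LOCATION = @{stage} FILE_FORMAT = (TYPE = {fmt})"
    )


if __name__ == "__main__":
    print(copy_into_statement("orders", "my_ext_stage"))
    print(external_table_statement("orders_ext", "my_ext_stage", "parquet"))
```

Only the first option moves the data into Snowflake's own storage. The second keeps it in your bucket and just makes it queryable.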
For your reading: https://docs.snowflake.com/en/user-guide/data-load-overview.html
Upvotes: 3
Reputation: 7339
Internal stage is the storage that Snowflake provides and bills back to you. External stage is a reference to storage that is owned and paid for by the customer.
You are correct that this is still a public cloud resource, but internal stage is not accessible by anything other than Snowflake or the Snowflake connectors. Therefore, it is "internal".
Upvotes: 1