imanibew
imanibew

Reputation: 21

What is the use of encryption option for snowflake external stages?

In short:

Snowflake provides encryption option when we are creating an external stage. Below are the options (from https://docs.snowflake.com/en/sql-reference/sql/create-stage.html)

[ ENCRYPTION = ( [ TYPE = 'AWS_CSE' ] [ MASTER_KEY = '<string>' ] |
                   [ TYPE = 'AWS_SSE_S3' ] |
                   [ TYPE = 'AWS_SSE_KMS' [ KMS_KEY_ID = '<string>' ] |
                   [ TYPE = NONE ] ) ]

While we know AWS_CSE is used for client-side encryption (where data in encrypted/decrypted by client using Customer Owned Master key respectively during put/get into/from snowflake external stages), what is the use of AWS_SSE_S3 and AWS_SSE_KMS options?

In detail:

In our scenario, we have an S3 bucket (OUR_S3_BUCKET) with encryption set at bucket level as SSE-KMS, created an incoming directory and uploaded a file covid_data.csv.

S3://OUR_S3_BUCKET/incomig/covid_data.csv.

To access this file, we have created storage integration by referring to the S3 bucket and have created three external stage in Snowflake.

EXTERNAL STAGE 1 (without encryption):

CREATE OR REPLACE STAGE TEST_STG_NOENC
URL='S3://OUR_S3_BUCKET/incomig/'
STORAGE_INTEGRATION = INBOUND_S3;

EXTERNAL STAGE 2 (with AWS_SSE_S3):

CREATE OR REPLACE STAGE TEST_STG_SSE_S3
URL='S3://OUR_S3_BUCKET/incomig/'
STORAGE_INTEGRATION = INBOUND_S3
ENCRYPTION = ( TYPE = 'AWS_SSE_S3');

EXTERNAL STAGE 3 (with AWS_SSE_MKS):

CREATE OR REPLACE STAGE TEST_STG_SSE_KMS
URL='S3://OUR_S3_BUCKET/incomig/'
STORAGE_INTEGRATION = INBOUND_S3
ENCRYPTION = ( TYPE = 'AWS_SSE_KMS'  KMS_KEY_ID = 'arn:aws:kms:region:account_no:key/KMS_KEY_ID');

We are able to access the covid_data.csv data by selecting from all three external stages.

select t.$1, t.$2, t.$3
    from @<<All 3 external stages>> (file_format => OUR_CSV_FILE_FORMAT ) t;

Even though our S3 bucket is encrypted using SSE_KMS, we are able to access the files using a stage (TEST_STG_NOENC) without encryption option.

In these scenario, what is the use of AWS_SSE_S3 and AWS_SSE_KMS encryption options and how it helps?

Upvotes: 2

Views: 1223

Answers (1)

Thundzz
Thundzz

Reputation: 695

We had an interesting discussion with snowflake support regarding this topic, so we thought we might aswell share what we found here:

As of right now (August 2022):

  • The ENCRYPTION setting in the stage configuration does not affect reading from the stage towards snowflake. If the policies are properly configured on the AWS side, reading from snowflake should work regardless of what settings are configured on the snowflake stage. For a bucket encrypted with a KMS key, this basically means the role used by snowflake needs to have the rights to access the bucket, its objects, as well as the KMS keys that have been used to encrypt the objects snowflake will need to read (Please note that each object on a bucket can be encrypted differently)

  • The ENCRYPTION setting in the stage configuration is used when snowflake writes to the S3-backed stage. Depending on the configuration of the bucket and that of the stage, different outcomes are possible, but basically it goes the following way: the configuration of the stage seems to win and override the default S3 configuration. If the stage does not specify anything regarding encryption, the default S3 configuration will be used.

Please note that for the "writing" part, the best would be to test your own use-case in order to be certain of the outcome before implementing the solution for production.

Currently, the snowflake docs pertaining to these parameters is not very clear about what happens when different configurations mismatch. We will try to ask them to update it so that behavior can be predicted just by reading and understanding documentation.

Upvotes: 1

Related Questions