nareto
nareto

Reputation: 185

Snowflake periodic offload to S3 stage

I would like to set up a task on Snowflake that runs every Monday shortly after 12 AM and offloads all the timestamped data from the previous week in an S3 stage, in a subdirectory that has the date range in the name (e.g. s3://my_bucket/data/2022-07-04---2022-07-11/)

I created the stage

  CREATE OR REPLACE STAGE s3_stage
  URL='s3://my_bucket/'
  CREDENTIALS=(AWS_KEY_ID='...' AWS_SECRET_KEY='...')
  FILE_FORMAT = (TYPE ='PARQUET');

and this worked as expected:

COPY INTO @s3_stage/sample (SELECT * FROM MY_TABLE LIMIT 100);   

To get the full data from last week I came up with the following query, using CTEs

WITH prev_monday AS (
    SELECT timestamp_from_parts(year(current_date()), month(current_date()), day(current_date()) - EXTRACT ('dayofweek_iso',current_date()) + 1 -7,0,0,0) as date
),
last_monday AS (
    SELECT timestamp_from_parts(year(current_date()), month(current_date()), day(current_date()) - EXTRACT ('dayofweek_iso',current_date()) + 1,0,0,0) as date
)
SELECT  * FROM 
MY_TABLE as t,
prev_monday as pm,
last_monday as lm 
where t.timestamp >= pm.date and t.timestamp < lm.date;

This also works as expected.

I also can generate the requested s3 path using again the same CTEs and concat:

WITH prev_monday AS (
    SELECT timestamp_from_parts(year(current_date()), month(current_date()), day(current_date()) - EXTRACT ('dayofweek_iso',current_date()) + 1 -7,0,0,0) as date
),
last_monday AS (
    SELECT timestamp_from_parts(year(current_date()), month(current_date()), day(current_date()) - EXTRACT ('dayofweek_iso',current_date()) + 1,0,0,0) as date
)
select concat('@s3_stage/data', pm.date::DATE, '---', lm.date::DATE,'/')
FROM
prev_monday as pm,
last_monday as lm ;

I'm having however trouble unifying everything, because of two problems:

  1. Apparently WITH prev_monday AS .... COPY INTO .... is not valid syntax
  2. The s3 path, argument to COPY INTO, is not specified as a string (i.e. there are no single quotes). So I can't use concat to generate this

Any ideas or other approaches I could try?

Upvotes: 0

Views: 134

Answers (1)

Gokhan Atil
Gokhan Atil

Reputation: 10059

As Nick said, it's possible to put your all logic (and SQLs) in a stored procedure and call them. Here is a doc showing how you can call multiple SQLs in a JS procedure:

https://community.snowflake.com/s/article/Executing-Multiple-SQL-Statements-in-a-Stored-Procedure

This one also could be useful to start writing your stored procedure:

Create stage to copy data from S3 to Snowflake from within Snowflake Scripting block?

Upvotes: 1

Related Questions