user3165854
user3165854

Reputation: 1655

Snowpipe auto ingest configuration looks like it triggers all pipes

I am new to Snowflake data warehouse and want to load data using snowpipe and auto ingest from an external stage (S3 bucket) which contains multiple folders (one for each database) and inside those folders there are multiple files with 1 for each table. The files in the different folders will be updated at different times and I'm currently using auto-ingest to go S3 to SQS to Snowpipe.

My example S3 bucket is setup as follows:

myS3bucket.s3amazonaws.com
    /db1 
       account.csv
       product.csv
    /db2
        invoice.csv
        employees.csv

My current understanding is that I would create one pipe per table so account.csv will relate to 1 pipe which will populate the the account table and then product.csv will relate to a different pipe to populate the product table etc.

At the moment it looks like all the pipes will be triggered if I add a new file to the bucket as the notification is on the bucket.

My questions are:

Thanks for any help

Upvotes: 0

Views: 1345

Answers (2)

DanielNilsson
DanielNilsson

Reputation: 46

When using Auto-Ingest Snowflake creates an SQS per snowflake account, all notifications from S3 goes to that same SQS, and is then passed on to each pipe based on each pipe's stage definition (bucket and path). If the stages have overlaps in the paths, SQS will potentially pass the notification to multiple pipes.

See the "Important" section under "Create a Pipe with Auto-Ingest Enabled" in the "Automating Snowpipe for Amazon S3" documentation.

https://docs.snowflake.com/en/user-guide/data-load-snowpipe-auto-s3.html#step-2-create-a-pipe-with-auto-ingest-enabled

Upvotes: 1

David Garrison
David Garrison

Reputation: 2880

If your stages include the file path then when S3 sends a message that a file has been added, Snowflake will work out which stages are affected, so only the pipes that are pointed at that folder will pull in the file.

if you have three pipes, with three stages, and those stages point at these folders:

  • 1: blahblah
  • 2: blahblah/db1
  • 3: blahblah/db2

then when a file is created in blahblah/db1 it will trigger the pipes 1 and 2, but not 3.

Upvotes: 1

Related Questions