Reputation: 1655
I am new to the Snowflake data warehouse and want to load data using Snowpipe with auto-ingest from an external stage (an S3 bucket). The bucket contains multiple folders (one per database), and inside each folder there are multiple files, one per table. The files in the different folders are updated at different times, and I'm currently using auto-ingest to go from S3 to SQS to Snowpipe.
My example S3 bucket is set up as follows:

myS3bucket.s3.amazonaws.com
  /db1
    account.csv
    product.csv
  /db2
    invoice.csv
    employees.csv
My current understanding is that I would create one pipe per table: account.csv relates to one pipe that populates the account table, product.csv relates to a different pipe that populates the product table, and so on.
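For example, something like this sketch (all names are hypothetical; my_s3_int stands for a storage integration I'd set up separately):

    -- One stage per folder, one pipe per file/table (hypothetical names).
    CREATE STAGE db1_stage
      URL = 's3://myS3bucket/db1/'
      STORAGE_INTEGRATION = my_s3_int;

    -- Each pipe copies into its own table; PATTERN restricts which
    -- files in the folder the pipe actually loads.
    CREATE PIPE account_pipe AUTO_INGEST = TRUE AS
      COPY INTO account FROM @db1_stage
      PATTERN = '.*account[.]csv'
      FILE_FORMAT = (TYPE = 'CSV');

    CREATE PIPE product_pipe AUTO_INGEST = TRUE AS
      COPY INTO product FROM @db1_stage
      PATTERN = '.*product[.]csv'
      FILE_FORMAT = (TYPE = 'CSV');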
At the moment it looks like all the pipes will be triggered if I add a new file anywhere in the bucket, since the notification is set on the bucket.
My questions are:

1. Is one pipe per table/file the right approach here?
2. How do I make sure a new file in one folder only triggers the pipe(s) for that folder, rather than every pipe on the bucket?

Thanks for any help
Upvotes: 0
Views: 1345
Reputation: 46
When using auto-ingest, Snowflake creates one SQS queue per Snowflake account. All notifications from S3 go to that same queue and are then passed on to each pipe based on the pipe's stage definition (bucket and path). If the stages have overlapping paths, the notification can be passed to multiple pipes.
See the "Important" section under "Create a Pipe with Auto-Ingest Enabled" in the "Automating Snowpipe for Amazon S3" documentation.
Upvotes: 1
Reputation: 2880
If your stages include the folder path, then when S3 sends a notification that a file has been added, Snowflake works out which stages are affected, so only the pipes pointing at that folder will pull in the file.
If you have three pipes with three stages, and those stages point at these folders:

pipe 1: blahblah
pipe 2: blahblah/db1
pipe 3: blahblah/db2

then a file created in blahblah/db1 will trigger pipes 1 and 2, but not pipe 3.
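A minimal sketch of that layout, assuming a bucket named blahblah and hypothetical table and integration names:

    -- All names here are hypothetical; my_s3_int is an assumed
    -- storage integration with access to the bucket.
    CREATE STAGE stage_all URL = 's3://blahblah/'     STORAGE_INTEGRATION = my_s3_int;
    CREATE STAGE stage_db1 URL = 's3://blahblah/db1/' STORAGE_INTEGRATION = my_s3_int;
    CREATE STAGE stage_db2 URL = 's3://blahblah/db2/' STORAGE_INTEGRATION = my_s3_int;

    CREATE PIPE pipe1 AUTO_INGEST = TRUE AS
      COPY INTO table1 FROM @stage_all FILE_FORMAT = (TYPE = 'CSV');
    CREATE PIPE pipe2 AUTO_INGEST = TRUE AS
      COPY INTO table2 FROM @stage_db1 FILE_FORMAT = (TYPE = 'CSV');
    CREATE PIPE pipe3 AUTO_INGEST = TRUE AS
      COPY INTO table3 FROM @stage_db2 FILE_FORMAT = (TYPE = 'CSV');

    -- A new object at s3://blahblah/db1/foo.csv sits under the paths of
    -- stage_all and stage_db1, so pipe1 and pipe2 fire; pipe3 does not.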
Upvotes: 1