Reputation: 77
Could you help me load some Parquet files into Snowflake?
I've got about 250 Parquet files stored in an AWS stage.
250 files = 250 different tables.
I'd like to load them into Snowflake tables dynamically.
Could anyone tell me how to do that? Is there an efficient way to do it (via the Snowflake GUI, for example)? I can't find one.
Thanks.
Upvotes: 0
Views: 5238
Reputation: 131
This process is automatable provided you can use SnowSQL, ideally together with bash or PowerShell scripting.
At a pinch you can generate the SQL in Excel, using one row per file and one column per SQL command; in the last column, concatenate all the previous SQL-command columns together with a newline character.
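As a bash alternative to the Excel approach, the per-file SQL can be generated with a small loop. This is only a sketch: `my_stage`, `my_parquet_format`, and the sample filenames are placeholders, and it assumes each table name should be the filename up to the first dot.

```shell
#!/usr/bin/env bash
# Sketch: emit one CREATE TABLE + COPY INTO pair per file, assuming
# stage @my_stage and file format my_parquet_format already exist.
set -euo pipefail

gen_load_sql() {
  local f="$1"
  local tbl="${f%%.*}"   # table name = filename up to the first dot
  cat <<SQL
CREATE TABLE ${tbl}
  USING TEMPLATE (
    SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
    FROM TABLE(
      INFER_SCHEMA(
        LOCATION=>'@my_stage/${f}',
        FILE_FORMAT=>'my_parquet_format')));
COPY INTO ${tbl} FROM @my_stage/${f}
  FILE_FORMAT = (FORMAT_NAME = 'my_parquet_format')
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
SQL
}

# Example filenames only; in practice, feed in the real list of 250 files.
for f in customers.snappy.parquet orders.snappy.parquet; do
  gen_load_sql "$f"
done > load_all.sql
```

The resulting `load_all.sql` can then be run in one go, e.g. with `snowsql -f load_all.sql`.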
The steps are as follows:
Obtain each filename.
Option A
Option B
I prefer Option A as it's cleaner, and I have found INFER_SCHEMA tricky with file paths.
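For the "obtain each filename" step, the output of Snowflake's `LIST @my_stage` command (run through SnowSQL) can be parsed down to bare filenames. The pipe-separated row layout below is an assumption about the client's table output, so adjust the parsing to whatever your client actually prints.

```shell
#!/usr/bin/env bash
# Sketch: extract bare .parquet filenames from LIST @my_stage output.
# The sample rows mimic (as an assumption) pipe-separated snowsql table
# output; real formatting depends on the client's output options.
set -euo pipefail

parse_names() {
  awk -F'|' '/\.parquet/ {
    gsub(/ /, "", $1)           # strip padding around the path column
    n = split($1, parts, "/")   # keep only the last path component
    print parts[n]
  }'
}

# In practice: snowsql -q 'LIST @my_stage' | parse_names
sample='s3://mybucket/data/customers.snappy.parquet | 1024 | ...
s3://mybucket/data/orders.snappy.parquet | 2048 | ...'
printf '%s\n' "$sample" | parse_names > filenames.txt
```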
Incidentally, the syntax for CREATE TABLE using INFER_SCHEMA on a Parquet file isn't listed among the documentation's examples.
This is the correct syntax:
CREATE STAGE my_stage;
CREATE OR REPLACE FILE FORMAT my_parquet_format TYPE = parquet;
PUT 'file:///<filepath>/<filename or wildcard>.snappy.parquet' @my_stage;
CREATE TABLE mytable
USING TEMPLATE (
SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
FROM TABLE(
INFER_SCHEMA(
LOCATION=>'@my_stage',
FILE_FORMAT=>'my_parquet_format'
)
));
COPY INTO mytable
FROM @my_stage
FILE_FORMAT = (FORMAT_NAME = 'my_parquet_format')
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
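Putting the pieces together, a driver loop can derive a table name from each filename and hand the SQL to SnowSQL one file at a time. This is a sketch under assumptions: a configured SnowSQL connection, the stage and file format above, and a table-per-filename naming rule; the `snowsql` call is guarded so the script still runs (printing the SQL) where SnowSQL isn't installed.

```shell
#!/usr/bin/env bash
# Sketch: run the create+load pair once per file via snowsql -q.
# Falls back to printing the SQL when snowsql is not installed.
set -euo pipefail

table_name_for() {               # customers.snappy.parquet -> customers
  printf '%s' "${1%%.*}"
}

run_sql() {
  if command -v snowsql >/dev/null 2>&1; then
    snowsql -q "$1"
  else
    printf 'would run via snowsql:\n%s\n' "$1"
  fi
}

# Example filenames; in practice, loop over the real list of 250 files.
for f in customers.snappy.parquet orders.snappy.parquet; do
  tbl="$(table_name_for "$f")"
  run_sql "CREATE TABLE ${tbl} USING TEMPLATE (
    SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
    FROM TABLE(INFER_SCHEMA(
      LOCATION=>'@my_stage/${f}',
      FILE_FORMAT=>'my_parquet_format')));
  COPY INTO ${tbl} FROM @my_stage/${f}
    FILE_FORMAT = (FORMAT_NAME = 'my_parquet_format')
    MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;"
done
```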
Upvotes: 1
Reputation: 524
If the files share the same schema, you can put them in a single stage and use the INFER_SCHEMA function, which will give you the schema of the Parquet files. https://docs.snowflake.com/en/sql-reference/functions/infer_schema.html
If the files all have different schemas, then I'm afraid you'll have to infer the schema for each file individually.
Upvotes: 1