Pankaj
Pankaj

Reputation: 2746

Snowflake table stages

Can someone please help to understand what does following means as per snowflake documentation - "Table stages do not support transforming data while loading it (i.e. using a query as the source for the COPY command)." - documentation link :- https://docs.snowflake.com/en/user-guide/data-load-local-file-system-create-stage.html. The COPY command support following transformations and they seem to be working well for table level stages too, so not sure what does it means as per documentation?

  1. Column reordering
  2. column omission
  3. casts using a SELECT statement

I tried all three and they worked fine, so could not understand what does that statement means as per documentation : Case-1 (CAST)

snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>copy into test_tab from (select s.$1,s.$2,s.$3,s.$4,s.$5,reverse(s.$6::String) from @%test_tab s) file_format=(skip_header=1);
+------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
| file             | status | rows_parsed | rows_loaded | error_limit | errors_seen | first_error | first_error_line | first_error_character | first_error_column_name |
|------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------|
| MOCK_DATA.csv.gz | LOADED |        1000 |        1000 |           1 |           0 | NULL        |             NULL |                  NULL | NULL                    |
+------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
1 Row(s) produced. Time Elapsed: 0.856s

Case-2 [Column re-ordering] Also, tried column re-ordering [re-ordered column number 2 and 3] and that worked fine as well -

snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>copy into test_tab from (select cast(s.$1 as String),s.$3,s.$2,s.$4,s.$5,reverse(s.$6::String) from @%test_tab s) file_format=(skip_header=1);
    +------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
    | file             | status | rows_parsed | rows_loaded | error_limit | errors_seen | first_error | first_error_line | first_error_character | first_error_column_name |
    |------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------|
    | MOCK_DATA.csv.gz | LOADED |        1000 |        1000 |           1 |           0 | NULL        |             NULL |                  NULL | NULL                    |
    +------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
    1 Row(s) produced. Time Elapsed: 0.947s

Case-3 [column omission, omitted column IP_address] -

snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>copy into test_tab(id,first_name,last_name,email,gender) from (select cast(s.$1 as String),s.$3,s.$2,s.$4,s.$5 from @%test_tab s) file_format=(skip_header=1);
    +------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
    | file             | status | rows_parsed | rows_loaded | error_limit | errors_seen | first_error | first_error_line | first_error_character | first_error_column_name |
    |------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------|
    | MOCK_DATA.csv.gz | LOADED |        1000 |        1000 |           1 |           0 | NULL        |             NULL |                  NULL | NULL                    |
    +------------------+--------+-------------+-------------+-------------+-------------+-------------+------------------+-----------------------+-------------------------+
    1 Row(s) produced. Time Elapsed: 0.831s
    
    snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>select * from test_tab limit 1;
    +----+------------+-----------+-------------------+--------+------------+
    | ID | FIRST_NAME | LAST_NAME | EMAIL             | GENDER | IP_ADDRESS |
    |----+------------+-----------+-------------------+--------+------------|
    |  1 | Menicomb   | Hedwiga   | [email protected] | Male   | NULL       |
    +----+------------+-----------+-------------------+--------+------------+
    1 Row(s) produced. Time Elapsed: 0.293s

Copying into table stage - UNLOADING: Case-1) Normal copy to table level stage - worked fine.

snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>create table test_tab_copy as 
select * from TEST_TAB where 1=2;
+-------------------------------------------+
| status                                    |
|-------------------------------------------|
| Table TEST_TAB_COPY successfully created. |
+-------------------------------------------+
1 Row(s) produced. Time Elapsed: 1.336s
snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>select * from TEST_TAB_COPY;
+----+------------+-----------+-------+--------+------------+
| ID | FIRST_NAME | LAST_NAME | EMAIL | GENDER | IP_ADDRESS |
|----+------------+-----------+-------+--------+------------|
+----+------------+-----------+-------+--------+------------+
0 Row(s) produced. Time Elapsed: 0.185s
snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>copy into @%test_tab_copy from (select * from test_tab);
+---------------+-------------+--------------+
| rows_unloaded | input_bytes | output_bytes |
|---------------+-------------+--------------|
|          1000 |       52413 |        21265 |
+---------------+-------------+--------------+
1 Row(s) produced. Time Elapsed: 1.085s
snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>list @%test_tab_copy;
+-------------------+-------+----------------------------------+-------------------------------+
| name              |  size | md5                              | last_modified                 |
|-------------------+-------+----------------------------------+-------------------------------|
| data_0_0_0.csv.gz | 21280 | 71a630eef7eaba5d5f84e1afe39db66a | Tue, 15 Mar 2022 17:23:10 GMT |
+-------------------+-------+----------------------------------+-------------------------------+
1 Row(s) produced. Time Elapsed: 0.284s

Case-2) Copy to table level stage with columns re-ordered (first_name,last_name) - worked fine.

snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>create table test_tab_copy as select * from TEST_TAB where 1=2;
+-------------------------------------------+
| status                                    |
|-------------------------------------------|
| Table TEST_TAB_COPY successfully created. |
+-------------------------------------------+
1 Row(s) produced. Time Elapsed: 0.622s
snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>copy into @%test_tab_copy from (select id,last_name,first_name,email,gender,ip_add
                                     ress from test_tab);
+---------------+-------------+--------------+
| rows_unloaded | input_bytes | output_bytes |
|---------------+-------------+--------------|
|          1000 |       52413 |        21221 |
+---------------+-------------+--------------+
1 Row(s) produced. Time Elapsed: 0.947s

Case-3) Copy to table level stage with cast (string) - worked fine.

snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>create table test_tab_copy as select * from TEST_TAB where 1=2;
+-------------------------------------------+
| status                                    |
|-------------------------------------------|
| Table TEST_TAB_COPY successfully created. |
+-------------------------------------------+
1 Row(s) produced. Time Elapsed: 0.638s
snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>copy into @%test_tab_copy from (select id,last_name,first_name,email,gender,ip_add
                                     ress::String from test_tab);
+---------------+-------------+--------------+
| rows_unloaded | input_bytes | output_bytes |
|---------------+-------------+--------------|
|          1000 |       52413 |        21221 |
+---------------+-------------+--------------+
1 Row(s) produced. Time Elapsed: 0.384s

Case-4) Copy to table level stage with column omitted (omitted column IP_ADDRESS) - worked fine.

snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>create table test_tab_copy as select * from TEST_TAB where 1=2;
+-------------------------------------------+
| status                                    |
|-------------------------------------------|
| Table TEST_TAB_COPY successfully created. |
+-------------------------------------------+
1 Row(s) produced. Time Elapsed: 0.610s
snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>select * from test_tab_copy;
+----+------------+-----------+-------+--------+------------+
| ID | FIRST_NAME | LAST_NAME | EMAIL | GENDER | IP_ADDRESS |
|----+------------+-----------+-------+--------+------------|
+----+------------+-----------+-------+--------+------------+
0 Row(s) produced. Time Elapsed: 0.179s
snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>copy into @%test_tab_copy from (select id,last_name,first_name,email,gender from t
                                     est_tab);
+---------------+-------------+--------------+
| rows_unloaded | input_bytes | output_bytes |
|---------------+-------------+--------------|
|          1000 |       48413 |        20997 |
+---------------+-------------+--------------+
1 Row(s) produced. Time Elapsed: 0.527s
snowflake1#COMPUTE_WH@TEST_DB.PUBLIC>

Upvotes: 2

Views: 823

Answers (1)

Clark Perucho
Clark Perucho

Reputation: 466

Casting, column reordering and omission are all supported transformations in COPY command.

Some of the unsupported transformations are:

  • Filtering (WHERE clause, LIMIT, FETCH, TOP)
  • FLATTEN Function

Here's the documentation for more details: https://docs.snowflake.com/en/user-guide/data-load-transform.html

Upvotes: 1

Related Questions