user13274071
user13274071

Reputation: 43

Parquet file load into snowflake table Issue

I am trying to load data from parquet file in AWS S3 into snowflake table. But getting the below error. Could you please help.

SQL compilation error: PARQUET file format can produce one and only one column of type variant or object or array. 
Use CSV file format if you want to load more than one column.

Parquet file schema

 |-- uuid: string (nullable = true)
 |-- event_timestamp: timestamp (nullable = true)
 |-- params: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- id: string (nullable = true)
 |    |    |-- name: string (nullable = true)
 |    |    |-- type: string (nullable = true)
 |    |    |-- value: string (nullable = true)

Here is the sample data. uuid,event_timestamp,params 3f230ea5-dd52-4cf9-bdde-b79201eb1001,2020-05-10 17:06:21.524,[{id=501, type=custom, name=filtering, value=true}, {id=502, type=custom, name=select, value=false}]

snowflake table

create or replace table temp_log (
      uuid string, 
      event_timestamp timestamp, 
      params array);

I am using the below copy command to load data

 copy into temp_log
 from '<<s3 path>>'
 pattern = '*.parquet'
 storage_integration = <<integration object>
 file_format = (
     type = parquet
     compression = snappy
 )
 ;

Upvotes: 2

Views: 7462

Answers (4)

Julian Eccleshall
Julian Eccleshall

Reputation: 554

I use these 2 sql to load the data into table,

first: use this SQL to create the table sql

with cols as (
select COLUMN_NAME || '   ' || TYPE  col
  from table(
    infer_schema(
      location=>'@LANDING/myFile.parquet'
      , file_format=>'LANDING.default_parquet'
      )
  ) 
),
temp as (
    select 'create or replace table myTable (' col1
    union
    select listagg(col, ',') col1
    from cols
    union
    select ') ' col1
)
select listagg(col1) 
from temp

Second, use this SQL to create the copy into SQL load the data into table

with cols as (
select expression 
  from table(
    infer_schema(
      location=>'@LANDING/myFile.parquet'
      , file_format=>'LANDING.default_parquet'
      )
  )
),
temp as (
    select 'copy into myTable from ( select ' col1
    union
    select listagg(expression, ',') col1
    from cols
    union
    select 'from @LANDING/myFile.parquet ) ' col1
)
select listagg(col1) 
from temp 

Upvotes: 0

Vzzarr
Vzzarr

Reputation: 5650

In my case the error message was raised because I was running the command as

COPY INTO my_db.my_schema.my_table
FROM (
    SELECT *
    FROM @my_stage
) FILE_FORMAT = ( TYPE = PARQUET );

Instead each column should be specified as $1:my_column in the 'select statement', for example:

COPY INTO my_db.my_schema.my_table
FROM (
    SELECT $1:my_column1, $1:my_column2, $1:my_column3
    FROM @my_stage
) FILE_FORMAT = ( TYPE = PARQUET );

Upvotes: 0

user13274071
user13274071

Reputation: 43

This issue was resolved after creating table as below

create or replace table temp_log (

     logcontent VARIANT);

Upvotes: 0

NickW
NickW

Reputation: 9768

This documentation explains how to load parquet data into multiple columns: Loading Parquet

UPDATE

I'm not sure if the comment below is a response to my answer and, if it is, what the relevance of it is? Did you read the document and, if you did, what part of it do you still have questions about?

You need to have your data in a stage (presumably an external stage in your case), or possibly in an external table, and then load from that into your table using "COPY INTO table FROM (SELECT..." with the $1:.. notation allowing you to select the appropriate elements from you parquet structure.

From the documentation:

/* Load the Parquet data into the relational table.                                                           */
/*                                                                                                            */
/* A SELECT query in the COPY statement identifies a numbered set of columns in the data files you are        */
/* loading from. Note that all Parquet data is stored in a single column ($1).                                */
/*                                                                                                            */
/* Cast element values to the target column data type.                                                        */

copy into cities
  from (select
  $1:continent::varchar,
  $1:country:name::varchar,
  $1:country:city.bag::variant
  from @sf_tut_stage/cities.parquet);

Upvotes: 2

Related Questions