Alsin
Alsin

Reputation: 1618

Load AVRO into Snowflake

I have an AVRO file with this schema:

{
  "type":"record",
  "name":"row",
  "fields":
  [
    {"name":"Column1","type":"int"},
    {"name":"Column2","type":"string"}
  ]
}

Now I'm reading from the staged file.

SELECT t.$1
FROM @my_stage/my_file.avro
  ( FILE_FORMAT => 'MY_AVRO' )
  t
;

I'm getting this error:

AVRO file format can produce one and only one column of type variant, object, or array. Load data into separate columns using the MATCH_BY_COLUMN_NAME copy option or copy with transformation.

I tried to find out what Snowflake sees in this file, using functions like INFER_SCHEMA and METADATA$FILE_ROW_NUMBER, but I always got the same error.

Is it possible to load data from such files? How should the AVRO schema be changed if not possible? What Snowflake expects from an AVRO file?

I found no clear instructions in the Snowflake documentation.

Resolution: I was able to load the file when I re-created a stage with a proper table type.

Upvotes: 1

Views: 729

Answers (1)

Lukasz Szozda
Lukasz Szozda

Reputation: 175736

I cannot reproduce it using provided code sample.

Source file: https://learn.microsoft.com/en-us/azure/databricks/_static/examples/episodes.avro

CREATE FILE FORMAT MY_AVRO 
    TYPE=AVRO,
    TRIM_SPACE=TRUE,
    REPLACE_INVALID_CHARACTERS=TRUE;

SELECT t.$1
FROM @"TEST"."PUBLIC"."TEST"/episodes.avro(FILE_FORMAT => 'MY_AVRO') AS t;

Output: enter image description here


Infer schema capability works too:

enter image description here

enter image description here


EDIT:

The error may be caused by misunderstanding how Avro is structured:

SELECT t.$1, t.$2
FROM @"TEST"."PUBLIC"."TEST"/episodes.avro(FILE_FORMAT => 'MY_AVRO') AS t;

AVRO file format can produce one and only one column of type variant, object, or array. Load data into separate columns using the MATCH_BY_COLUMN_NAME copy option or copy with transformation.

There is only single column so there is no $2 etc.

SELECT 
   t.$1:"air_date"::TEXT
  ,t.$1:"doctor"::TEXT
  ,t.$1:"title"::TEXT
  ,METADATA$FILE_ROW_NUMBER
FROM @"TEST"."PUBLIC"."TEST"/episodes.avro(FILE_FORMAT => 'MY_AVRO') AS t;

enter image description here

Upvotes: 0

Related Questions