Reputation: 1618
I have an AVRO file with this schema:
{
  "type": "record",
  "name": "row",
  "fields": [
    {"name": "Column1", "type": "int"},
    {"name": "Column2", "type": "string"}
  ]
}
Now I'm reading from the staged file.
SELECT t.$1
FROM @my_stage/my_file.avro (FILE_FORMAT => 'MY_AVRO') t;
I'm getting this error:
AVRO file format can produce one and only one column of type variant, object, or array. Load data into separate columns using the MATCH_BY_COLUMN_NAME copy option or copy with transformation.
I tried to find out what Snowflake sees in this file using INFER_SCHEMA and the METADATA$FILE_ROW_NUMBER metadata column, but I always got the same error.
Is it possible to load data from such files? If not, how should the AVRO schema be changed? What does Snowflake expect from an AVRO file?
I found no clear instructions in the Snowflake documentation.
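For reference, the first option the error message names, the MATCH_BY_COLUMN_NAME copy option, would look roughly like this. This is a minimal sketch, not from the original post, and it assumes a target table MY_TABLE with columns named Column1 and Column2 already exists:
-- Hypothetical target table MY_TABLE(Column1 INT, Column2 STRING) assumed.
-- MATCH_BY_COLUMN_NAME maps the Avro field names onto matching table columns.
COPY INTO MY_TABLE
FROM @my_stage/my_file.avro
FILE_FORMAT = (FORMAT_NAME = 'MY_AVRO')
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;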
Resolution: I was able to load the file when I re-created a stage with a proper table type.
Upvotes: 1
Views: 729
Reputation: 175736
I cannot reproduce it using the provided code sample.
Source file: https://learn.microsoft.com/en-us/azure/databricks/_static/examples/episodes.avro
CREATE FILE FORMAT MY_AVRO
TYPE=AVRO,
TRIM_SPACE=TRUE,
REPLACE_INVALID_CHARACTERS=TRUE;
SELECT t.$1
FROM @"TEST"."PUBLIC"."TEST"/episodes.avro(FILE_FORMAT => 'MY_AVRO') AS t;
The INFER_SCHEMA capability works too:
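A minimal sketch of such a call, assuming the same stage and file format as above:
-- Inspect the column definitions Snowflake derives from the staged Avro file.
SELECT *
FROM TABLE(
  INFER_SCHEMA(
    LOCATION => '@"TEST"."PUBLIC"."TEST"/episodes.avro',
    FILE_FORMAT => 'MY_AVRO'
  )
);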
EDIT:
The error may be caused by a misunderstanding of how Avro data is structured:
SELECT t.$1, t.$2
FROM @"TEST"."PUBLIC"."TEST"/episodes.avro(FILE_FORMAT => 'MY_AVRO') AS t;
AVRO file format can produce one and only one column of type variant, object, or array. Load data into separate columns using the MATCH_BY_COLUMN_NAME copy option or copy with transformation.
There is only a single column, so there is no $2, $3, and so on. Instead, address the fields of the single VARIANT column by path:
SELECT
t.$1:"air_date"::TEXT
,t.$1:"doctor"::TEXT
,t.$1:"title"::TEXT
,METADATA$FILE_ROW_NUMBER
FROM @"TEST"."PUBLIC"."TEST"/episodes.avro(FILE_FORMAT => 'MY_AVRO') AS t;
Upvotes: 0