Little Bobby Tables

Reputation: 5351

Create AWS Athena table from Parquet file with an array of structs as a column

I am trying to create an AWS Athena table from a Parquet file stored in S3 using the following declaration, for example:

create table "db"."fufu" (
  foo array<
    struct<
      bar: int, 
      bam: int
    >
  >
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
WITH SERDEPROPERTIES ('serialization.format' = '1') 
LOCATION 's3://yada/yada/'
TBLPROPERTIES ('has_encrypted_data'='false');

I consistently get the following error:

line 3:11: mismatched input '<' expecting {'(', 'array', '>'} (service: amazonathena; status code: 400; error code: invalidrequestexception; request id: ...)

The syntax looks valid to me, and the file loads fine with Spark's Parquet reader, which sees the field as an array of structs.

Any idea what can cause this error?

Upvotes: 1

Views: 3064

Answers (1)

Davide Stefanutti

Reputation: 71

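You need to remove the double quotes from the database name and the table name, and add the external keyword before table. Athena DDL follows Hive syntax, where identifiers are left unquoted or wrapped in backticks rather than double quotes, and a table over existing S3 data has to be declared as external.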

create external table db.fufu (
  foo array<
    struct<
      bar: int, 
      bam: int
    >
  >
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
WITH SERDEPROPERTIES ('serialization.format' = '1') 
LOCATION 's3://eth-test-ds/test/'
TBLPROPERTIES ('has_encrypted_data'='false');
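
If you generate the statement from code, a small helper can apply both fixes for you. This is only a sketch under some assumptions: the function name athena_external_table_ddl and its parameters are made up for illustration, and it only builds the DDL string (it does not submit anything to Athena).

```python
def athena_external_table_ddl(database, table, columns, location):
    """Build a Hive-style CREATE EXTERNAL TABLE statement for Parquet data.

    Hypothetical helper for illustration only; it is not part of any AWS SDK.
    `columns` is a list of (name, hive_type) pairs.
    """
    # Drop surrounding double quotes from the identifiers, as the answer suggests.
    db = database.strip('"')
    tbl = table.strip('"')

    # One "name type" entry per column, comma-separated.
    column_defs = ",\n".join(f"  {name} {hive_type}" for name, hive_type in columns)

    # EXTERNAL is added before TABLE; the remaining clauses match the question.
    return (
        f"CREATE EXTERNAL TABLE {db}.{tbl} (\n"
        f"{column_defs}\n"
        ")\n"
        "ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'\n"
        "WITH SERDEPROPERTIES ('serialization.format' = '1')\n"
        f"LOCATION '{location}'\n"
        "TBLPROPERTIES ('has_encrypted_data'='false');"
    )


# Same table as in the question, passed in with the offending double quotes.
ddl = athena_external_table_ddl(
    '"db"',
    '"fufu"',
    [("foo", "array<struct<bar:int,bam:int>>")],
    "s3://yada/yada/",
)
print(ddl)
```

The printed statement can then be pasted into the Athena query editor or passed to whatever client you use to run queries.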

Upvotes: 3
