cosbor11
cosbor11

Reputation: 16044

AWS Athena: Querying by an attributes of a struct with an array

I crawled data using aws glue to import json data from an s3 folder that contains data where the root braces is an array like this:

[{id: '1', name: 'rick'},{id: '2', name: 'morty'}]

This ends up resulting in a schema like this:

array<struct<expand:string,id:string,name:string>>

How do I query by name in Athena?

If I try this:

SELECT * FROM people_s3_buckets WHERE name = "rick";

I get the following error:

SYNTAX_ERROR: Column 'name' cannot be resolved

Perhaps, there is a way to setup the Glue crawler to add just the elements within the array and avoid the nesting alltogether?

Upvotes: 12

Views: 26576

Answers (1)

Ilya Kisil
Ilya Kisil

Reputation: 2668

In order to query fields of elements within an array, you would need to UNNEST it first. Assuming that structure array<struct<expand:string,id:string,name:string>> corresponds to column members, you would need to do

SELECT
    *
FROM
    people_s3_buckets,
    UNNEST(members) as t(member)
WHERE
    member.name = 'rick'

Note, you need to use single quotes instead fo double quotes.

Here is the official AWS docs on handling arrays in AWS Athena: Querying Arrays

Upvotes: 19

Related Questions