Reputation: 2457
Have a Athena table like this:
userid visited
001a ['New York','Los Angeles']
002b ['Atlanta']
0030 ['New York','Paris','Albany']
....
Not quite sure how to query such list type column. How could I find
0001a
00030
'New York'
'Los Angeles'
'Paris'
'Atlanta'
'Albany'
Upvotes: 1
Views: 2088
Reputation: 143088
Assuming that visited
is array type from presto:
WITH dataset AS (
SELECT *
FROM (
VALUES ('001a', ARRAY['New York','Los Angeles']),
('002b', ARRAY['Atlanta', 'Los Angeles'])
) AS t (id, visited))
For the first one use contains
function:
select id
from dataset
where contains(visited, 'Atlanta'); -- returns 002b
for the second one you can use set_union
function, but it is not available for me in Athena (or I'm using it wrong =), so you can workaround with:
select array_distinct(flatten(array_agg(visited)))
from dataset; -- returns `{New York,Los Angeles,Atlanta}`
If visited
is a string you can turn it into array via casting it to JSON and then to array.
Upvotes: 1