Reputation: 2070
I have a nested JSON structure where I want to unnest one JSON subtree. Either A or B is filled and event.type is marked as 'A' or 'B'. Here's two simplified examples:
{
"event": {
"event_type": "A",
"time": 1599692445083,
"A" : {
"name": "item1",
"revenue": 100
}
}
}
}
{
"event": {
"event_type": "B",
"time": 1599692445083,
"B" : {
"items" : [
{"name": "item2", revenue" : 10},
{"name": "item3", revenue" : 20},
]
}
}
}
}
Query works so far, but I have the issue that UNNEST just works with cross join and thus I'm loosing all "A" events as I would need left join. Please note I do multiple aggregates, so calculating the sum alone on an easier way would be sufficient.
select
from_unixtime( (floor(event.time/1000) / (60 * 60)) *60*60) as event_hour,
count(*) filter(where event.event_type = 'A') as A_items,
count(*) filter(where event.event_type = 'B') as B_items
FROM mydb.event_table
left join unnest(event.B.items) as t(b)
WHERE
year=2020 and month=9 and day=18 and hour=1
GROUP BY
from_unixtime( (floor(event.time/1000) / (60 * 60)) *60*60),
For cross join unnest query executes, but as A does not have an item Array I do not get any entries counted. For left join unnest I got an error message (funny enough it states 'left' there as well):
mismatched input 'where' expecting {'join', 'cross', 'inner', 'left', 'right', 'full', 'natural', 'using', 'on', 'tablesample'}
Is there a way to use left join in unnest in Athena?
Upvotes: 1
Views: 5488
Reputation: 20770
Athena is based on Presto .172
LEFT JOIN
with UNNEST
was added in Presto 319
Add support for INNER and OUTER joins involving UNNEST.
I would assume Athena does not support this until they
In the meantime, you can run latest greatest version of Presto:
Also, as @GMB pointed out, the JOIN
requires the ON
clause (can be as simple as ON true
).
Upvotes: 4
Reputation: 2070
There is a work around for the mentioned use case to not loose entries when UNNESTING using cross join. It looks ugly but does the job until Athena will support newer Presto functionality.
Instead of
left join unnest(event.B.items) as t(b)
one can rewrite the left join to a cross join with coalesce:
cross join unnest(coalesce(event.B.items, array[null])) as t(b)
Upvotes: 2