Reputation: 1967
I'm trying to process some data from s3 logs in Athena that has a complex type I cannot figure out how to work with.
I have a table with rows such as:
data
____
"[{\"k1\":\"value1\", \"key2\":\"value2\"...}]"
I'd like to treat it as (1) an array to extract the first element, and then that first element as the JSON that it is.
Everything is confused because the data naturally is a string, that contains an array, that contains json and I don't even know where to start
Upvotes: 0
Views: 461
Reputation: 12901
You can use the following combination of JSON commands:
SELECT
JSON_EXTRACT_SCALAR(
JSON_EXTRACT_SCALAR('"[{\"k1\":\"value1\", \"key2\":\"value2\"...}]"','$'),
'$[0].k1'
)
The inner JSON_EXTRACT_SCALAR will return the JSON ARRAY [{"k1":"value1", "key2":"value2"...}]
and the outer will return the relevant value value1
Another similar option is to use CAST(JSON :
SELECT
JSON_EXTRACT_SCALAR(
CAST(JSON '"[{\"k1\":\"value1\", \"key2\":\"value2\"...}]"' as VARCHAR),
'$[0].k1'
)
Upvotes: 1