SimonB
SimonB

Reputation: 1055

Amazon Athena parsing JSON

Struggling to parse some json. This is the format, where there are no fixed names/keys - everything is dynamic.

{ "{condition-operator}" : { "{condition-key}" : "{condition-value}" }}

An an example of values:

{
  "bool":{"aws:viaawsservice":"true"},
  "stringequals":{
        "ec2:createaction":[
            "CreateSecurityGroup",
            "CreateVolume",
            "CreateSnapshot",
            "RunInstances"
        ]
    }
}

I've managed to extract the 'operator' and 'key' values. (See below)

However, my result for 'values' is problematic.

One value is 'true', the other ["CreateSecurityGroup","CreateVolume","CreateSnapshot","RunInstances"]

Neither of which I seem able to use or cast as an UNNESTable array.

To be honest, getting woefully lost in what's going on !! I need to be able to unnest these, to get 1 row per value (so 5 values/rows in total)

Any guidance appreciated !

with cte as (
  select '{"bool":{"aws:viaawsservice":"true"},"stringequals":{"ec2:createaction":["CreateSecurityGroup","CreateVolume","CreateSnapshot","RunInstances"]}}'
    as sample
)
select 
  ,ct.ct as condition_operator
  ,map_keys(cast(ct.cb as map<varchar,json>))[1] as condition_key
  , map_values(cast(ct.cb as map<varchar,json>))[1] as condition_values
from 
  cte
  CROSS JOIN UNNEST(map_keys(cast(json_parse(cte.sample)as map<varchar,json>)),map_values(cast(json_parse(cte.sample)as map<varchar,json>))) ct(ct,cb)
  -- CROSS JOIN UNNEST( ## something here ##) as values(v)



condition_ope..   condition_key       condition_values 
(string(255))    (string(255))       (json) 
bool             aws:viaawsservice   "true" 
stringequals     ec2:createaction ["CreateSecurityGroup","CreateVolume","CreateSnapshot","RunInstances"]

Upvotes: 1

Views: 564

Answers (1)

Guru Stron
Guru Stron

Reputation: 143203

You can use try, which results in null in case of failure, and attempt to cast data to array of varchar and fallback to either cast to varchar (which will fail in case of json object in value) or just using json_format:

select ct.ct as condition_operator,
    ct_key,
    ct_value
from cte
    CROSS JOIN UNNEST(
        map_keys(cast(json_parse(cte.sample) as map < varchar, json >)),
        map_values(cast(json_parse(cte.sample) as map < varchar, json >))
    ) ct(ct, cb)
    CROSS JOIN UNNEST(
        map_keys(cast(ct.cb as map < varchar, json >)),
        map_values(cast(ct.cb as map < varchar, json >))
    ) ct1(ct_key, ct_value_json)
    CROSS JOIN UNNEST(
        coalesce(try(cast(ct_value_json as array < varchar >)),array [ json_format(ct_value_json) ]
        )
    ) ct2(ct_value)

Output:

condition_operator ct_key ct_value
bool aws:viaawsservice true
stringequals ec2:createaction CreateSecurityGroup
stringequals ec2:createaction CreateVolume
stringequals ec2:createaction CreateSnapshot
stringequals ec2:createaction RunInstances

Upvotes: 1

Related Questions