Mike
Mike

Reputation: 801

Parsing nested JSON fields in Snowflake

I am pretty new to Snowflake and I am now trying to parse a JSON field and pull its attributes to return in the response.

I tried a few variations but every time, the attribute is populating as null.

attributes column in my table has this JSON:

{
    "Status": [
        "ACTIVE"
    ],
    "Coverence": [
        {
            "Sub": [
                {
                    "EndDate": [
                        "2020-06-22"
                    ],
                    "Source": [
                        "Test"
                    ],
                    "Id": [
                        "CovId1"
                    ],
                    "Type": [
                        "CovType1"
                    ],
                    "StartDate": [
                        "2019-06-22"
                    ],
                    "Status": [
                        "ACTIVE"
                    ]
                }
            ]
        }
    ]
}

What I tried:

SELECT DISTINCT *
    from
        (
            TRIM(mt."attributes":Status, '[""]')::string as STATUS,
            TRIM(r.value:"Sub"."Id", '[""]')::string as ID,
            TRIM(r.value:"Sub"."Source", '[""]')::string as SOURCE
            from "myTable" mt,
            lateral flatten ( input => mt."attributes":"Coverence", outer => true) r
        )
    GROUP BY
        STATUS,
        ID,
        SOURCE;

Later I tried:

SELECT DISTINCT *
    from
        (
            TRIM(mt."attributes":Status, '[""]')::string as STATUS,
            TRIM(r.value:"Id", '[""]')::string as ID,
            TRIM(r.value:"Source", '[""]')::string as SOURCE
            from "myTable" mt,
            lateral flatten ( input => mt."attributes":"Coverence":"Sub", outer => true) r
        )
    GROUP BY
        STATUS,
        ID,
        SOURCE;

But nothing worked. The STATUS is populating as expected. But ID and SOURCE are populating null.

Am I missing something or have I done something dumb? Please shed some light.

Upvotes: 2

Views: 7161

Answers (4)

sol25
sol25

Reputation: 149

If you know the schema in advance it should be quite easy but cumbersome. You need to basically flatten all arrays to the lowest level as in (not complete):

select status.value as status, end_date.value as end_date from json, 
lateral flatten(json_field:Status) as status,
lateral flatten(json_field:Coverence) as coverence,
lateral flatten(coverence.value:Sub) as sub,
lateral flatten(sub.value:EndDate) as end_date

where json is simply select parse_json(<your json content>) as json_field.

If you would like to try a more dynamic schema inference you can try using recursive => true attribute for flatten function (ref: https://docs.snowflake.com/en/sql-reference/functions/flatten).

Upvotes: 0

Nick S
Nick S

Reputation: 11

A bit late but for anyone who comes across this in the future, wouldn't really know how to accomplish this SQL-wise, but seeing its JSON from an external API I have used this tool called GetOurData. I know they do have pre-built connectors for api's but If you contact their Dev team (what I had to do) they'll make you a custom connection for your API. I think they're still in beta so it should be free as well.

Upvotes: 0

H Roy
H Roy

Reputation: 635

All your elements are array type and the overall JSON is not making much sense... so to access all your individual elements, you have to use [] notation and then you can access the element values. You don't need to use flatten also, if you just have to access individual elements via index.

Upvotes: 0

Lukasz Szozda
Lukasz Szozda

Reputation: 175556

Assuming that Coverence could contain multiple Sub, therefore FLATTEN twice. At lowest level only first element is chosen (EndDate[0], Source[0] etc):

SELECT 
   mt."attributes":Status[0]::TEXT AS Status
  ,r2.value:EndDate[0]::TEXT AS EndDate
  ,r2.value:Source[0]::TEXT AS Source
  ,r2.value:Id[0]::TEXT AS Id
FROM myTable AS mt,
LATERAL FLATTEN(input => mt."attributes",
                path => 'Coverence',
                outer => true) r1,
LATERAL FLATTEN(input => r1.value,
                path => 'Sub',
                outer => true) r2; 

Output:

enter image description here

Upvotes: 5

Related Questions