jay
jay

Reputation: 1463

Extracting values from an array using snowflake

I have data stored in following format in one of my columns;

[
{
  "arrival_date": "2022-02-15T08:00:00.000Z",
  "cargo_available_timestamp": "2022-02-16T13:00:00.000Z",
  "cargo_type": "unable_to_provide",
  "carton_count": null,
  "lfd": "2022-02-17T08:00:00.000Z"
},
{
  "arrival_date": "2022-02-16T08:00:00.000Z",
  "flight_status": "in_transit",
  "flight_status_other": null
}
 ]

I am trying to extract he value for lfd as shown in the data, using following code;

select col1,
       json_extract_path_text(get(col1,0),'lfd') as value
from table

But it seems the get() command is not getting an array. And I am getting following error:

SQL compilation error: error line 4 at position 20 Invalid argument types for 
 function 'GET': (VARCHAR(16777216), NUMBER(1,0))

When I checked the data type for col1 it is varchar. May I know how can I parse this varchar to extract value for lfd. thanx

Upvotes: 2

Views: 10174

Answers (1)

Simeon Pilgrim
Simeon Pilgrim

Reputation: 25903

So with a CTE to provide fake data, and parse the JSON for us:

WITH fake_data AS (
    SELECT parse_json(column1) as json
    FROM VALUES 
    ('[
    {
      "arrival_date": "2022-02-15T08:00:00.000Z",
      "cargo_available_timestamp": "2022-02-16T13:00:00.000Z",
      "cargo_type": "unable_to_provide",
      "carton_count": null,
      "lfd": "2022-02-17T08:00:00.000Z"
    },
    {
      "arrival_date": "2022-02-16T08:00:00.000Z",
      "flight_status": "in_transit",
      "flight_status_other": null
    }
     ]')
 )
 select 
    json[0] as array_0
    ,json[1] as array_1
    ,array_0:lfd as lfd_0
    ,array_1:lfd as lfd_1
 from fake_data;

we get:

ARRAY_0 ARRAY_1 LFD_0 LFD_1
{ "arrival_date": "2022-02-15T08:00:00.000Z", "cargo_available_timestamp": "2022-02-16T13:00:00.000Z", "cargo_type": "unable_to_provide", "carton_count": null, "lfd": "2022-02-17T08:00:00.000Z" } { "arrival_date": "2022-02-16T08:00:00.000Z", "flight_status": "in_transit", "flight_status_other": null } "2022-02-17T08:00:00.000Z"

Thus if you know you JSON array will always be in order you can use:

 select 
    json[0]:lfd as lfd
    ,to_timestamp_ntz(lfd) as lfd_as_timestamp
 from fake_data;
LFD LFD_AS_TIMESTAMP
"2022-02-17T08:00:00.000Z" 2022-02-17 08:00:00.000

Now if you are not always sure the order of you array, or you need to pick an array element, you will want to FLATTEN the array.

WITH fake_data AS (
    SELECT parse_json(column1) as json
    FROM VALUES 
    ('[
    {
      "arrival_date": "2022-02-15T08:00:00.000Z",
      "cargo_available_timestamp": "2022-02-16T13:00:00.000Z",
      "cargo_type": "unable_to_provide",
      "carton_count": null,
      "lfd": "2022-02-17T08:00:00.000Z"
    },
    {
      "arrival_date": "2022-02-16T08:00:00.000Z",
      "flight_status": "in_transit",
      "flight_status_other": null
    }
     ]')
     ,('[
       {
      "arrival_date": "2022-02-16T08:00:00.000Z",
      "flight_status": "in_transit",
      "flight_status_other": null
    },
    {
      "arrival_date": "2022-02-15T08:00:00.000Z",
      "cargo_available_timestamp": "2022-02-16T13:00:00.000Z",
      "cargo_type": "unable_to_provide",
      "carton_count": null,
      "lfd": "2022-02-18T08:00:00.000Z"
    }
     ]')
 )
 select 
    to_timestamp_ntz(f.value:lfd) as lfd_ntz
 from fake_data d, table(flatten(input=>d.json)) f
 where lfd_ntz is not null;
LFD_NTZ
2022-02-17 08:00:00.000
2022-02-18 08:00:00.000

Upvotes: 1

Related Questions