Reputation: 187
I have a table field with a value like this:
{
"0fc8a2a1-e334-43b8-9311-ce46da9cd32c": {
"alert": "345",
"channel": "ios_push",
"name": "Variant 1"
},
"4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": {
"channel": "ios_push",
"name": "Control Group",
"type": "control"
}
}
I want to know if there is some way to get the keys "0fc8a2a1-e334-43b8-9311-ce46da9cd32c" and "4344d89b-7f0d-4453-b2c5-d0d4a39d7d25".
Upvotes: 4
Views: 5362
Reputation: 3149
Although a year has passed since this question was asked, I found a way to do this using Redshift's native capabilities.
Redshift has a SUPER column type that stores native JSON: https://docs.aws.amazon.com/redshift/latest/dg/r_SUPER_type.html
It can be queried using PartiQL:
https://docs.aws.amazon.com/redshift/latest/dg/query-super.html
https://partiql.org/tutorial.html
Assuming a table called MyTable with a column named data of type SUPER that can store JSON, I have created the following query:
SELECT key
FROM MyTable AS t, UNPIVOT t.data AS value AT key;
`x AS y AT z` means "for each y in x", where z is the index of element y inside an array x. In our case x is not an array but an object, so z is the key instead of the index. (When you think about it, an array can also be represented in an object manner, where the index is the key: `{0: 'a', 1: 'b', ...}`.)
Here is a full, self-contained example:
create temp table if not exists MyTable(
data SUPER
);
insert into MyTable values (json_parse('{"a": 1, "b": 2}'));
select key from MyTable as t, unpivot t.data as value at key;
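This returns the keys a and b. As a small sketch (assuming the same MyTable), the same UNPIVOT can also project each value alongside its key:
select key, value
from MyTable as t, unpivot t.data as value at key;
-- key | value
-- a   | 1
-- b   | 2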
Upvotes: 9
Reputation: 5084
Redshift is not good with JSON, and especially not good with arbitrary JSON keys (as @GMB mentioned). It is also not good with nested data structures.
So actually, you have 2 problems:
1. Getting the keys out of arbitrary JSON: you can implement JSON parsing in Python and register it as a user-defined function: https://docs.aws.amazon.com/redshift/latest/dg/udf-python-language-support.html
2. Turning the resulting comma-separated string into rows: hence the CROSS JOIN with the seq table in the queries below, as described in this SO answer.
function:
create or replace function f_py_json_keys (a varchar(65535))
returns varchar(65535)
stable
as $$
import json
# return the object's top-level keys as a comma-separated string
return ",".join(json.loads(a).keys())
$$ language plpythonu;
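A quick sanity check of the function (note that plpythonu is Python 2, so the key order in the output is an implementation detail, not a guarantee):
select f_py_json_keys('{"a": 1, "b": 2}');
-- returns something like: a,b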
query:
with input(json) as (
select '{
"0fc8a2a1-e334-43b8-9311-ce46da9cd32c": {
"alert": "345",
"channel": "ios_push",
"name": "Variant 1"
},
"4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": {
"channel": "ios_push",
"name": "Control Group",
"type": "control"
}
}'::varchar
), seq(idx) as (
select 1 UNION ALL
select 2 UNION ALL
select 3 UNION ALL
select 4 UNION ALL
select 5
), input_with_occurrences as (
select f_py_json_keys(json) as keys,
regexp_count(keys, ',') + 1 as number_of_occurrences
from input
)
select
split_part(keys, ',', idx) as id
from input_with_occurrences cross join seq
where idx <= number_of_occurrences
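Note that the seq CTE hard-codes 5 rows, so at most 5 keys per JSON value are extracted. If you expect more keys, you could replace the seq CTE with something like the following sketch (any_big_table is a placeholder for any table with enough rows):
seq(idx) as (
    select row_number() over ()  -- generates 1, 2, 3, ...
    from any_big_table           -- hypothetical: any table with at least 100 rows
    limit 100
)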
Alternatively, Redshift has some regex functions. Here is a working example that does the job for the payload you specified:
with input(json) as (
select '{
"0fc8a2a1-e334-43b8-9311-ce46da9cd32c": {
"alert": "345",
"channel": "ios_push",
"name": "Variant 1"
},
"4344d89b-7f0d-4453-b2c5-d0d4a39d7d25": {
"channel": "ios_push",
"name": "Control Group",
"type": "control"
}
}'::varchar
), seq(idx) as (
select 1 UNION ALL
select 2 UNION ALL
select 3 UNION ALL
select 4 UNION ALL
select 5
), input_with_occurrences as (
select *,
regexp_count(json,
'\\{?\\"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\\":\\s\\{[\\w\\s\\":,]+\\}') as number_of_occurrences
from input
)
select
REGEXP_SUBSTR(json, '\\{?\\"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})\\":\\s\\{[\\w\\s\\":,]+\\}', 1, idx, 'e') as id
from input_with_occurrences cross join seq
where idx <= number_of_occurrences
The result looks like:
+------------------------------------+
|id |
+------------------------------------+
|0fc8a2a1-e334-43b8-9311-ce46da9cd32c|
|4344d89b-7f0d-4453-b2c5-d0d4a39d7d25|
+------------------------------------+
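For reference, the 'e' parameter in REGEXP_SUBSTR makes it return only the first capturing subexpression instead of the whole match, which is how the query above strips the surrounding quotes and braces from each key. A minimal illustration:
select regexp_substr('name: "variant"', '"([a-z]+)"', 1, 1, 'e');
-- returns: variant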
Upvotes: 4