Reputation: 189
I'm using sql in amazon redshift. I have a column called attributes which looks like this:
[{"name": "Size", "value": "Small"},{"name": "Color", "value": "Blue"},{"name": "Material", "value": "Cotton"}]
or
[{"name": "Material", "value": "silk"},{"name": "Color", "value": "Pink"}]
In order to extract the material value, I'm trying to use regex. I have the initial code such:
select REGEXP_SUBSTR(attributes,'Material.*') as string_value
Result:
Material", "value": "Cotton"}]
and
Material", "value": "silk"},{"name": "Color", "value": "Pink"}]
How can I take this a step further and get just the value of material, so output should look like:
Cotton
silk
I'm new to sql and new to regex so would really appreciate your help.
Upvotes: 1
Views: 187
Reputation: 520898
You should probably not be resorting to using regex to parse JSON content, especially given that your JSON content is nested inside an array. I am admittedly not an expert in the Redshift JSON API, but this SO question was very helpful. Consider the following approach:
CREATE VIEW seq_0_to_3 AS
SELECT 0 AS i UNION ALL
SELECT 1 UNION ALL
SELECT 2
);
WITH exploded_array AS (
SELECT id, JSON_EXTRACT_ARRAY_ELEMENT_TEXT(json_col, seq.i) AS json
FROM yourTable, seq_0_to_3 AS seq
WHERE seq.i < JSON_ARRAY_LENGTH(json_col)
)
SELECT JSON_EXTRACT_PATH_TEXT(json, 'value') AS Material
FROM exploded_array
WHERE JSON_EXTRACT_PATH_TEXT(json, 'name') = 'Material';
This answer assumes that your JSON arrays would always have 3 elements for the size, color, and material. If the count would differ from 3, then update the view seq_0_to_3
above with the correct number of elements.
Upvotes: 1
Reputation: 1269473
You can use regular expressions and capture groups:
select REGEXP_SUBSTR(attributes, '"Material", "value": "([^"]*)"') as string_value
Upvotes: 1