nodev_101
nodev_101

Reputation: 109

Extracting JSON value in redshift

I am trying to get JSON extract from a column value in redshift.

The column value is like:

[{'IDIndex': '0001', 'History': 4, 'Name': '08-SA-21-C1', 'ActiveFlag': 1, 'Category': 3, 'TotalCount': 0, 'Count1': 0, 'Count2': 0, 'Count3': 0, 'Count4': 0, 'Count5': 0}, {'IDIndex': '0002', 'History': 2, 'Name': '09-GA-5-E1', 'ActiveFlag': 1, 'Category': 1, 'TotalCount': 0, 'Count1': 0, 'Count2': 0, 'Count3': 0, 'Count4': 0, 'Count5': 0}, {'IDIndex': '0003', 'History': 2, 'Name': '12-GA-9-C2', 'ActiveFlag': 1, 'Category': 3, 'TotalCount': 0, 'Count1': 0, 'Count2': 0, 'Count3': 0, 'Count4': 0, 'Count5': 0}]

I need output value like:

IDIndex  History
            
0001       4
0002       2
0003       2

I have tried with

JSON_EXTRACT_PATH_TEXT(
  JSON_EXTRACT_ARRAY_ELEMENT_TEXT( <column>, 0)  , 'IDIndex'). 

But still getting below error:

ERROR: JSON parsing error Detail: ----------------------------------------------- error: JSON parsing error code: 8001 context: invalid json array object [{'IDIndex': '0001', 'History': 4, 'Name': '08-SA-21-C1', 'ActiveFlag': 1, 'Category': 3, 'TotalCount': 0, 'Count1': 0, 'Count2': 0, 'Count3': 0, 'Count4': 0, 'Count5': 0}, {'IDIndex': '0002', 'History': 2, 'Name': '09-GA-5-E1', ' query: 897562381 location: funcs_json.hpp:202 process: query5_191_897562381 [pid=30632] ----------------------------------------------- [ErrorId: 1-67b57803-519935642b414283203e6bde]

Can anyone please help on to resolve this issue.

Upvotes: 0

Views: 50

Answers (1)

Bill Weiner
Bill Weiner

Reputation: 11092

JSON requires double quotes. You can make this swap in the text using REPLACE().

=====================UPDATE=====================

It sounds like what you want to do is unnest the array. To do this you will want to cast your JSON to a SUPER data type (after it is in correct JSON format). Then use Redshift’s unnesting capabilities to get what you want.

I cobbled the example below together for you. First off read https://docs.aws.amazon.com/redshift/latest/dg/query-super.html#unnest to understand what is going on. Next you need to set your Redshift session into case sensitive mode so you can reference the JSON values that have mixed case names - first line of the code below.

Your example data is provided in the first CTE. The replace function fixes up your JSON quotes (at least for this simple example) in the second CTE and also casts the JSON to a super.

The top select examines the data values in a few ways with the last 2 columns being the unnested index and history value.

SET enable_case_sensitive_identifier TO true;

with data as (
    select '[{\'IDIndex\': \'0001\', \'History\': 4, \'Name\': \'08-SA-21-C1\', \'ActiveFlag\': 1, \'Category\': 3, \'TotalCount\': 0, \'Count1\': 0, \'Count2\': 0, \'Count3\': 0, \'Count4\': 0, \'Count5\': 0}, {\'IDIndex\': \'0002\', \'History\': 2, \'Name\': \'09-GA-5-E1\', \'ActiveFlag\': 1, \'Category\': 1, \'TotalCount\': 0, \'Count1\': 0, \'Count2\': 0, \'Count3\': 0, \'Count4\': 0, \'Count5\': 0}, {\'IDIndex\': \'0003\', \'History\': 2, \'Name\': \'12-GA-9-C2\', \'ActiveFlag\': 1, \'Category\': 3, \'TotalCount\': 0, \'Count1\': 0, \'Count2\': 0, \'Count3\': 0, \'Count4\': 0, \'Count5\': 0}]'::text as c
),
d2 as ( 
select 
    json_parse(replace(c,chr(39),'"')) as d
from data
)
select d,
    d[0]."IDIndex",
    a."IDIndex", a."History"
from d2 as s, s.d as a;

Upvotes: 0

Related Questions