Reputation: 356
I have a Kinesis stream that dumps raw data into Redshift as-is, in JSON format. Below is the initial data structure and a sample.
ApproximateArrivalTimestamp | data
--------------------------------------------------------------------------------------
2022-08-04 10:18:21.075 | {"A" : "value of A" , "B" : "Value of B" , "C" : { "D" : "Value of D"}}
My requirement is to periodically process the incoming data and load this JSON data into another table, split out into columns, using a scheduler.
Target Table
ApproximateArrivalTimestamp | A | B | D
--------------------------------------------------------------------------------
2022-08-04 10:18:21.075 | value of A | value of B | value of D
There are 2 questions here:

1. How do I keep track of which rows have already been processed? I am thinking of using ApproximateArrivalTimestamp, but is there a better way?
2. How do I split the JSON data into columns? Do I need to use the json_extract_path_text function to extract the data from each row in a loop or something?

Any help or a sample query based on the tables above would be appreciated.
Thanks for your help in advance!
Upvotes: 2
Views: 371
Reputation: 11092
As for #1 - a timestamp is a common way to do this. The exact answer depends on your data solution. Does reprocessing a row cause an error or just an inefficiency? Etc.
For #2 - I assume that the JSON keys match the columns already established in the target table. Looping is not a good way to think about SQL processing; rather, joining your JSON with the list of target-table columns will extract the data you want.
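A minimal sketch of both ideas, assuming the source table is named raw_stream_table and the target is target_table (both names are placeholders, not from the question): the WHERE clause uses the last loaded ApproximateArrivalTimestamp as a watermark so only new rows are processed, and json_extract_path_text pulls each field out set-wise, with no loop. Note that nested keys like D are reached by passing the full path ('C', 'D').

```sql
-- Sketch only: raw_stream_table / target_table are assumed names.
-- Process only rows newer than what has already been loaded (question #1),
-- and extract the JSON fields in a single set-based statement (question #2).
INSERT INTO target_table (approximatearrivaltimestamp, a, b, d)
SELECT
    approximatearrivaltimestamp,
    json_extract_path_text(data, 'A')      AS a,
    json_extract_path_text(data, 'B')      AS b,
    json_extract_path_text(data, 'C', 'D') AS d   -- nested key: pass the path
FROM raw_stream_table
WHERE approximatearrivaltimestamp >
      (SELECT COALESCE(MAX(approximatearrivaltimestamp),
                       '1900-01-01'::timestamp)
       FROM target_table);
```

This can be run on a schedule (e.g. Redshift's scheduled queries); if reprocessing a row would cause duplicates rather than an error, the watermark comparison above is the piece that prevents it.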
Upvotes: 0