Reputation: 8015
I am currently figuring out how to do a bit more complex data migration in my database and whether it is even possible to do in SQL (not very experienced SQL developer myself).
Let's say that I store JSONs in one of my text columns in a Postgres table wtih roughly the following format:
{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}]}
Now, I would like to migrate the value
part to a bit more complex format:
{"type":"something","params":[{"value":{"id":"00de1be5-f75b-4072-ba30-c67e4fdf2333","path":"/hardcoded/string"}}]}
Furthermore, I also need to reason whether the value
contains a UUID pattern, and if not, use slightly different structure:
{"type":"something-else","params":[{"value":"not-id"}]} ---> {"type":"something-else","params":[{"value":{"value":"not-id","path":""}}]}
I know I can define a procedure and use REGEX_REPLACE
: REGEXP_REPLACE(source, pattern, replacement_string,[, flags])
but I have no idea how to approach the reasoning about whether the content contains ID or not. Could someone suggest at least some direction or hint how to do this?
Upvotes: 1
Views: 119
Reputation: 71471
To replace value
s at any depth, you can use a recursive CTE
to run replacements for each value of a value
key, using a conditional to check if the value is a UUID, and producing the proper JSON object accordingly:
with recursive cte(v, i, js) as (
select (select array_to_json(array_agg(distinct t.i))
from (select (regexp_matches(js, '"value":("[\w\-]+")', 'g'))[1] i) t), 0, js from (select '{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}, {"value":"sdfsa"}]}' js) t1
union all
select c.v, c.i+1, regexp_replace(
regexp_replace(c.js, regexp_replace((c.v -> c.i)::text, '[\\"]+', '', 'g'),
case when not ((c.v -> c.i)::text ~ '\w+\-\w+\-\w+\-\w+\-\w+') then
json_build_object('value', regexp_replace((c.v -> c.i)::text, '[\\"]+', '', 'g'), 'path', '')::text
else json_build_object('id', regexp_replace((c.v -> c.i)::text, '[\\"]+', '', 'g'), 'path', '/hardcoded/path')::text end, 'g'),
'(")(?=\{)|(?<=\})(")', '', 'g')
from cte c where c.i < json_array_length(c.v)
)
select js from cte order by i desc limit 1
Output:
{"type":"something","params":[{"value":{"id" : "00de1be5-f75b-4072-ba30-c67e4fdf2333", "path" : "/hardcoded/path"}}, {"value":{"value" : "sdfsa", "path" : ""}}]}
On a more complex JSON input string:
{"type":"something","params":[{"value":"00de1be5-f75b-4072-ba30-c67e4fdf2333"}, {"value":"sdfsa"}, {"more":[{"additional":[{"value":"00f41be5-g75b-4072-ba30-c67e4fdf3777"}]}]}]}
Output:
{"type":"something","params":[{"value":{"id" : "00de1be5-f75b-4072-ba30-c67e4fdf2333", "path" : "/hardcoded/path"}}, {"value":{"value" : "sdfsa", "path" : ""}}, {"more":[{"additional":[{"value":{"id" : "00f41be5-g75b-4072-ba30-c67e4fdf3777", "path" : "/hardcoded/path"}}]}]}]}
Upvotes: 0
Reputation: 3183
You can use jsonb
function for extract data and change them. At the end you should extend data.
Sample data structure and query result: dbfiddle
select
(t.data::jsonb || jsonb_build_object(
'params',
jsonb_agg(
jsonb_build_object(
'value',
case
when e.value->>'value' ~* '^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$' then
jsonb_build_object('id', e.value->>'value', 'path', '/hardcoded/string')
else
jsonb_build_object('value', 'not-id', 'path', '')
end
)
)
))::text
from
test t
cross join jsonb_array_elements(t.data::jsonb->'params') e
group by t.data
PS:
If your table had id
or unique field you can change group by t.data
to do things like that:
select
(t.data::jsonb || jsonb_build_object(
'params',
jsonb_agg(
jsonb_build_object(
'value',
case
when e.value->>'value' ~* '^[0-9A-F]{8}-[0-9A-F]{4}-4[0-9A-F]{3}-[89AB][0-9A-F]{3}-[0-9A-F]{12}$' then
jsonb_build_object('id', e.value->>'value', 'path', '/hardcoded/string')
else
jsonb_build_object('value', 'not-id', 'path', '')
end
)
)
))::text
from
test t
cross join jsonb_array_elements(t.data::jsonb->'params') e
group by t.id
Upvotes: 1