Reputation: 49
I've been doing some research on how to replace a subset of string of characters of a single row base on the values of the columns of other rows, but was not able to do so since the update are only for the first row values of the other table. So I'm planning to insert this in a loop in a plpsql function.
Here are the snippet of my tables. Main table:
Table "public.tbl_main"
Column | Type | Modifiers
-----------------------+--------+-----------
maptarget | text |
expression | text |
maptarget | expression
-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
43194-0 | 363787002:70434600=(386053000:704347000=(414237002:704320005=259470008,704318007=118539007,704319004=50863008),704327008=122592007,246501002=703690001,370132008=30766002)
Look-up table:
Table "public.tbl_values"
Column | Type | Modifiers
-----------------------+--------+-----------
conceptid | bigint |
term | text |
conceptid | term
-----------+------------------------------------------
386053000 | Patient evaluation procedure (procedure)
363787002 | Observable entity (observable entity)
704347000 | Observes (attribute)
704320005 | Towards (attribute)
704318007 | Property type (attribute)
I want to create a function that will replace all numeric values in the tbl_main.expression
columns with their corresponding tbl_values.term
using the tbl_values.conceptid
as the link to each numeric values in the expression string.
I'm stuck currently in the looping part since I'm a newbie in LOOP
of plpgsql. Here is the rough draft of my function.
--create first a test table
drop table if exists tbl_test;
create table tbl_test as select * from tbl_main limit 1;
--
create or replace function test ()
RETURNS SETOF tbl_main
LANGUAGE plpgsql
AS $function$
declare
resultItem tbl_main;
v_mapTarget text;
v_expression text;
ctr int;
begin
v_mapTarget:='';
v_expression:='';
ctr:=1;
for resultItem in (select * from tbl_test) loop
v_mapTarget:=resultItem.mapTarget;
select into v_expression expression from ee;
raise notice 'parameter used: %',v_mapTarget;
raise notice 'current expression: %',v_expression;
update ee set expression=replace(v_expression, new_exp::text, term) from (select new_exp::text, term from tbl_values offset ctr limit 1) b ;
ctr:=ctr+1;
raise notice 'counter: %', ctr;
v_expression:= (select expression from ee);
resultItem.expression:= v_expression;
raise notice 'current expression: %',v_expression;
return next resultItem;
end loop;
return;
end;
$function$;
Any further information will be much appreciated.
My Postgres version:
PostgreSQL 9.3.6 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2, 64-bit
Upvotes: 1
Views: 666
Reputation: 4198
There's also another way, without creating functions... using "WITH RECURSIVE". Used it with lookup talbe of thousands of rows.
You'll need to change following table names and columns to your names:
tbl_main, strsourcetext, strreplacedtext;
lookuptable, strreplacefrom, strreplaceto.
WITH RECURSIVE replaced AS (
(SELECT
strsourcetext,
strreplacedtext,
array_agg(strreplacefrom ORDER BY length(strreplacefrom) DESC, strreplacefrom, strreplaceto) AS arrreplacefrom,
array_agg(strreplaceto ORDER BY length(strreplacefrom) DESC, strreplacefrom, strreplaceto) AS arrreplaceto,
count(1) AS intcount,
1 AS intindex
FROM tbl_main, lookuptable WHERE tbl_main.strsourcetext LIKE '%' || strreplacefrom || '%'
GROUP BY strsourcetext)
UNION ALL
SELECT
strsourcetext,
replace(strreplacedtext, arrreplacefrom[intindex], arrreplaceto[intindex]) AS strreplacedtext,
arrreplacefrom,
arrreplaceto,
intcount,
intindex+1 AS intindex
FROM replaced WHERE intindex<=intcount
)
SELECT strsourcetext,
(array_agg(strreplacedtext ORDER BY intindex DESC))[1] AS strreplacedtext
FROM replaced
GROUP BY strsourcetext
Upvotes: 0
Reputation: 657982
Looping is always a measure of last resort. Even in this case it is substantially cheaper to concatenate a query string using a query, and execute it once:
CREATE OR REPLACE FUNCTION f_make_expression(_expr text, OUT result text) AS
$func$
BEGIN
EXECUTE (
SELECT 'SELECT ' || string_agg('replace(', '') || '$1,'
|| string_agg(format('%L,%L)', conceptid::text, v.term), ','
ORDER BY conceptid DESC)
FROM (
SELECT conceptid::bigint
FROM regexp_split_to_table($1, '\D+') conceptid
WHERE conceptid <> ''
) m
JOIN tbl_values v USING (conceptid)
)
USING _expr
INTO result;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT *, f_make_expression(expression) FROM tbl_main;
However, if not all conceptid
have the same number of digits, the operation could be ambiguous. Replace conceptid
with more digits first to avoid that - ORDER BY conceptid DESC
does that - and make sure that replacement strings do not introduce ambiguity (numbers that might be replaced in the the next step). Related answer with more on these pitfalls:
The token $1
is used two different ways here, don't be misled:
regexp_split_to_table($1, '\D+')
This one references the first function parameter _expr
. You could as well use the parameter name.
|| '$1,'
This concatenates into the SQL string a references to the first expression passed via USING
clause to EXECUTE
. Parameters of the outer function are not visible inside EXECUTE
, you have to pass them explicitly.
It's pure coincidence that $1
(_expr
) of the outer function is passed as $1
to EXECUTE
. Might as well hand over $7
as third expression in the USING
clause ($3
) ...
I added a debug function to the fiddle. With a minor modification you can output the generated SQL string to inspect it:
Here is a pure SQL alternative. Probably also faster:
CREATE OR REPLACE FUNCTION f_make_expression_sql(_expr text)
RETURNS text AS
$func$
SELECT string_agg(CASE WHEN $1 ~ '^\d'
THEN txt || COALESCE(v.term, t.conceptid)
ELSE COALESCE(v.term, t.conceptid) || txt END
, '' ORDER BY rn) AS result
FROM (
SELECT *, row_number() OVER () AS rn
FROM (
SELECT regexp_split_to_table($1, '\D+') conceptid
, regexp_split_to_table($1, '\d+') txt
) sub
) t
LEFT JOIN tbl_values v ON v.conceptid = NULLIF(t.conceptid, '')::int
$func$ LANGUAGE sql STABLE;
In Postgres 9.4 this can be much more elegant with two new features:
ROWS FROM
to replacing the old (weird) technique to sync set-returning functionsWITH ORDINALITY
to get row numbers on the fly reliably:
CREATE OR REPLACE FUNCTION f_make_expression_sql(_expr text)
RETURNS text AS
$func$
SELECT string_agg(CASE WHEN $1 ~ '^\d'
THEN txt || COALESCE(v.term, t.conceptid)
ELSE COALESCE(v.term, t.conceptid) || txt END
, '' ORDER BY rn) AS result
FROM ROWS FROM (
regexp_split_to_table($1, '\D+')
, regexp_split_to_table($1, '\d+')
) WITH ORDINALITY AS t(conceptid, txt, rn)
LEFT JOIN tbl_values v ON v.conceptid = NULLIF(t.conceptid, '')::int
$func$ LANGUAGE sql STABLE;
SQL Fiddle demonstrating all for Postgres 9.3.
Upvotes: 0