Reputation: 1083
I have the following schema:
DROP SCHEMA IF EXISTS s CASCADE;
CREATE SCHEMA s;
CREATE TABLE "s"."t1"
(
"c1" BigSerial PRIMARY KEY,
"c2" BigInt NOT NULL,
"c3" BigInt
)
WITH (OIDS=FALSE);
INSERT INTO s.t1 (c2, c3) VALUES (10, 100);
INSERT INTO s.t1 (c2, c3) VALUES (20, 200);
INSERT INTO s.t1 (c2, c3) VALUES (30, 300);
INSERT INTO s.t1 (c2, c3) VALUES (40, 400);
PREPARE updateplan (BigInt, BigInt) AS
update s.t1
SET c3 = $2
WHERE c2 = $1;
EXECUTE updateplan (20, 250);
PREPARE updatearrayplan(BigInt[], BigInt[]) AS
for i in size($1)
DO
update s.t1
SET c3 = $2[$i]
WHERE c2 = $1[$i]
END FOR
EXECUTE updatearrayplan({20, 30}, {275, 375})
/* 20, 200 -> 20, 275 */
/* 30, 300 -> 30, 375 */
After execution of updatearrayplan I am expecting the rows to have these values 20 -> 275 , 30 -> 375
Is there a way to update multiple rows with different column values passed in as array. Also is there a guarantee that the order of the arrays will be maintained.
Upvotes: 0
Views: 1159
Reputation: 36127
Try:
WITH arrays AS(
SELECT * from
unnest(
ARRAY[20, 30],
ARRAY[275, 375]
) as xy(x,y)
)
UPDATE t1
SET c3 = a.y
FROM arrays a
WHERE c2 = a.x;
See a description of unnest
function here: click
EDIT
@kordiroko Sorry. I tried out the whole day modifying your solution. Couldn't make it work.
Could be that you have an older PostgreSQL version. I tested it on version 9.5, it took me only a couple of minutes to get it work, just copy/paste and change two parameters in the query:
create table t1(
c2 BIGINT,
c3 bigint
);
insert into t1( c2, c3 )
select x, x * 100
from generate_series( 1,1000000 ) x;
CREATE OR REPLACE FUNCTION updatefunc1(BigInt[], BigInt[])
RETURNS void as $$
BEGIN
FOR i IN array_lower($1, 1) .. array_upper($1, 1)
LOOP
update t1
SET c3 = $2[i]
WHERE c2 = $1[i];
END LOOP;
END;
$$
LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION updatefunc2(BigInt[], BigInt[])
RETURNS void as $$
BEGIN
WITH arrays AS(
SELECT * from
unnest( $1, $2 ) as xy(x,y)
)
UPDATE t1
SET c3 = a.y
FROM arrays a
WHERE c2 = a.x;
END;
$$
LANGUAGE plpgsql;
select updatefunc1(ARRAY[20], ARRAY[275]);
select updatefunc2(ARRAY[30], ARRAY[555]);
select * from t1 where c2 in (20,30);
Let me know if this is correct or there is a better solution.
It is very correct but ... it's a little slow.
I tested your function for only 100 records:
select updatefunc1(
array( select * from generate_series(1,100)),
array( select 22222 from generate_series(1,100))
);
it took over 12 seconds:
Result (cost=20.00..20.31 rows=1 width=0) (actual time=12259.095..12259.096 rows=1 loops=1) Output: updatefunc1(($0)::bigint[], ($1)::bigint[]) InitPlan 1 (returns $0)
Now compare it to my function, but for 100.000 records:
select updatefunc2(
array( select * from generate_series(1,100000)),
array( select 22222 from generate_series(1,100000))
);
a result is 1 second 150 ms:
Result (cost=20.00..20.31 rows=1 width=0) (actual time=1150.018..1150.123 rows=1 loops=1) Output: updatefunc2(($0)::bigint[], ($1)::bigint[]) InitPlan 1 (returns $0)
The above results mean, that your function is :
( 12 / 100 ) / ( 1.150 / 100000 ) = 10434,78
times slooooooooooooooooooooooooooweeeeeeeeeeeeeeeeeeeeeeer,
and in % this is only 1043400 % slower
EDIT 2
My version is 9.2.15. It throws up syntax errors
Below is a version that should work on earlier verions of PostgreSQL:
CREATE OR REPLACE FUNCTION updatefunc3(BigInt[], BigInt[])
RETURNS void as $$
BEGIN
WITH arrays AS(
SELECT arr1[ rn ] as x, arr2[ rn ] as y
FROM (
SELECT $1 as arr1, $2 as arr2, generate_subscripts($1, 1) As rn
) x
)
UPDATE t1
SET c3 = a.y
FROM arrays a
WHERE c2 = a.x;
END;
$$
LANGUAGE plpgsql;
select updatefunc3(ARRAY[40,82,77], ARRAY[333,654]);
select * from t1 where c2 in (40,82,77);
And a speed test of uptadint 100,000 rows is:
select updatefunc3(
array( select * from generate_series(1,100000)),
array( select 22222 from generate_series(1,100000))
);
Result (cost=20.00..20.31 rows=1 width=0) (actual time=1361.358..1361.460 rows=1 loops=1) Output: updatefunc3(($0)::bigint[], ($1)::bigint[]) InitPlan 1 (returns $0)
The time of updating 100k rows is below 1.5 seconds
EDIT 3
@kordiko : Could you please tell me why your query is so much better. My function goes through each row and updates the elements one by one. Your function also appears to do the same. Is it that all the equivalent rows are updated ed simultaneously in your query.
This is because my function runs only one update command regardless of number of elements in arrays, while your function updates elements one by one - for 100 elements it runs 100 update commands. For 1000 elements it runs 1000 update commands.
I've done my tests on a table with 1000000 rows, but without any index. In my function the update reads the table content only once (doing a full table scan), and update matching row. Your function performs 100 updates, and each one does a full table scan.
If you create and index on col2
, then the speed of your function inceases dramatically, see the below test (please note that a number of elements in this test is increased from 100 to 100000:
create INDEX t1_c2_ix on t1( c2 );
select updatefunc1(
array( select * from generate_series(1,100000)),
array( select 22222 from generate_series(1,100000))
);
Result (cost=20.00..20.31 rows=1 width=0) (actual time=**3430.536**..3430.636 rows=1 loops=1)
Output: updatefunc1(($0)::bigint[], ($1)::bigint[])
InitPlan 1 (returns $0)
Now a time is only about 3,5 second.
And a test of my function after creating the index:
select updatefunc3(
array( select * from generate_series(1,100000)),
array( select 22222 from generate_series(1,100000))
);
Result (cost=20.00..20.31 rows=1 width=0) (actual time=1270.619..1270.724 rows=1 loops=1) Output: updatefunc3(($0)::bigint[], ($1)::bigint[]) InitPlan 1 (returns $0)
The time remains the same, but is still 100% faster than your function.
Upvotes: 1
Reputation: 1083
My answer:
CREATE OR REPLACE FUNCTION s.updatefunc1(BigInt[], BigInt[])
RETURNS void as $$
BEGIN
FOR i IN array_lower($1, 1) .. array_upper($1, 1)
LOOP
update s.t1
SET c3 = $2[i]
WHERE c2 = $1[i];
END LOOP;
END;
$$
LANGUAGE plpgsql;
select s.updatefunc1(ARRAY[20], ARRAY[275]);
This does work. I get my desired answer:
SELECT c2, c3 FROM s.t1;
c2 | c3
----+-----
10 | 100
30 | 300
40 | 400
20 | 275 --> Updated
Let me know if this is correct or there is a better solution.
Upvotes: 0