DiStas
DiStas

Reputation: 21

postgresql cursor is slow on update

As a short foreword, I'm new to postgresql. Further, the postgresql version I need the advice on is 8.1. The reason for that is postgresql 8.1 is the last implemented and supported version of this language by ParAccel.

Postgresql cursor, at least in 8.1, is very slow with a DML operation, such as UPDATE or INSERT (haven't tested DELETE, yet assume it would be the same). This is just an example to demonstrate:

create table tab_cur_DML_test (col_key int,col_dml varchar(50));

populate it with some records from some table:

insert into tab_cur_DML_test (col_key, col_dml)
select card_id, card_no
from card_dim;

tab_cur_DML_test now has few thousand records with two fields only

create or replace function fn_cursor_DML_test() returns void as
$body$
declare
    v_col_key                   card_dim.card_id%type;
    v_col_dml                   card_dim.card_no%type;
    cur                     refcursor;
    v_count_proccessed_recs         bigint := 0 ;
begin
    open cur for select card_id, card_no from card_dim;
    loop
        fetch cur into v_col_key, v_col_dml;
        if not found then exit; end if;

        update tab_cur_DML_test
        set col_dml = v_col_dml
        where col_key = v_col_key;

        v_count_proccessed_recs := v_count_proccessed_recs + 1;

        if v_count_proccessed_recs%10 = 0 then
                raise info '%', v_count_proccessed_recs;
        end if;
    end loop;

end;
$body$
language plpgsql volatile;

After running it:

select * from fn_cursor_DML_test();

the speed comes out to be about one thousands records per 30 seconds.

Again, this is just a plain update which can be done as a set-based operation. I used it here just for simulating the row-by-row processing with the cursor. In a similar real task situation when a row-by-row processing is required, in situations when using plain sql won't do or, otherwise, will be too bulky and/or complex, the use of cursor with such low processing speed simply becomes not a viable option.

I suspect this happens due to the context switch in the database engine. My question is are there any possible workarounds (or just some certain way) to significantly improve row-by-row logic inside postgresql 8.1 cursor, if that matters - in ParAccel (v. 4.0) ?

Thank you!

Stanislav

Upvotes: 2

Views: 1323

Answers (1)

kirylm
kirylm

Reputation: 101

ParAccel is not PostgreSQL. These are two different products with different features designed for different purposes. ParAccel is based on PostgreSQL, but it utilizes columnar storage, MPP and optimizer was completely re-written. It just happened that they kept PL extension (most likely for reason to ease orchestration of ETL in some cases), but they dropped its support. Other competitors like Vertica - do not even have pgPL extension in their products.

The symptom that you are experiencing has nothing to do with PostgreSQL. This is the precise reason why ParAccel does not support pgPL/SQL parser. The database was not designed for row-by-row processing, because its columnar. Single updates/inserts/deletes will take up as much resources as you would perform the same operation on millions of rows. Why do you need to use PL here? Just run an update. ...and read more about columnar databases. If you need to run row-by-row operations and you don't have much choice, ParAccel is not the right product for your need.

http://en.wikipedia.org/wiki/Column-oriented_DBMS
http://www.paraccel.com/resources/resources-2.php#.UdNv0m024V0

Upvotes: 2

Related Questions