RGPT
RGPT

Reputation: 574

Optimizing SQL function reading multiple tables (remove UNION ALL)

We have a function in SQL (postgresql) that needs to be reading units from multiple week tables (up to a maximum of units and a maximum of weeks).

The units are in several tables (similar) that are defined in another table. Each table has > 22,000,000 units.

In several cases we will need to search several tables to have enough units for what we need, but in some other we might only need one.

Basically we need a large sample from the most recent weeks for a given store.

We currently have:

CREATE OR REPLACE FUNCTION get_units(_cursor REFCURSOR, 
                                    _obs_time timestamp without time zone, 
                                    _unit_store integer, _unit_limit integer, 
                                    _week_limit integer) RETURNS REFCURSOR
    LANGUAGE plpgsql
    AS $BODY$
DECLARE
    _week_cursor REFCURSOR; 
    _table_name TEXT;
    _query_sql TEXT;
    _command TEXT := '(SELECT ''0001-01-01'' AS obs_time,
                              0::smallint detail
                       WHERE FALSE)';
    _week_count INTEGER;
    _result_count INTEGER := 0;
    _current_unit_limit INTEGER;
BEGIN

    OPEN _week_cursor FOR
        SELECT table_name 
          FROM week_table 
         WHERE create_time < _obs_time 
      ORDER BY create_time desc
         LIMIT _week_limit;

    _current_unit_limit := _unit_limit;  

    LOOP
        FETCH _week_cursor INTO _table_name;
        EXIT WHEN NOT FOUND;

        _query_sql := 
         'FROM ' || _table_name || ' u
         WHERE u.unit_store = ' || _unit_store || ' ';

        EXECUTE 'SELECT count(*) ' || _query_sql || ' LIMIT ' || _current_unit_limit INTO _week_count;

        _result_count := _result_count + _week_count;
        _current_unit_limit := _unit_limit - _result_count;

        IF _week_count > 0 THEN
            _command := _command || ' UNION ALL 
                        (SELECT u.obs_time obs_time,
                             u.detail detail '
                         || _query_sql
                         || ' ORDER BY u.obs_time DESC'
                         || ' LIMIT ' || _week_count || ')';
        END IF;
        IF (_result_count >= _unit_limit) THEN
            EXIT;
        END IF;
    END LOOP;
    CLOSE _week_cursor;
    OPEN _cursor FOR EXECUTE _command;

    RETURN _cursor;
END;
$BODY$;

The results are good in most cases but in some of the worst case scenarios (several weeks with only few records each) we things are slower.

Any suggestion one what can be improved and how? For example, removing the UNION ALL, temporary table? Something that would make a significant change.

Upvotes: 2

Views: 174

Answers (1)

haki
haki

Reputation: 9759

I set up an example based on my comment

declare
   type tt is table of t1%rowtype;
   tt_arr_temp tt;
   tt_arr tt := tt();
   c sys_refcursor;
   cursor c_tables is 
    select  table_name 
    from    user_tables 
    where   table_name in ('T1','T2');
begin
    for x in c_tables loop
        open c for 'select * from '||x.table_name;
        fetch c bulk collect into tt_arr_temp limit 3;
        close c;
        -- union collections
        tt_arr :=  tt_arr multiset union tt_arr_temp;
    end loop;

    -- print -- just for debug
    for i in tt_arr.first .. tt_arr.last loop
        dbms_output.put_line(tt_arr(i).id || ' , ' || tt_arr(i).name);
    end loop;
exception when others then 
    close c;
end;
/

the result will be in tt_arr. you can do what ever you want with it.

This is my output

SQL> select * from t1;

        ID N
---------- -
         1 A
         2 B
         3 C
         4 D
         5 E
         6 F

6 rows selected.

SQL> select * from t2;

        ID N
---------- -
        20 B
        30 C

SQL>
SQL> declare
  2     type tt is table of t1%rowtype;
....
 25  end;
 26  /
1 , A
2 , B
3 , C
20 , B
30 , C

PL/SQL procedure successfully completed.

Upvotes: 1

Related Questions