Sneha
Sneha

Reputation: 45

How to automate the subpartition creation for future dates on existing partition table

how to create a Unix job/ PL/SQL code which will automate the creation of sub partitions for future dates on a existing partitioned table on half yearly or yearly basis?

I have a table which is list partitioned on source system column and range sub partitioned on create date column. Every time data comes in for future dates, we have to go and create sub partition manually for those dates then data will get loaded, if sub partition doesn't exist the load is failing. I want to automate the process of sub partition creation on this table by creating a Unix job. This job shall create the partitions for future 6 months at start from where ever the table is currently having sub partition. Later on every quarterly cycle it should create sub partition for next 6 months automatically.

Your help is much appreciated. Thank you.

Code:

create table transactions (
  run_date date,
  source_id number,
  source_name VARCHAR2(10),
  region VARCHAR2(3)
)
partition by list (source_name)
subpartition by range (run_date)
subpartition template
(
  subpartition SP1 values less than (to_date ('2020-01-02','YYYY-MM-DD')),
  subpartition SP2 values less than (to_date ('2020-02-02','YYYY-MM-DD')),
  subpartition SP3 values less than (to_date ('2020-03-02','YYYY-MM-DD'))
)
(
  partition P_CA values ('CA')
  (
    subpartition CA_SP1 values less than (to_date ('2020-01-02','YYYY-MM-DD')),
    subpartition CA_SP2 values less than (to_date ('2020-02-02','YYYY-MM-DD')),
    subpartition CA_SP3 values less than (to_date ('2020-03-02','YYYY-MM-DD'))
  ),
  partition P_SP values ('SP')
  (
    subpartition SP_SP1 values less than (to_date ('2020-01-02','YYYY-MM-DD')),
    subpartition SP_SP2 values less than (to_date ('2020-02-02','YYYY-MM-DD')),
    subpartition SP_SP3 values less than (to_date ('2020-03-02','YYYY-MM-DD'))
  )
);

create index idx1 on transactions (source_name);
create index idx2 on transactions (run_date);

Upvotes: 0

Views: 3363

Answers (3)

Wernfried Domscheit
Wernfried Domscheit

Reputation: 59476

I would recommend to do it the other way around, i.e. INTERVAL partition by run_date and subpartition by source_name. Would be this:

create table transactions (
  run_date date,
  source_id number,
  source_name VARCHAR2(10),
  region VARCHAR2(3)
)
PARTITION BY RANGE (run_date) INTERVAL (INTERVAL '1' MONTH)
    SUBPARTITION BY LIST (source_name)
        SUBPARTITION TEMPLATE (
        SUBPARTITION P_CA VALUES ('CA'),
        SUBPARTITION P_SP VALUES ('SP'),
        SUBPARTITION P_OTHERS VALUES (DEFAULT )
        )
(PARTITION VALUES LESS THAN ( DATE '2020-01-02') );

For this table you don't have to write any script/job, Oracle creates partitions and subpartitions automatically while INSERT whenever needed.

I think LOCAL indexes would be better (honestly I don't have the default in my head, so I specify it explicitly):

create index idx1 on transactions (source_name) LOCAL;
create index idx2 on transactions (run_date) LOCAL;

Upvotes: 1

KayaNatsumi
KayaNatsumi

Reputation: 414

Automation of some task is quite large request but below I managed to prepare a procedure that is extending partition subpartitions based on their values stored in DB and can be called at anytime to add new subpartitions that will follow dates and IDs of already existing subpartitions. Now this was prepared on Oracle SQL Live and might need a little adjustment on your database, but this should provide you with enough code to start poking around. At the end there is a call to this created procedure with number of subpartitions that you want to add, this should work until you reach some Oracle limit on partition number or any other limit that may apply.

Now lets get to what this monstrosity does. Basically script checks table user_tab_subpartitions and finds existing subpartitions from which it detects what is the largest number of the subpartition name and extracts last date that was used in this subpartition, next it adds one to the subpartition number and one month to last date plus number of cycles. For testing purposes comment out execute immediate v_sql; to see in output what commands are generated without actually executing them.

Be advised that creating too many subpartitions may in time lead to worse performance. Also never run any code that you don't fully understand in production as this might not work for you as expected!

create or replace procedure add_subpartitions(p_add_sp_count number) is
  c_table_name constant varchar2(250) := 'TRANSACTIONS';
  c_ca_sp_name constant varchar2(10) := 'CA_SP';
  c_sp_sp_name constant varchar2(10) := 'SP_SP';
  c_date_format constant varchar2(10) := 'YYYY-MM-DD';
  v_high_value long;
  v_high_value_char varchar2(4000);
  v_high_value_date date;
  v_sql varchar2(32000);
begin
  -- Unless we are adding subpartitions there is nothing to do
  if p_add_sp_count <= 0 then
    return;
  end if;
  --
  -- Locate existing subpartitions and provide us with their name without number and largest number in name
  -- each partition will have one exact entry about largest subpartititon
  for i in (select table_name,
                   partition_name,
                   partition_position,
                   case 
                     when subpartition_name like c_ca_sp_name||'%'
                       then c_ca_sp_name
                     when subpartition_name like c_sp_sp_name||'%'
                       then c_sp_sp_name
                   end subpartition_name,
                   max(to_number(replace(replace(subpartition_name, c_ca_sp_name, null), c_sp_sp_name, null))) largest_sp_id
              from user_tab_subpartitions
             where table_name = c_table_name
             group by table_name, partition_name, partition_position, case 
                                                                        when subpartition_name like c_ca_sp_name||'%'
                                                                          then c_ca_sp_name
                                                                        when subpartition_name like c_sp_sp_name||'%'
                                                                          then c_sp_sp_name
                                                                      end
             order by partition_position asc) loop
     -- Loop to generate as many subpartitions as we need per partition
     for loop_index in 1 .. p_add_sp_count loop
       -- Get value of subpartition as it is LONG we cannot manipulate its value in parent loop due to group by
       select high_value
         into v_high_value
         from user_tab_subpartitions
        where table_name = i.table_name
          and partition_name = i.partition_name
          and subpartition_name = i.subpartition_name||i.largest_sp_id;
       -- Some mumbo jumbo with LONG data type to get it into VARCHAR2 and get its month value
       v_high_value_char := substr(v_high_value, 1, 4000);
       v_high_value_char := regexp_replace(v_high_value_char, '.+('||regexp_replace(c_date_format, '[YMD]', '[0-9]')||').+', '\1');
       v_high_value_date := to_date(v_high_value_char, c_date_format);
       -- Generate SQL query
       v_sql := 'alter table '||i.table_name||' modify partition '||i.partition_name||' add subpartition '||i.subpartition_name||(i.largest_sp_id + loop_index)||' values less than (to_date('''||to_char((v_high_value_date + (interval '1' month * loop_index)), c_date_format)||''','''||c_date_format||'''))';
       -- Execute SQL query
       dbms_output.put_line(v_sql||';');
       execute immediate v_sql;
     end loop;
   end loop;
end;

And call it as:

begin 
  add_subpartitions(6);
end;
/

Last call can be scheduled in lets say crontab with call like:

echo -e "begin add_subpartitions(6); end;\n/" | sqlplus -S <LOGIN_INFO>

or:

sqlplus -S <LOGIN_INFO> @<path to script>

Upvotes: 0

Chris Saxon
Chris Saxon

Reputation: 9825

You can create a procedure that loops through the partitions, adding subpartitions as needed.

This adds subpartitions with dates ending 6 months from now by adding 6 months to sysdate and using this to generate the upper bounds:

create table transactions (
  run_date date,
  source_id number,
  source_name VARCHAR2(10),
  region VARCHAR2(3)
)
partition by list (source_name)
subpartition by range (run_date)
subpartition template
(
  subpartition SP1 values less than (to_date ('2020-01-02','YYYY-MM-DD')),
  subpartition SP2 values less than (to_date ('2020-02-02','YYYY-MM-DD')),
  subpartition SP3 values less than (to_date ('2020-03-02','YYYY-MM-DD'))
)
(
  partition P_CA values ('CA'),
  partition P_SP values ('SP')
);

insert into transactions 
  values ( date'2020-07-15' , 1, 'CA', 'REG');
  
ORA-14400: inserted partition key does not map to any partition
  
declare
  year_month varchar2(10);
  stmt varchar2(1000);
begin
  year_month := to_char ( add_months ( sysdate, 6 ), 'YYYYMM' );
  
  for ps in (
    select * from user_tab_partitions
    where  table_name = 'TRANSACTIONS'
  ) loop
  
    stmt := '
      alter table transactions
        modify partition ' || ps.partition_name || '
        add subpartition sp' || ps.partition_name || '_' || year_month || q'!
        values less than ( 
          to_date ( '!' || year_month || q'!', 'YYYYMM' ) 
        )!';
        
    dbms_output.put_line ( stmt );
    
    execute immediate stmt;
  
  end loop;
  
end;
/

insert into transactions 
  values ( date'2020-07-15' , 1, 'CA', 'REG');

select partition_name, subpartition_name
from   user_tab_subpartitions
where  table_name = 'TRANSACTIONS';

PARTITION_NAME    SUBPARTITION_NAME   
P_CA              P_CA_SP1             
P_CA              P_CA_SP2             
P_CA              P_CA_SP3             
P_CA              SPP_CA_202101        
P_SP              SPP_SP_202101        
P_SP              P_SP_SP1             
P_SP              P_SP_SP2             
P_SP              P_SP_SP3

Depending on how your load works, there's still the possibility you'll miss dates. To avoid this, add a maxvalue subpartition as a catch-all. Then change the above to a split subpartition operation.

Or - if possible - you can make this much easier by flipping the partition/subpartition columns.

This allows you to use interval partitioning, which handles new partitions for you:

drop table transactions
  cascade constraints purge;
  
create table transactions (
  run_date date,
  source_id number,
  source_name VARCHAR2(10),
  region VARCHAR2(3)
)
partition by range (run_date)
interval ( interval '1' month )
subpartition by list (source_name)
subpartition template
(
  subpartition P_CA values ('CA'),
  subpartition P_SP values ('SP')
)
(
  partition P1 values less than (to_date ('2020-01-02','YYYY-MM-DD'))
);

insert into transactions 
  values ( date'2020-07-15' , 1, 'CA', 'REG');
  
select partition_name, subpartition_name
from   user_tab_subpartitions
where  table_name = 'TRANSACTIONS';

PARTITION_NAME    SUBPARTITION_NAME   
P1                P1_P_CA              
P1                P1_P_SP              
SYS_P1748         SYS_SUBP1747         
SYS_P1748         SYS_SUBP1746  

You'll only need to update the template when you get new source_name values. Again, you can add a default partition as a catch-all to avoid missing any values.

Upvotes: 1

Related Questions