Purushottam Nawale
Purushottam Nawale

Reputation: 477

How to Split a PostgreSQL Table into Partitions by a Nullable Column Without Using INSERT INTO?

I have a table source_aepsfinswitch_tmp_details containing a very large dataset. My goal is to partition this table into two partitions based on the reconciliation_date_time column:

  1. If reconciliation_date_time is NULL, the row should go to one partition.
  2. If reconciliation_date_time is NOT NULL, the row should go to another partition.

Due to the huge size of the data, I want to avoid using INSERT INTO statements to move the data manually. Instead, I tried the following approach:

BEGIN;
CREATE TABLE IF NOT EXISTS public.source_aepsfinswitch_tmp_details_new (
    LIKE public.source_aepsfinswitch_tmp_details EXCLUDING CONSTRAINTS,
    PRIMARY KEY (uniqueid, reconciliation_date_time)
) PARTITION BY RANGE (reconciliation_date_time);

ALTER TABLE public.source_aepsfinswitch_tmp_details RENAME TO source_aepsfinswitch_tmp_details_old;
ALTER TABLE public.source_aepsfinswitch_tmp_details_new RENAME TO source_aepsfinswitch_tmp_details;
ALTER TABLE public.source_aepsfinswitch_tmp_details ATTACH PARTITION public.source_aepsfinswitch_tmp_details_old DEFAULT;
COMMIT;

However, when I run this, I get the following error:

SQL Error [42804]: ERROR: column "reconciliation_date_time" in child table must be marked NOT NULL

How can I achieve my goal without hitting this error? I tried both LIST and RANGE, but neither worked. Is there a way to partition the table as described without requiring the column reconciliation_date_time to be NOT NULL? Is there any way to split the table into partitions without using INSERT INTO?


Thanks to Laurenz Albe's suggestion, I was able to attach the partition. However, how can I create two tables based on the parent table? I am getting the following error when executing the query below:

CREATE TABLE IF NOT EXISTS public.source_aepsfinswitch_tmp_details_new (LIKE public.source_aepsfinswitch_tmp_details EXCLUDING CONSTRAINTS) PARTITION BY LIST (reconciliation_date_time);    
ALTER TABLE public.source_aepsfinswitch_tmp_details RENAME TO source_aepsfinswitch_tmp_details_old;
ALTER TABLE public.source_aepsfinswitch_tmp_details_new RENAME TO source_aepsfinswitch_tmp_details;
ALTER TABLE public.source_aepsfinswitch_tmp_details ATTACH PARTITION public.source_aepsfinswitch_tmp_details_old DEFAULT;
CREATE TABLE IF NOT EXISTS public.source_aepsfinswitch_tmp_details_null PARTITION OF public.source_aepsfinswitch_tmp_details FOR VALUES IN (NULL);

SQL Error [23514]: ERROR: updated partition constraint for default partition "source_aepsfinswitch_tmp_details_old" would be violated by some row

I think this issue is caused by the primary key. Creating a new partition of the same transactions as in other partitions in PostgreSQL may be causing the error mentioned above. Is there a way to handle this?

I was also getting an overlapping error suggesting that the data present in the default partition must not be present in the new partition. I came across the limitation of PostgreSQL that it does not support splitting based on NULL, NOT NULL values in Range. So I tried to use the List partitioning method and also tried to create a new partition table based on TRUE and FALSE values of the column is_processed. But I am getting the overlapping error or row violation error.

Is there any other way to split a table into two partitions based on a column value? One approach I was thinking of was to create new tables based on WHERE conditions.

Upvotes: 1

Views: 65

Answers (1)

Laurenz Albe
Laurenz Albe

Reputation: 247235

If you create the partition source_aepsfinswitch_tmp_details_null for the rows where reconciliation_date_time is NULL, the partition constraint (the boundaries) for the default partition have to be changed to exclude NULL values. But that table contains some rows where reconciliation_date_time is NULL, which is why you get the error.

Before attaching source_aepsfinswitch_tmp_details_null, you have to delete the rows with reconciliation_date_time IS NULL from the default partition and insert them into source_aepsfinswitch_tmp_details_null. That doesn't happen automatically.

Upvotes: 1

Related Questions