Reputation: 477
I have a table source_aepsfinswitch_tmp_details
containing a very large dataset. My goal is to partition this table into two partitions based on the reconciliation_date_time
column:
reconciliation_date_time
is NULL
, the row should go to one partition.reconciliation_date_time
is NOT NULL
, the row should go to another partition.Due to the huge size of the data, I want to avoid using INSERT INTO
statements to move the data manually. Instead, I tried the following approach:
BEGIN;
CREATE TABLE IF NOT EXISTS public.source_aepsfinswitch_tmp_details_new (
LIKE public.source_aepsfinswitch_tmp_details EXCLUDING CONSTRAINTS,
PRIMARY KEY (uniqueid, reconciliation_date_time)
) PARTITION BY RANGE (reconciliation_date_time);
ALTER TABLE public.source_aepsfinswitch_tmp_details RENAME TO source_aepsfinswitch_tmp_details_old;
ALTER TABLE public.source_aepsfinswitch_tmp_details_new RENAME TO source_aepsfinswitch_tmp_details;
ALTER TABLE public.source_aepsfinswitch_tmp_details ATTACH PARTITION public.source_aepsfinswitch_tmp_details_old DEFAULT;
COMMIT;
However, when I run this, I get the following error:
SQL Error [42804]: ERROR: column "reconciliation_date_time" in child table must be marked NOT NULL
How can I achieve my goal without hitting this error? I tried both LIST and RANGE, but neither worked. Is there a way to partition the table as described without requiring the column reconciliation_date_time
to be NOT NULL
? Is there any way to split the table into partitions without using INSERT INTO
?
Thanks to Laurenz Albe's suggestion, I was able to attach the partition. However, how can I create two tables based on the parent table? I am getting the following error when executing the query below:
CREATE TABLE IF NOT EXISTS public.source_aepsfinswitch_tmp_details_new (LIKE public.source_aepsfinswitch_tmp_details EXCLUDING CONSTRAINTS) PARTITION BY LIST (reconciliation_date_time);
ALTER TABLE public.source_aepsfinswitch_tmp_details RENAME TO source_aepsfinswitch_tmp_details_old;
ALTER TABLE public.source_aepsfinswitch_tmp_details_new RENAME TO source_aepsfinswitch_tmp_details;
ALTER TABLE public.source_aepsfinswitch_tmp_details ATTACH PARTITION public.source_aepsfinswitch_tmp_details_old DEFAULT;
CREATE TABLE IF NOT EXISTS public.source_aepsfinswitch_tmp_details_null PARTITION OF public.source_aepsfinswitch_tmp_details FOR VALUES IN (NULL);
SQL Error [23514]: ERROR: updated partition constraint for default partition "source_aepsfinswitch_tmp_details_old" would be violated by some row
I think this issue is caused by the primary key. Creating a new partition of the same transactions as in other partitions in PostgreSQL may be causing the error mentioned above. Is there a way to handle this?
I was also getting an overlapping error suggesting that the data present in the default partition must not be present in the new partition. I came across the limitation of PostgreSQL that it does not support splitting based on NULL, NOT NULL values in Range. So I tried to use the List partitioning method and also tried to create a new partition table based on TRUE and FALSE values of the column is_processed
. But I am getting the overlapping error or row violation error.
Is there any other way to split a table into two partitions based on a column value? One approach I was thinking of was to create new tables based on WHERE conditions.
Upvotes: 1
Views: 65
Reputation: 247235
If you create the partition source_aepsfinswitch_tmp_details_null
for the rows where reconciliation_date_time
is NULL, the partition constraint (the boundaries) for the default partition have to be changed to exclude NULL values. But that table contains some rows where reconciliation_date_time
is NULL, which is why you get the error.
Before attaching source_aepsfinswitch_tmp_details_null
, you have to delete the rows with reconciliation_date_time IS NULL
from the default partition and insert them into source_aepsfinswitch_tmp_details_null
. That doesn't happen automatically.
Upvotes: 1