Reputation: 69
I'm new to redshift and facing 'D': 'Serializable isolation violation on table error. This is occurring during the data archive process time frame(mid night load) and impacting the regular batch job data load for same table.
I've written below the archive load process. Is there any best way to avoid such error in my below script?
Can you help? Thanks in advance.
INSERT INTO abc.data_arch
( col1,col2......
)
With cte_365days_older
AS
(
Select * from abc.abc_data_365
WHERE abc_ts < dateadd(day, -365, current_date)
--LIMIT 1
)
SELECT
col1,col2......
FROM cte_365days_older src
WHERE not exists (select 1 from abc.data_arch trgt
where src.abc_hkey = trgt.abc_hkey
)
order by col1,col2
;
DELETE
from abc.abc_data_365
WHERE abc_ts < dateadd(day, -365, current_date);
COMMIT;
VACUUM DELETE ONLY abc.abc_data_365;
COMMIT;
Error:-
'D': 'Serializable isolation violation on table - 342561, transactions forming the cycle are: 2234341, 2034548 (pid:3235)'
Upvotes: 0
Views: 297
Reputation: 909
OK, with the regular batch job data running at the same time, it looks like the conflict is between the transaction which is doing the INSERT, and the VACUUM command running in parallel in another session (as per Max's answer).
Regarding the data architecture, DELETE is a relatively expensive operation, so to avoid the error you might consider an alternative approach:
abc_data_365
table, which only contains the most
recent 365 days of data.Then let your end-users SELECT from the VIEW, not the base table.
Something like this:
CREATE VIEW v_abc_data_365 AS
SELECT * FROM abc_data_365
WHERE abc_ts > DATEADD(DAY, -365, current_date);
Upvotes: 0