Reputation: 411
I have a cronjob that is kicked off every night that involves building up a temporary table, dropping the current table on Redshift, and swapping in the temporary table for the old one. More than half of the time, this specific job gets stuck when dropping the existing table and behaving as if there is some pending transaction that is stopping the drop from going through.
This is just one of dozens of jobs that uses the exact same script to run overnight, none of which have ever had this issue; however, there are a few minor differences:
In addition to the fact that I have never seen this on any other job, this issue has been extremely difficult to troubleshoot for the following reasons:
ubuntu
whereas the cronjob is executed from root
.drop
to stall; I've looked high and low on Stack Overflow (this is the most applicable question with answers - redshift drop or truncate table very very slow), the Redshift docs, and otherwise, but nothing I've found has been the answer. When I see that the job is stalled, I've checked the following tables on Redshift and usually find that things are in the following state:
stv_locks
table shows that that there are three processes running, with the lock_status
of "Holding write lock," "Holding delete lock," and "Holding insert lock" respectively. The process ID associated with these is NOT the ID related to the current job.stv_tr_conflict
table shows nothing.stv_recents
table shows the drop
with a status of Running
.svl_qlog
as finished, so that seems to contradict the stv_locks
table.pg_terminate_backend
to stop the associated process does not actually remove the session when querying stv_sessions
, but DOES free up something that allows the job to finish.Any help in figuring out what exactly is going on here would be greatly appreciated!
Upvotes: 2
Views: 3925
Reputation: 339
I faced the same problem, I just rebot RS then it works again normally.
Upvotes: 2