Reputation: 419
TL;DR: I'm trying to partition a table so that each partition can be operated on without locking the others, but it's not working.
Background: I'm building a data pipeline that exports data from a central corporate database to a SQL Server DB, where it can be used for analytics. I need to export the full table each day (currently deleting the old data; archiving comes later), and it can be almost 10GB.
To make it fast enough to be usable, I break the table up into 15 chunks based on a "plan" column. This is done in the data pipeline so each of the 15 possible values in the "plan" column is treated independently; each plan chunk is pulled from the corporate DB, after which old data for that chunk is deleted in the SQL Server table and the new data is uploaded.
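For illustration, the per-plan refresh on the SQL Server side currently amounts to something like this (table and column names are hypothetical, not my real schema):

```sql
-- Current per-plan refresh pattern (sketch; dbo.ExportData, PlanCode and
-- dbo.ExportData_Incoming are hypothetical names)
DECLARE @plan varchar(10) = 'PLAN_03';

-- Delete yesterday's rows for this plan only
DELETE FROM dbo.ExportData
WHERE  PlanCode = @plan;

-- Bulk insert the freshly extracted rows for the same plan
INSERT INTO dbo.ExportData (PlanCode, RecordId, Amount)
SELECT PlanCode, RecordId, Amount
FROM   dbo.ExportData_Incoming
WHERE  PlanCode = @plan;
```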
The problem is, doing this in 15 chunks means all the processes block one another. Everything has to wait on everything else, making it all go VERY slow in the end, and often fail due to timeouts.
To address this, I've partitioned the destination table on the "plan" column.
According to the answer here, this should allow me to "drop or add lots of data in one gulp".
This doesn't seem to be helping though; I still get blocks.
So, two questions: why am I still seeing blocking despite the partitioning, and what can I do to make the 15 chunks truly independent?
Alternative: I've also considered breaking the destination table up into 15 separate tables and recombining them with a view. Would that be better? Worse?
Upvotes: 1
Views: 3939
Reputation: 453897
I've encountered similar issues in the past where concurrent workers each process rows in a different physical partition but still block each other.
One reason this can happen is if one of the workers manages to escalate its lock to table level. You would need to examine the waiting tasks during a period of blocking to see whether this applies to you.
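If it's unclear whether that is happening, a query along these lines (run while the jobs are blocked; requires VIEW SERVER STATE permission) shows who is waiting on whom and on what resource:

```sql
-- Show sessions that are currently blocked, the session blocking them,
-- and the resource being waited on; an "objectlock" in
-- resource_description suggests a table-level lock is involved
SELECT wt.session_id,
       wt.blocking_session_id,
       wt.wait_duration_ms,
       wt.wait_type,
       wt.resource_description
FROM   sys.dm_os_waiting_tasks AS wt
WHERE  wt.blocking_session_id IS NOT NULL;
```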
SQL Server does support lock escalation to partition level, but this is not the default (as it can increase the probability of deadlocks in some scenarios).
To enable this option you need to set LOCK_ESCALATION = AUTO.
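For example (assuming the destination table is dbo.ExportData; hypothetical name):

```sql
-- Allow the Database Engine to escalate to the partition (HoBT) level
-- instead of always going straight to the whole table
ALTER TABLE dbo.ExportData SET (LOCK_ESCALATION = AUTO);
```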
If the work you are doing replaces all the rows in a partition, then you should consider Charlieface's comment: do the load into a new, empty table per partition, use TRUNCATE TABLE ... WITH (PARTITIONS ()) or ALTER TABLE ... SWITCH to clear the old data, and ALTER TABLE ... SWITCH to bring in the new data (both should generally be very quick operations). This would mean the inserts are truly independent, and this pattern may well be more beneficial w.r.t. minimal logging for the insert.
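A minimal sketch of that pattern, assuming the destination dbo.ExportData is partitioned on the "plan" column, a partition-aligned staging table dbo.ExportData_Stage exists (same columns, indexes, and partition scheme), and partition 3 holds the plan being reloaded (all names and numbers hypothetical; partition-level TRUNCATE requires SQL Server 2016+):

```sql
-- 1) Drop the old rows for this plan only; a metadata operation, near-instant
TRUNCATE TABLE dbo.ExportData WITH (PARTITIONS (3));

-- 2) Bulk load the fresh extract into the empty staging table; TABLOCK
--    makes minimal logging possible under the right recovery model
INSERT INTO dbo.ExportData_Stage WITH (TABLOCK)
SELECT PlanCode, RecordId, Amount          -- hypothetical columns
FROM   dbo.ExportData_Incoming
WHERE  PlanCode = 'PLAN_03';

-- 3) Swap the loaded partition into the destination; also metadata-only
ALTER TABLE dbo.ExportData_Stage
    SWITCH PARTITION 3 TO dbo.ExportData PARTITION 3;
```

Because each worker only ever touches its own partition (and its own staging table), there is no shared data for the workers to block on.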
Upvotes: 3