Reputation: 49
I have been getting persistent ROWGROUP_FLUSH deadlocks lately when inserting data into a CCI table (non-partitioned, no other indexes). Database compatibility level is 150 and uses simple recovery mode. I am inserting about 10 million rows in the transaction and the cardinality estimate looks pretty accurate:
SSMS can't render ROWGROUP_FLUSH deadlock XMLs, but here's the representation from SentryOne:
Here's a simplified version of the table structures and insert:
CREATE TABLE dbo.AllData (
Metric INT NOT NULL
,StartDate DATE NOT NULL
,Value NUMERIC(18,0)
)
CREATE CLUSTERED COLUMNSTORE INDEX cci_AllData ON dbo.AllData
CREATE TABLE dbo.Data_MostRecent (
Metric INT NOT NULL
,StartDate DATE NOT NULL
,Value NUMERIC(18,0)
)
CREATE CLUSTERED COLUMNSTORE INDEX cci_Data_MostRecent ON dbo.Data_MostRecent
--update the most recent table with the latest data
BEGIN TRANSACTION
TRUNCATE TABLE dbo.Data_MostRecent
INSERT INTO dbo.Data_MostRecent WITH (TABLOCK)
(
Metric
,StartDate
,Value
)
SELECT
AllData.Metric
,AllData.StartDate
,AllData.Value
FROM dbo.AllData
--get the last date per metric
INNER JOIN (
SELECT
Metric
,StartDate = MAX(StartDate)
FROM dbo.AllData
GROUP BY
Metric
) LastRun
ON AllData.Metric = LastRun.Metric
AND AllData.StartDate = LastRun.StartDate
COMMIT TRANSACTION
I discovered that if I remove the TABLOCK hint I can still get parallel inserts (contrary to this article) and the deadlock does not appear to occur.
2 questions:
Thanks!
Upvotes: 0
Views: 248
Reputation: 2434
(1) contrary to this article... Why does this work?
probably, since this is NOT what the article says.
"Must specify TABLOCK" is a limitation for SQL Server 2016 as explicitly mentioned in the article. I am not sure what is your full scenario in order to discuss the behavior in your case using SQL Server 2019. If you will provide full sample which we can execute (including the DDL+DML and sample data that you want to insert) then we might have better understanding.
(2) Could removing the TABLOCK hint cause other problems? I'm guessing it's just there to allow the parallel insert, but that no longer seems to be necessary.
TABLOCK has other impact on the behavior but using SQL Server 2019 latest version it is not needed for this reason probably (for parallel insert on CCI for example using bulk loading).
Note! This is an old article which you should not follow as it is. Notice that when you go the page then you get alert which explicitly inform you that using this document is not recommended
Check this document for more information:
Notice under bulk loading title, this sentence:
Unlike rowstore bulk loads into SQL Server, you don't need to specify TABLOCK
Also check the limitation under the title of use a staging table...
One key limitation was that this INSERT operation was single threaded. To load data in parallel, you could create multiple staging table or issue INSERT/SELECT with non-overlapping ranges of rows from the staging table.
Upvotes: 1