Reputation: 77
I would like to delete all rows in a table where the batchId (a running number) older than the previous two. I could probably do this in a SQL Database with the query:
WITH CTE AS(
SELECT
*,
DENSE_RANK() OVER(ORDER BY BATCHID DESC) AS RN
FROM MyTable
)
DELETE FROM CTE WHERE RN>2
But the same is not allowed in a SQL Data Warehouse per this. Looking for alternatives here.
Upvotes: 2
Views: 3711
Reputation: 14379
Azure SQL Data Warehouse only supports a limited T-SQL surface area and CTEs for DELETE
operations and DELETEs
with FROM
clauses which will yield the following error:
Msg 100029, Level 16, State 1, Line 1
A FROM clause is currently not supported in a DELETE statement.
It does however support sub-queries so one way to write your statement like this:
DELETE dbo.MyTable
WHERE BATCHID Not In ( SELECT TOP 2 BATCHID FROM dbo.MyTable ORDER BY BATCHID DESC );
This syntax is supported in Azure SQL Data Warehouse and I have tested it. I'm not sure how efficient it will be on billions of rows though. You could also consider partition switching.
If you are deleting a large portion of your table then it might make sense to use a CTAS to put the data you want to keep into a new table, eg something like this:
-- Keep the most recent two BATCHIDS
CREATE TABLE dbo.MyTable2
WITH
(
CLUSTERED COLUMNSTORE INDEX,
DISTRIBUTION = HASH( BATCHID )
-- Add partition scheme here if required
)
AS
SELECT *
FROM dbo.MyTable
WHERE BATCHID In ( SELECT TOP 2 BATCHID FROM dbo.MyTable ORDER BY BATCHID DESC )
OPTION ( LABEL = 'CTAS : Keep top two BATCHIDs' );
GO
-- Rename or DROP old table
RENAME OBJECT dbo.MyTable TO MyTable_Old;
RENAME OBJECT dbo.MyTable2 TO MyTable;
GO
-- Optionally DROP MyTable_Old if everything has been successful
-- DROP TABLE MyTable_Old
This technique is described in more detail here.
Upvotes: 1
Reputation: 1269873
You can try:
delete t from mytable t
where batchId < (select max(batchid) from mytable);
Oh, if you want to keep two, perhaps this will work:
delete t from mytable t
where batchId < (select batchid
from mytable
group by batchid
limit 1 offset 1
);
Upvotes: 0
Reputation: 37473
You can try using JOIN
delete d from MyTable d
join
(
SELECT
*,
RN = ROW_NUMBER() OVER(PARTITION BY BATCH_ID ORDER BY BATCH_ID DESC)
FROM MyTable
)A on d.batch_id=A.batch_id where RN >2
Upvotes: 1