Reputation: 1293
I have a table, Core_Faculty
with 4 fields: ID (PK - INT), InstitutionID (INT), PersonID (INT), DeprecatedDate (SMALLDATETIME)
What I'd like to do is delete all the records for institution/person combinations that have both deprecated records and non-deprecated (DeprecatedDate IS NULL) record, but keep the non-deprecated record.
If an institution/person combination has only just one record (whether deprecated or not), I'd like to keep those and leave them alone. I'm only considering records that have both DeprecatedDate IS NULL and Deprecated IS NOT NULL for each unique institution/person combination.
End goal is to be left with one record per institution/person combination whether deprecated or not, but giving priority to the record that has a NULL deprecated date. These are the good, live records. However, if we are starting with only one record and it's deprecated, go ahead and keep it.
The database currently only can potentially have one of each as institution/person/deprecateddate is a unique key on the table.
How would I go about solving this, and what methods can I use to find the appropriate records, while only considering records that have both deprecated and non-deprecated values for the combination?
Upvotes: 0
Views: 51
Reputation: 2908
DELETE f
FROM
Core_Faculty f
INNER JOIN
(
SELECT *,
ROW_NUMBER() OVER (
PARTITION BY
f.InstitutionID,
f.PersonID
ORDER BY
CASE
WHEN f.DeprecatedDate IS NULL THEN 1
ELSE 2
END,
f.DeprecatedDate
) RowNum
FROM
Core_Faculty f
) d ON
f.ID = d.ID
WHERE
d.RowNum > 1;
Upvotes: 1
Reputation: 11
In SQL Server you can use a common table expression with a ROW_NUMBER function to identify the rows you want to keep:
WITH cte AS (
SELECT [ID]
,[InstitutionID]
,[PersonID]
,[DeprecatedDate]
,ROW_NUMBER() OVER (PARTITION BY [InstitutionID], [PersonID]
ORDER BY [DeprecatedDate] DESC) as [RowNumber]
FROM [Blog].[dbo].[Core_Faculty]
)
SELECT [ID]
,[InstitutionID]
,[PersonID]
,[DeprecatedDate]
,[RowNumber]
FROM cte
--WHERE [RowNumber] = 1
The ORDER BY [DeprecatedDate] DESC
part will make ensure the latest record is the 1st row in the [InstitutionID], [PersonID]
grouping. If there is only one row, even if it is a null, it will be kept since it is the 1st row in the grouping.
You can then use
DELETE
FROM cte
WHERE [RowNumber] > 1
instead of the select to remove the rest of the rows. Leaving you with just one row person/institution combo.
Upvotes: 1