Andy
Andy

Reputation: 1293

How to remove one record so my unique key constraint won't break in the future

I have a table, Core_Faculty with 4 fields: ID (PK - INT), InstitutionID (INT), PersonID (INT), DeprecatedDate (SMALLDATETIME)

What I'd like to do is delete all the records for institution/person combinations that have both deprecated records and non-deprecated (DeprecatedDate IS NULL) record, but keep the non-deprecated record.

If an institution/person combination has only just one record (whether deprecated or not), I'd like to keep those and leave them alone. I'm only considering records that have both DeprecatedDate IS NULL and Deprecated IS NOT NULL for each unique institution/person combination.

End goal is to be left with one record per institution/person combination whether deprecated or not, but giving priority to the record that has a NULL deprecated date. These are the good, live records. However, if we are starting with only one record and it's deprecated, go ahead and keep it.

The database currently only can potentially have one of each as institution/person/deprecateddate is a unique key on the table.

How would I go about solving this, and what methods can I use to find the appropriate records, while only considering records that have both deprecated and non-deprecated values for the combination?

Upvotes: 0

Views: 51

Answers (2)

Bruce Dunwiddie
Bruce Dunwiddie

Reputation: 2908

DELETE f
FROM
    Core_Faculty f
        INNER JOIN
        (
            SELECT *,
                ROW_NUMBER() OVER (
                    PARTITION BY 
                        f.InstitutionID,
                        f.PersonID
                    ORDER BY
                        CASE
                            WHEN f.DeprecatedDate IS NULL THEN 1
                            ELSE 2
                        END,
                        f.DeprecatedDate
                ) RowNum
            FROM
                Core_Faculty f
        ) d ON
            f.ID = d.ID
WHERE
    d.RowNum > 1;

Upvotes: 1

Sarel Jacobs
Sarel Jacobs

Reputation: 11

In SQL Server you can use a common table expression with a ROW_NUMBER function to identify the rows you want to keep:

WITH cte AS (
            SELECT [ID]
                  ,[InstitutionID]
                  ,[PersonID]
                  ,[DeprecatedDate]
                  ,ROW_NUMBER() OVER (PARTITION BY [InstitutionID], [PersonID]
                                          ORDER BY [DeprecatedDate] DESC) as [RowNumber]
            FROM [Blog].[dbo].[Core_Faculty]
        )
SELECT [ID]
      ,[InstitutionID]
      ,[PersonID]
      ,[DeprecatedDate]
      ,[RowNumber]
FROM cte
--WHERE [RowNumber] = 1

The ORDER BY [DeprecatedDate] DESC part will make ensure the latest record is the 1st row in the [InstitutionID], [PersonID] grouping. If there is only one row, even if it is a null, it will be kept since it is the 1st row in the grouping.

You can then use

DELETE
FROM cte
WHERE [RowNumber] > 1

instead of the select to remove the rest of the rows. Leaving you with just one row person/institution combo.

Upvotes: 1

Related Questions