disasterkid
disasterkid

Reputation: 7278

Deleting/Ignoring Duplicate Records In A Table And Keeping Only the Earliest Ones

I need an SQL query to remove the duplicates in the following picture. Suppose the name of the table is "Opens". enter image description here

As you can see there are so many records that are almost identical in Id,SendQueueId,SubscriberId and Email. The only thing that is different is their DateTime. I need to only select one from each Id so that my Ids will be unique and only keep the earliest one.

Upvotes: 1

Views: 130

Answers (2)

liao
liao

Reputation: 41

another suolution with standard sql:

delete from opens a
 where not exists
 (Select *
          From (select Id,
                       SendQueueId,
                       SubscriberId,
                       Email,
                       WP_CampaignId,
                       min(date_time) date_time
                  From opens
                 group by Id, SendQueueId, SubscriberId, Email, WP_CampaignId) b
         Where a.id = b.id
           and a.SendQueueId = b.SendQueueId
           and a.SubscriberId = b.SubscriberId
           and a.Email = b.Email
           and a.WP_CampaignId = b.WP_CampaignId);

Upvotes: 0

Derek
Derek

Reputation: 23238

Use a Common Table Expression to identify duplicates using the ROW_NUMBER function and delete all occurrences outside of whichever you designate as the "first" one.

;with cte as (
  select *, 
    row_number() over (
      partition by Id, SendQueueId, SubscriberId, Email, WP_CampaignId
      order by DateTime
    ) as RN
  from
    Opens
)

delete
  cte
where
  RN > 1

Upvotes: 2

Related Questions