Reputation: 79

How to remove duplicate entries in SQL using selected columns only?

My data is as shown below:

id    | name   | date        | country   | vendor
1717  | CUST A | 8-Aug-1978  | INDIA     | VENDOR 1
1972  | CUST B | 1-Jan-1965  | INDIA     | VENDOR 2
2083  | CUST C | 1-Jan-1936  | AUSTRALIA | VENDOR 1
2189  | CUST D | 27-May-2000 | USA       | VENDOR 4
2189  | CUST D | 27-May-2000 | USA       | VENDOR 5
2189  | CUST D | 27-May-2000 | USA       | VENDOR 6

Question: I want to remove the duplicate rows based on Columns id, name, date, gender and country only (hence excluding Vendor)

In the above example, the 5th and 6th entries are duplicate except for their vendors.

Using Select Query how can I get rid of the 5th and 6th entry and keep on 4th entry?

By Keeping the 4th Entry, I mean the first Entry that comes up by select in the sequence of rows.

Upvotes: 1

Answers (4)

Ran Ben Shimol

Reputation: 41

SELECT count(vendor) as count, id, name, date, gender, country 
FROM TABLENAME 
GROUP BY id, name, date, gender, country
WHERE Count > 1

sql csa

Upvotes: 1

Stefano Zanini

Reputation: 5916

If you are not interested in preserving the vendor information, you can use the distinct keyword

select distinct id, name, date, gender, country
from   yourTable

This way the rows that are different for the undesired column only, will result as identical and the distinct will have the query return only one of them

Edit

If you want to preserve only the rows that are not duplicate, you can first select the combinations of id, name, date, gender and country that are available only once

select  id, name, date, gender, country, count(*)
from    yourTable
group by id, name, date, gender, country
having count(*) = 1

Then you use this table to filter the original one, by joining them together

select  t1.*
from    yourTable t1
join    (
            select  id, name, date, gender, country, count(*)
            from    yourTable
            group by id, name, date, gender, country
            having count(*) = 1
        ) t2
on      t1.id = t2.id and
        t1.name = t2.name and
        t1.date = t2.date and
        t1.gender = t2.gender and
        t1.country = t2.country

Upvotes: 0

Kannan Kandasamy

Reputation: 13959

You can use Row_Number()

select * from (
select *, RowN= Row_Number() over(partition by id, name, date, gender, country order by id, name, date, gender, country) 
    from YourTable ) a where a.RowN = 1

Upvotes: 0

Gordon Linoff

Reputation: 1269933

One method is group by:

select id, name, date, gender, country, min(vendor) as vendor
from t
group by id, name, date, gender, country;

This returns an "arbitrary" value of vendor. Tables in SQL represent unordered sets. There is no concept of 4th or 5th or 6th row. So, if you want one of the particular vendor values, you need to specify how that value is determined.

Upvotes: 2

How to remove duplicate entries in SQL using selected columns only?

Answers (4)

Related Questions