Faheem
Faheem

Reputation: 13

Remove duplicates with least row ids from Oracle

I have a database table which looks like

ID Book_no Book_name Book_category
ID1 1 B1 CB1
ID1 2 B1 CB1
ID1 3 B2 CB1
ID1 4 B2 CB1
ID1 5 B3 CB1
ID2 1 B1 CB2
ID2 2 B1 CB2
ID2 3 B2 CB2

And the expected result is like

ID Book_No Book_name Book_category
ID1 2 B1 CB1
ID1 4 B2 CB1
ID1 5 B3 CB1
ID2 2 B1 CB2
ID2 3 B2 CB2

I want to delete duplicate records from table on the basis of ID, Book_name and Book_category. Below query deletes the duplicate records, but the result is not expected one. As I want to delete all the duplicate records except the highest Book_no. Want to maintain the highest Book_no and delete all other duplicates.

Upvotes: 0

Views: 122

Answers (4)

Joyal Joseph
Joyal Joseph

Reputation: 84

You can use first_value for finding the required id and can remove others using delete.

SELECT DISTINCT a.*, FIRST_VALUE(a.book_name) OVER (ORDER BY a.book_name DESC RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS "HIGHEST" FROM ( select * from books group by id, Book_name, book_category ) a;

SELECT DISTINCT FIRST_VALUE(column_name) OVER (ORDER BY column_name DESC RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS "HIGHEST" FROM (select * from tables group by "column_names_to_be_grouped");

Upvotes: 0

MT0
MT0

Reputation: 167867

You can DELETE correlating on the ROWID pseudo-column:

DELETE FROM table_name
WHERE ROWID IN (
  SELECT rid
  FROM   (
    SELECT ROWID AS rid,
           ROW_NUMBER() OVER (
             PARTITION BY id, book_name, book_category
             ORDER BY book_no DESC
           ) AS rn
    FROM table_name
  )
  WHERE  rn > 1
);

Which, for the sample data:

CREATE TABLE table_name (id, book_no, book_name, book_category) AS
SELECT 'ID1', 1, 'B1', 'CB1' FROM DUAL UNION ALL
SELECT 'ID1', 2, 'B1', 'CB1' FROM DUAL UNION ALL
SELECT 'ID1', 3, 'B2', 'CB1' FROM DUAL UNION ALL
SELECT 'ID1', 4, 'B2', 'CB1' FROM DUAL UNION ALL
SELECT 'ID1', 5, 'B3', 'CB1' FROM DUAL UNION ALL
SELECT 'ID2', 1, 'B1', 'CB2' FROM DUAL UNION ALL
SELECT 'ID2', 2, 'B1', 'CB2' FROM DUAL UNION ALL
SELECT 'ID2', 3, 'B2', 'CB2' FROM DUAL;

Then the remaining rows are:

SELECT * FROM table_name;
ID BOOK_NO BOOK_NAME BOOK_CATEGORY
ID1 2 B1 CB1
ID1 4 B2 CB1
ID1 5 B3 CB1
ID2 2 B1 CB2
ID2 3 B2 CB2

sqlfiddle here

Upvotes: 1

user5683823
user5683823

Reputation:

Assuming your table looks like this:

create table books (id, book_no, book_name, book_category) as
  select 'ID1', 1, 'B1', 'CB1' from dual union all
  select 'ID1', 2, 'B1', 'CB1' from dual union all
  select 'ID1', 3, 'B2', 'CB1' from dual union all
  select 'ID1', 4, 'B2', 'CB1' from dual union all
  select 'ID1', 5, 'B3', 'CB1' from dual union all
  select 'ID2', 1, 'B1', 'CB2' from dual union all
  select 'ID2', 2, 'B1', 'CB2' from dual union all
  select 'ID2', 3, 'B2', 'CB2' from dual
;

You can use a delete statement in which you compare each row to the ones you want to keep. The ones you want to keep have max(book_no) when grouped by the other columns. So:

delete from books
where  (id, book_no, book_name, book_category) not in
         (
           select id, max(book_no), book_name, book_category
           from   books
           group  by id, book_name, book_category
         )
;

This assumes that the columns are non-null; if you may have null in the table, you will need to rewrite this a little more carefully, with a not exists condition instead of not in, but the idea is the same.

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269563

You can use lead() and filter where the name changes:

select t.*
from (select t.*,
             lead(book_name) over (partition by id, book_category order by book_no) as next_book_name
      from t
     ) t
where next_book_name is null or next_book_name <> book_name;

Upvotes: 0

Related Questions