nightmare637
nightmare637

Reputation: 635

How can I delete duplicate rows in the same table that have identical CLOB data?

I have a table in Oracle of which one of the columns (named CONTENTSTRING) is a CLOB. However, some of the rows in this table have identical data in this column. What I'd like to do is remove all the rows except for one that have this identical data. How can I accomplish this?

Googling around, I see a ton of examples for comparing two columns. I also see examples comparing between two different tables. What I don't see is an example using one table and just comparing the rows! I do think I might need to use this function: dbms_lob.compare. However, I'm still not sure how I can set this function up.

From a programmer's perspective, I would think maybe I should do something like:

SELECT CONTENTSTRING FROM TABLE_ALPHA A

and then somehow do another select from the same table as TABLE_ALPHA B, and then use dmbs_lob.compare to compare the two columns. If the row numbers are different AND the column contents are equal, then the row from TABLE_ALPHA B can be deleted.

I think that's the right approach, but how exactly would I write this out in Oracle using SQL? I would appreciate any help or resources on this. Thanks!

Upvotes: 0

Views: 765

Answers (2)

GMB
GMB

Reputation: 222582

This answer assumes that you have a primary key field in the source table (I called it id).

You can use a subquery to list the ids of the duplicated records : this works by self-joining the table with dbms_lob.compare and a comparison clause on the id. If duplicate rows exist with the same CLOB content, all ids but the most ancient (ie the smallest) are selected. The outer query just deletes the selected ids. The NVL will consider NULL contents as duplicates (if that's not relevant for your use case, just remove them).

DELETE FROM TABLE_ALPHA 
WHERE id IN (
    SELECT b.id
    FROM TABLE_ALPHA a
    INNER JOIN TABLE_ALPHA b 
        ON  
        (
            (a.contentString IS NULL AND b.contentString IS NULL) 
            OR dbms_lob.compare(a.CONTENTSTRING, b.CONTENTSTRING) = 0
        ) 
        AND b.id > a.id
);

See this db fiddle.

Upvotes: 1

Vladimir.V.Bvn
Vladimir.V.Bvn

Reputation: 1118

DELETE
FROM TABLE_ALPHA A
WHERE EXISTS (
  SELECT 1 FROM TABLE_ALPHA B
  WHERE DBMS_LOB.COMPARE(A.CONTENTSTRING, B.CONTENTSTRING) = 0
  AND A.ROWID > B.ROWID
)

This deletes all dublicates except first one.

Upvotes: 2

Related Questions