Reputation: 342
I want to identify all rows whose content in a clob column is not unique.
The query I use is:
select
id,
clobtext
from
table t
where
(select count(*) from table innerT where dbms_lob.compare(innerT.clobtext, t.clobtext) = 0)>1
However this query is very slow. Any suggestions to speed it up? I already tried to use the dbms_lob.getlength function to eliminate more elements in the subquery but I didn't really improve the performance (feels the same).
To make it more clear an example:
table
ID | clobtext
1 | a
2 | b
3 | c
4 | d
5 | a
6 | d
After running the query. I'd like to get (order doesn't matter):
1 | a
4 | d
5 | a
6 | d
Upvotes: 1
Views: 576
Reputation: 11934
In the past I've generated checksums (in my C# code) for each clob.
Whilst this will inccur a one off increase in io (to generate the checksum) subsequent scans will be quicker, and you can index the value too
TK has a good PL\SQL example here: Ask Tom
Upvotes: 1