Reputation: 23
Lets say I have the following table for example:
id name
---------
1 Matt
2 Ryan
3 Joseph
4 Matt1
5 5Joseph
6 David
7 Matt_43
We see that Matt and Joseph has been repeated more than once i.e Matt
, Matt1
and Matt_43
. Similarly Joseph
is repeating twice.
Is there a way to retrieve such kind of data?
Upvotes: 0
Views: 82
Reputation: 16001
Depending on how you are defining 'similar', you could look at soundex()
or utl_match
:
with demo (id, name) as
( select 1, 'Matt' from dual union all
select 2, 'Ryan' from dual union all
select 3, 'Joseph' from dual union all
select 4, 'Matt1' from dual union all
select 5, '5Joseph' from dual union all
select 6, 'David' from dual union all
select 7, 'Matt_43' from dual )
select id, name
, soundex(name)
, count(*) over(partition by soundex(name)) as count_similar
, utl_match.jaro_winkler_similarity(name, 'Matt') as similarity_to_matt
from demo
order by similarity_to_matt desc;
ID NAME SOUNDEX(NAME) COUNT_SIMILAR SIMILARITY_TO_MATT
------- --------- ------------- ------------- ------------------
1 Matt M300 3 100
4 Matt1 M300 3 96
7 Matt_43 M300 3 91
2 Ryan R500 1 50
6 David D130 1 48
5 5Joseph J210 2 0
3 Joseph J210 2 0
Upvotes: 1
Reputation: 2465
Assuming that you need to retrieve data having similar text in column name
, you can join the table to itself, use like
in join condition and return distinct
records as below.
select distinct t1.id as id,
t1.name as name
from table1 t1
join table1 t2
on ((t1.name like '%'|| t2.name ||'%' or t2.name like '%'|| t1.name ||'%')
and t1.id <> t2.id);
Result:
+----+---------+
| ID | NAME |
+----+---------+
| 4 | Matt1 |
| 7 | Matt_43 |
| 1 | Matt |
| 3 | Joseph |
| 5 | 5Joseph |
+----+---------+
Update:
If you don't want the result to be case-sensitive, use upper
select distinct t1.id as id,
t1.name as name
from table1 t1
join table1 t2
on ((upper(t1.name) like '%'|| upper(t2.name) ||'%' or upper(t2.name) like '%'|| upper(t1.name) ||'%')
and t1.id <> t2.id)
Upvotes: 1
Reputation: 29667
You could self-join your table with a LIKE
For example:
select t1.id as id1, t1.name as name1, t2.id as id2, t2.name as name2
from your_table t1
join your_table t2
on (upper(t1.name) like '%'|| upper(t2.name) ||'%' and t1.id <> t2.id)
Upvotes: 2
Reputation: 1731
since oracle is very case sensitive even when 2 words are same where one starts with capital letter and second with small,
oracle will consider them two different data, to keep them in your search you would require this approach:
select name from emp where UPPER(name) like '%UPPER(Ryan)%';
Upvotes: 1
Reputation:
you can use like wildcard characters that have operator like '%matt%' this will return all the matt value even if the prefix or postfix would not match
here is the query
select name from emp where name like '%matt%' or name like '%joseph%'
Upvotes: 1