michelson
michelson

Reputation: 750

'where' in with duplicates or 'select distinct' - performance


I would like to return some results basing on subquery results and I was wondering what would be faster - having subquery results be distinct or where in clause having duplicates?
Example

select * 
from some_table 
where my_column in (select /* distinct? */ my_column from some_table_2)

If

-- subquery
select /*distinct?*/ my_column from some_table_2;

This would return something like:

  | MY_COLUMN
1 | 25
2 | 25
3 | 54
...

This example may not make sense but it presents my point.

Upvotes: 1

Views: 93

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270483

There is no benefit to putting distinct in the subquery for in. An in does what is called a "semi-join". This stops at the first matching row.

Oracle probably ignores the select distinct. Other databases might actually implement it.

If you care about performance, try both these versions:

select t.* 
from some_table t
where my_column in (select my_column from some_table_2)

and

select t.*
from some_table t
where exists (select 1 from some_table_2 t2 where t2.my_column = t.my_column);

This version can take advantage of an index on some_table_2(my_column).

Upvotes: 3

Related Questions