alott
alott

Reputation: 53

Finding the intersection between two columns

I'm trying to find the (set) intersection between two columns in the same table in MySQL. I basically want to find the rows that have either a col1 element that is in the table's col2, or a col2 element that is in the table's col1.

Initially I tried:

SELECT * FROM table WHERE col1 IN (SELECT col2 FROM table)

which was syntactically valid, however the run-time is far too high. The number of rows in the table is ~300,000 and the two columns in question are not indexed. I assume the run time is either n^2 or n^3 depending on whether MySQL executes the subquery again for each element of the table or if it stores the result of the subquery temporarily.

Next I thought of taking the union of the two columns and removing distinct elements, because if an element shows up more than once in this union then it must have been present in both columns (assuming both columns contain only distinct elements).

Is there a more elegant (i.e. faster) way to find the set intersection between two columns of the same table?

Upvotes: 5

Views: 11782

Answers (2)

lababidi
lababidi

Reputation: 2752

If you only want the values, try the INTERSECT command:

(SELECT col1 FROM table) INTERSECT (SELECT col2 FROM table)

Upvotes: -1

Joe Stefanelli
Joe Stefanelli

Reputation: 135848

SELECT t1.*
    FROM table t1
        INNER JOIN table t2
            ON t1.col1 = t2.col2

Creating indexes on col1 and col2 would go a long way to help this query as well.

Upvotes: 12

Related Questions