JGibbers
JGibbers

Reputation: 257

Get all rows with a matching field in a different row in the same table

Let's say I have a table like this:

|id|userID|email         |website    |
--------------------------------------
|1 |user1 |[email protected]|website.com|
|2 |user2 |[email protected]|website.com|
|3 |user3 |[email protected]|website.com|
|4 |user1 |[email protected]|foo.com    |
|5 |user2 |[email protected]|foo.com    |

And I want to get all of the rows where website='website.com' and have a corresponding row with a matching userID where website='foo.com'

So, in this instance it would return rows 1 and 2.

Any ideas?

Upvotes: 0

Views: 4163

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1270873

Here is one way:

select t.*
from t
where t.website = 'website.com' and
      exists (select 1 from t t2 where t2.userId = t.userId and t2.website = 'foo.com');

EDIT:

You can also express this as a join:

select distinct t.*
from t join
     t2
     on t2.userId = t.userId and
        t.website = 'website.com' and
        t2.website = 'foo.com';

If you know there are not duplicates, then you can remove the distinct.

Upvotes: 2

Lukas Vermeer
Lukas Vermeer

Reputation: 5940

Hive is kinda limited in terms of using subqueries (they are only allowed in the FROM clause), so we'll have to work around that a bit. The good news is that we don't really have to worry (much) about doing massive joins, because, well, Hadoop. :-)

One way to find matching rows in a table is to simply join the table on itself.

SELECT left.*
FROM your_table left
JOIN your_table right
ON (left.userID = right.userID)
WHERE left.website = 'website.com'
AND right.website = 'foo.com';

Note we have two versions of the same table, called left and right respectively, and we are retrieving rows from left that have a matching row in right that has the same userID (JOIN condition) but website is foo.com (and clause).

Hope that helps.

Upvotes: 1

juergen d
juergen d

Reputation: 204894

To get the user you can do

select userID
from your_table
where website in ('website.com', 'foo.com')
group by userID
having count(distinct website) = 2

but if you need the complete row then do

select * from your_table
where userID in
(
   select userID
   from your_table
   where website in ('website.com', 'foo.com')
   group by userID
   having count(distinct website) = 2
)

Upvotes: 1

Related Questions