Reputation: 257
Let's say I have a table like this:
|id|userID|email |website |
--------------------------------------
|1 |user1 |[email protected]|website.com|
|2 |user2 |[email protected]|website.com|
|3 |user3 |[email protected]|website.com|
|4 |user1 |[email protected]|foo.com |
|5 |user2 |[email protected]|foo.com |
And I want to get all of the rows where website='website.com' and have a corresponding row with a matching userID where website='foo.com'
So, in this instance it would return rows 1 and 2.
Any ideas?
Upvotes: 0
Views: 4163
Reputation: 1270873
Here is one way:
select t.*
from t
where t.website = 'website.com' and
exists (select 1 from t t2 where t2.userId = t.userId and t2.website = 'foo.com');
EDIT:
You can also express this as a join:
select distinct t.*
from t join
t2
on t2.userId = t.userId and
t.website = 'website.com' and
t2.website = 'foo.com';
If you know there are not duplicates, then you can remove the distinct
.
Upvotes: 2
Reputation: 5940
Hive is kinda limited in terms of using subqueries (they are only allowed in the FROM
clause), so we'll have to work around that a bit. The good news is that we don't really have to worry (much) about doing massive joins, because, well, Hadoop. :-)
One way to find matching rows in a table is to simply join the table on itself.
SELECT left.*
FROM your_table left
JOIN your_table right
ON (left.userID = right.userID)
WHERE left.website = 'website.com'
AND right.website = 'foo.com';
Note we have two versions of the same table, called left
and right
respectively, and we are retrieving rows from left
that have a matching row in right
that has the same userID (JOIN
condition) but website is foo.com (and
clause).
Hope that helps.
Upvotes: 1
Reputation: 204894
To get the user you can do
select userID
from your_table
where website in ('website.com', 'foo.com')
group by userID
having count(distinct website) = 2
but if you need the complete row then do
select * from your_table
where userID in
(
select userID
from your_table
where website in ('website.com', 'foo.com')
group by userID
having count(distinct website) = 2
)
Upvotes: 1