Reputation: 1153
Finding it challenging to solve this
Table 1 has usersids (around 2 Million)
Table 2 has all user ids with email addresses (around 150 Million)
Table 3 has email addresses (around 100 users who are subscribed to a certain email program)
Need to count users from table 1 who are subscribed vs. not subscribed to the email program.
I tried
select b.email_address
from
table_1 a
left outer join table_2 b
on a.user_id = b.user_id
intersect
select email from table_3
but this is an incorrect way. I need to get exact count of subscribers to email program. Any help appreciated.
Upvotes: 0
Views: 52
Reputation: 1269953
I guess the challenge here is that the second table (the big one) has multiple users per email address. Because there are so few email addresses, I would be inclined to approach this as:
with emailusers as (
select distinct userid
from table2 t2 join
table3 t3
on t2.email = t3.email
)
select sum(case when eu.userid is not null then 1 else 0 end) as subscribed,
sum(case when eu.userid is null then 1 else 0 end) as notsubscribed
from table1 t1 left join
emailusers eu
on t1.userid = eu.userid;
Upvotes: 1
Reputation: 2911
You want a join and a CASE WHEN that describes whether a record was found or not.
SELECT
SUM(CASE WHEN c.email is NULL THEN 1 ELSE 0 END) AS not_subscribed,
SUM(CASE WHEN c.email IS NOT NULL THEN 1 ELSE 0 END) AS subscribed
FROM table_1 a
LEFT JOIN table_2 AS b
ON a.user_id = b.user_id
LEFT JOIN table_3 AS c
ON b.email = c.email
Now this makes some assumptions about your data, most especially that there are not duplicate email addresses in table_3. You should be able to verify that not_subscribed + subscribed equals SELECT COUNT(DISTINCT usersids) FROM table_1. If it does not, you should do these joins one at a time and determine where records are being lost/added.
Upvotes: 0