Reputation: 3
Hi I'm having trouble getting the right count for this problem. I'm trying to get a count of duplicate email that has a different first name and/or different last name. (i.e [email protected] sam [email protected] ben I need a count of that duplicate email) I'm working with 2 tables. The email_address is in the mrtcustomer.customer_email table and the first and last name is in the mrtcustomer.customer_master table
My code
SELECT COUNT(*)
FROM
(SELECT e.customer_master_id, email_address, customer_first_name, customer_last_name,
ROW_NUMBER() OVER (PARTITION BY EMAIL_ADDRESS ORDER BY CUSTOMER_FIRST_NAME) RN
FROM mrtcustomer.customer_email e
JOIN mrtcustomer.customer_master t ON e.customer_master_id = t.customer_master_id
WHERE t.customer_first_name IS NOT NULL
AND t.customer_last_name IS NOT NULL
AND customer_FIRST_NAME != 'Unknown'
AND customer_LAST_NAME != 'Unknown'
GROUP BY e.customer_master_id, email_address, customer_first_name, customer_last_name
ORDER BY 1 DESC)
WHERE RN > 1
I'm guessing its my WHERE clause that is wrong.
Upvotes: 0
Views: 108
Reputation: 682
identify distinct fname,lname,email records... then group by email (having more than one record)... then do a count on that.
-- count
SELECT COUNT(DISTINCT email_address)
FROM
(
-- group by email , find where there is more than one distinct record for each email
SELECT email_address
FROM
(
-- get distinct Fname, Lname, Email combinations in derived table
SELECT customer_first_name , customer_last_name, email_address
FROM mrtcustomer.customer_email
JOIN mrtcustomer.customer_master t ON e.customer_master_id = t.customer_master_id
WHERE t.customer_first_name IS NOT NULL
AND t.customer_last_name IS NOT NULL
AND customer_FIRST_NAME != 'Unknown'
AND customer_LAST_NAME != 'Unknown'
GROUP BY 1,2,3
) foo
GROUP BY 1
HAVING COUNT(*)>1
) bar
Upvotes: 0
Reputation: 16677
i would start with something like this: (edited to reflect edits)
select email_address
, count( distinct customer_first_name ) f
, count( distinct customer_last_name ) l
from customer_email e, customar_master m
where e.customer_master_id = m.customer_master_id
group by email_address
then if either of the name columns is > 1 you have a problem - so wrap that similar to this:
select email_address from
(
select email_address
, count( distinct customer_first_name ) f
, count( distinct customer_last_name ) l
from customer_email e, customar_master m
where e.customer_master_id = m.customer_master_id
group by email_address
)
where fn > 1 or ln > 1
Upvotes: 1