Reputation: 25
I have a data set that looks like this example data set.
There are multiple users under one domain. I want only one row per email_domain and the row should correspond to the max(last_login) value. In short I only want the user from the email_domain who was the last to log in among all users from the same domain.
Ive tried a query which looks like this
select *
FROM
(
select LOWER(SUBSTRING(ua.email FROM POSITION ('@' IN ua.email) + 1)) AS email_domain, last_login, last_name, first_name, email, phone
from user_with_address ua
order by email_domain
) as A
group by email_domain, last_login, last_name, first_name, email, phone
having last_login = max(last_login)
order by email_domain
I still get a list with multiple values for each email domain, what am I doing wrong? Please help.
Disclaimer: I have basic->intermediate knowledge of SQL.
Upvotes: 2
Views: 178
Reputation:
Use distinct on ()
select distinct on (email_domain) *
FROM (
select lower(split_part(email, '@', 2)) AS email_domain,
last_login,
last_name,
first_name,
email,
phone
from user_with_address
) as A
order by email_domain, last_login desc;
I also incorporated Patrick's suggested to simplify the expression to extract the domain from the email.
Upvotes: 2
Reputation: 31
I like Tim Biegeleisen's answer, but this is a little more simple, SQL wise. Don't know about performance differences though.
select
LOWER(SUBSTRING(ua.email FROM POSITION ('@' IN ua.email) + 1)) AS email_domain,
last_login,
last_name,
first_name,
email,
phone
from user_with_address ua
where last_login = (select max(last_login)
from user_with_address ua2
where LOWER(SUBSTRING(ua.email FROM POSITION ('@' IN ua.email) =
LOWER(SUBSTRING(ua2.email FROM POSITION ('@' IN ua2.email))
order by email_domain;
Upvotes: 0
Reputation: 520898
One option is to use ROW_NUMBER()
and retain the most recent login record for each group of same email domain records.
SELECT t.email_domain, t.last_login, t.last_name, t.first_name, t.email, t.phone
FROM
(
SELECT a.*,
ROW_NUMBER() OVER (PARTITION BY a.email_domain ORDER BY a.last_login DESC) rn
FROM
(
SELECT LOWER(SUBSTRING(ua.email FROM POSITION ('@' IN ua.email) + 1)) AS email_domain,
last_login, last_name, first_name, email, phone
FROM user_with_address ua
) a
) t
WHERE t.rn = 1
ORDER BY t.email_domain
Note that I actually subquery twice here, to avoid having to repeat your code which computes the email domain. If not for that, we could have accomplished this with just a single subquery. We can use a single subquery here, but the query would be a bit harder to read.
Upvotes: 0