Reputation: 99
I got a situation where i have to show the first 9 rows with domain name of email and no of occurences of the email and the 10th row as the sum of the remaining domains and show it as others.
What i have done .
I have succesfully get the different domains and their occurrences using the below query
SELECT (SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),'.',1)) as domain,
COUNT(*) as C
FROM newsletter_recipient
where LENGTH(email) > 0
GROUP BY (SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),'.',1))
ORDER BY COUNT(*) DESC
Current Result
domain C
------------------------
gmail 12
dddd 2
mmmmm 2
dsf 2
aaaa 1
bbbb 1
ccc 1
yopmail 1
yahoo 1
dde 1
rfg 1
eedd 1
dfdg 1
sad 1
dfdf 1
sfd 1
web 1
Expected Result
domain C
------------------------
gmail 12
dddd 2
mmmmm 2
dsf 2
aaaa 1
bbbb 1
ccc 1
yopmail 1
yahoo 1
others 8
Hope Someone could help
Upvotes: 1
Views: 61
Reputation: 5720
I would solve this with a union first 10 + others:
SELECT * from (
SELECT
(SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),
'.',
1)) AS domain,
COUNT(*) AS C
FROM
newsletter_recipient
WHERE
LENGTH(email) > 0
GROUP BY (SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),
'.',
1))
ORDER BY COUNT(*) DESC
Limit 10) as FIRST_10
UNION
SELECT 'others', sum(ALL_OTHERS.C) from (
SELECT
COUNT(*) AS C
FROM
newsletter_recipient
WHERE
LENGTH(email) > 0
GROUP BY (SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),
'.',
1))
ORDER BY COUNT(*) DESC
Limit 11,18446744073709551615
) as ALL_OTHERS
18446744073709551615 : Big enough to skip first 10 and ensure to retrieve all of the others , see : MySQL skip first 10 results
Upvotes: 0
Reputation: 1300
you could try the following
WITH cte AS (
SELECT (SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),'.',1)) as domain,
COUNT(*) as C
FROM newsletter_recipient
WHERE LENGTH(email) > 0
GROUP BY (SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),'.',1))
ORDER BY COUNT(*) DESC
)
SELECT
IF(rn < 10, domain, 'others') AS domain_name,
SUM(c) AS count
FROM (
SELECT domain, c, ROW_NUMBER() OVER () AS rn FROM cte
) T
GROUP BY domain_name
Upvotes: 0
Reputation: 1269483
Use two levels of aggregation:
SELECT (CASE WHEN seqnum < 10 THEN domain ELSE 'Others' END) as domain,
SUM(c)
FROM (SELECT SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1), '.', 1) as domain,
COUNT(*) as C,
ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) as seqnum
FROM newsletter_recipient
WHERE LENGTH(email) > 0
GROUP BY domain
) d
GROUP BY (CASE WHEN seqnum < 10 THEN domain ELSE 'Others' END)
ORDER BY SUM(c) DESC;
Upvotes: 1
Reputation: 37473
Use a case when expression
-
SELECT case when (SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),'.',1))
not in ('gmail','dddd','mmmmm','dsf','aaaa','bbbb','ccc','yopmail','yahoo') then 'Other'
else (SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),'.',1)) as domain,
COUNT(*) as C
FROM newsletter_recipient
where LENGTH(email) > 0
GROUP BY case when (SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),'.',1))
not in ('gmail','dddd','mmmmm','dsf','aaaa','bbbb','ccc','yopmail','yahoo') then 'Other'
else (SUBSTRING_INDEX(SUBSTR(email, INSTR(email, '@') + 1),'.',1))
ORDER BY COUNT(*) DESC
Upvotes: 0