Reputation: 199
Context
I am trying to solve the HackerRank problem at the link below:
https://www.hackerrank.com/challenges/challenges/problem
I am focusing on excluding the unwanted data from the output. The problem states:
If more than one student created the same number of challenges and the count is less than the maximum number of challenges created, then exclude those students from the result.
I have split this up into two conditions:
What I've tried
I am focusing on the first condition. This is the query I've come up with:
SELECT h.hacker_id, h.name, COUNT(c.challenge_id) as cnt
FROM Hackers as h
LEFT JOIN Challenges as c ON h.hacker_id = c.hacker_id
GROUP BY h.hacker_id, h.name, c.challenge_id
HAVING /* Either the max, or things that are not duplicates*/
COUNT(c.challenge_id) = MAX((SELECT COUNT(c1.challenge_id) FROM Challenges as c1))
ORDER BY cnt DESC, hacker_id ASC
This results in no output to stdout.
The issue is with the HAVING statement. I tried checking the conditional by querying just the MAX(...) statement:
SELECT MAX((SELECT COUNT(c1.challenge_id) FROM Challenges as c1))
FROM Challenges
I get an output.
Then I tried using the whole conditional in an IF() statement:
SELECT IF(
COUNT(c.challenge_id) = MAX((SELECT COUNT(c1.challenge_id) FROM Challenges as c1)),
CONCAT(COUNT(c.challenge_id), ', it works',
'No good'
)
FROM Challenges as c
This returns the same max value as the test of my MAX() clause.
Can anyone explain to me why the full query, with the conditional in the HAVING clause, returns no output?
Upvotes: 0
Views: 108
Reputation: 7114
I think you almost got it. A few changes and addition to your current query can return the correct result.
Your original query:
SELECT h.hacker_id, h.name, COUNT(c.challenge_id) as cnt
FROM Hackers as h
LEFT JOIN Challenges as c ON h.hacker_id = c.hacker_id
GROUP BY h.hacker_id, h.name, c.challenge_id /*remove c.challenge_id from the group by */
HAVING /*remove HAVING from the query*/
COUNT(c.challenge_id) = MAX((SELECT COUNT(c1.challenge_id) FROM Challenges as c1))
/*take the MAX function and modify the subquery by adding group by, order by and limit then move it to SELECT*/
ORDER BY cnt DESC, hacker_id ASC /*remove this for now*/
After modification of the above, you'll get something like this:
#1st sub-query
SELECT h.hacker_id, h.name, COUNT(*) AS cnt,
MAX((SELECT COUNT(*) AS cnt FROM Challenges GROUP BY hacker_id ORDER BY cnt DESC LIMIT 1)) AS mxc
FROM Hackers AS h
LEFT JOIN Challenges AS c ON h.hacker_id = c.hacker_id
GROUP BY h.hacker_id, h.name
Make the query above as a sub-query then construct another query to be used as validation on any hacker_id
that have the same challenges_created
count. This is the sub-query:
#2nd sub-query
SELECT hacker_id, COUNT(*) cnt FROM Challenges GROUP BY hacker_id
Then use LEFT JOIN
on the 1st & 2nd query like this
SELECT A.hacker_id, A.name, A.cnt AS 'challenges_created' FROM
(SELECT h.hacker_id, h.name, COUNT(*) AS cnt,
MAX((SELECT COUNT(c1.challenge_id) AS cnt FROM Challenges AS c1 GROUP BY hacker_id ORDER BY cnt DESC LIMIT 1)) AS mxc
FROM Hackers AS h
LEFT JOIN Challenges AS c ON h.hacker_id = c.hacker_id
GROUP BY h.hacker_id, h.name) A
LEFT JOIN (SELECT hacker_id, COUNT(*) cnt FROM Challenges GROUP BY hacker_id) V
ON A.cnt=V.cnt AND A.hacker_id <> V.hacker_id AND V.cnt < A.mxc
WHERE V.cnt IS NULL
ORDER BY A.cnt DESC, A.hacker_id ASC
Lets observe the ON
condition on this part of the LEFT JOIN
. So ON A.cnt=V.cnt AND A.hacker_id <> V.hacker_id AND V.cnt < A.mxc
literally mean "the count is the same but hacker id is not while the second result count is less than the max count from the first result". Add WHERE V.cnt IS NULL
to return only the one doesn't match with the ON
condition and finalize it with the ORDER BY
.
I obtained the data used in the the challenge and created a fiddle from it here : https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=be99385a91d809b3daeaeb1ca3f54e0d
Btw,
"The main difference between WHERE and HAVING clause comes when used together with GROUP BY clause, In that case WHERE is used to filter rows before grouping and HAVING is used to exclude records after grouping.
Read more: https://www.java67.com/2019/06/difference-between-where-and-having-in-sql.html#ixzz6b01iR3NB"
Upvotes: 0
Reputation: 1624
The challenge was actually challenging :-). It took more time than I expected. Here is my solution below -
SELECT Tx.hacker_id,
Tx.NAME,
Tx.challenges
FROM (SELECT h.hacker_id AS hacker_id,
h.NAME AS NAME,
Count(DISTINCT c.challenge_id) AS challenges
FROM hackers h
LEFT JOIN challenges c
ON h.hacker_id = c.hacker_id
GROUP BY h.hacker_id,
h.NAME
HAVING Count(DISTINCT c.challenge_id) = (SELECT Max(challenges)
FROM
(SELECT
h.hacker_id,
Count(DISTINCT c.challenge_id) AS challenges
FROM hackers h
LEFT JOIN challenges c
ON h.hacker_id =
c.hacker_id
GROUP BY h.hacker_id) T
)
UNION
SELECT T1.hacker_id AS hacker_id,
T1.NAME AS NAME,
T1.challenges AS challenges
FROM (SELECT h.hacker_id,
h.NAME,
Count(DISTINCT c.challenge_id) AS challenges
FROM hackers h
LEFT JOIN challenges c
ON h.hacker_id = c.hacker_id
GROUP BY h.hacker_id,
h.NAME) T1
JOIN (SELECT T.challenges
FROM (SELECT h.hacker_id,
h.NAME,
Count(DISTINCT c.challenge_id) AS challenges
FROM hackers h
LEFT JOIN challenges c
ON h.hacker_id = c.hacker_id
GROUP BY h.hacker_id,
h.NAME) T
GROUP BY T.challenges
HAVING Count(DISTINCT T.hacker_id) = 1) T2
ON T1.challenges = T2.challenges) Tx
ORDER BY 3 DESC,
1;
Although the code looks long but the logic is quite simple. The upper part of the union computes those hacker_id and related info who submitted the maximum amount of chanllenges and the 2nd part of the union tackles to select only those hackers who have submitted non-max but unique number of challenges.
Upvotes: 1