Reputation: 359
This is a solution of one of the SQL problem on Hacerrank but I am not able to understand how this query works as I am new to this field. Can anyone explain this or give me link for good tutorials where I can learn this type of complex queries.
SET @d=0,@a=0,@p=0,@s=0;
SELECT MIN(Doctor),MIN(Professor),MIN(SINGER),MIN(Actor)
FROM
(SELECT IF(OCCUPATION='Actor',NAME,NULL) AS Actor,
IF(OCCUPATION='Doctor',NAME,NULL) AS Doctor,
IF(OCCUPATION='Professor',NAME,NULL) AS Professor,
IF(OCCUPATION='Singer',NAME,NULL) AS SINGER,
case OCCUPATION when 'ACTOR' THEN @a:=@a+1
when 'Doctor' THEN @d:=@d+1
when 'Professor' THEN @p:=@p+1
when 'Singer' THEN @s:=@s+1
end
as idn FROM OCCUPATIONS ORDER BY NAME ) AS TMP GROUP BY TMP.idn ;
Upvotes: 0
Views: 599
Reputation: 21513
Splitting this up into sections
SET @d=0,@a=0,@p=0,@s=0;
This sets up 4 user variables, initialising them each to 0.
SELECT IF(OCCUPATION='Actor',NAME,NULL) AS Actor,
IF(OCCUPATION='Doctor',NAME,NULL) AS Doctor,
IF(OCCUPATION='Professor',NAME,NULL) AS Professor,
IF(OCCUPATION='Singer',NAME,NULL) AS SINGER,
case OCCUPATION when 'ACTOR' THEN @a:=@a+1
when 'Doctor' THEN @d:=@d+1
when 'Professor' THEN @p:=@p+1
when 'Singer' THEN @s:=@s+1
end
as idn
FROM OCCUPATIONS
ORDER BY NAME
This is returning 5 columns. The first 4 columns return a name for that row if that row refers to a particular occupation. So if the occupation for that row is 'Doctor' then the 2nd column will contain the contents of the name field, while columns 1, 3 and 4 will be NULL.
The 5th column calculates the count of that occupation. This may or may not be done in the order of the names (where MySQL chooses to assign the values to the user variables is not defined) - it may well work now but maybe not in the future. But essentially if will contain a sequential count of the number of that occupation, with the first of that occupation containing 1 and the next 2, etc.
SELECT MIN(Doctor),MIN(Professor),MIN(SINGER),MIN(Actor)
FROM
(.....) AS TMP GROUP BY TMP.idn ;
This is taking the results of the earlier SELECT (used as a sub query) and GROUPing the result by the count. So for each count value (which will occur up to 4 times, once for each occupation) it will return the MIN (so first alphabetically) name for each occupation. The MIN aggregate function ignores NULL values normally, so will return the first used name.
Working that through with some dummy data. Using the following as an example:-
id Name Occupation
1 freda Actor
2 fredb Doctor
3 fredc Professor
4 fredd Actor
5 frede Doctor
6 fredf Actor
7 fredg Professor
9 fredh Singer
10 fredi Actor
11 fredj Doctor
12 fredk Professor
13 fredl Actor
14 fredm Doctor
15 fredn Professor
16 fredo Professor
17 fredp Singer
19 fredq Doctor
20 fredr Actor
The sub query gives:-
Actor Doctor Professor Singer idn
freda NULL NULL NULL 1
NULL fredb NULL NULL 1
NULL NULL fredc NULL 1
fredd NULL NULL NULL 2
NULL frede NULL NULL 2
fredf NULL NULL NULL 3
NULL NULL fredg NULL 2
NULL NULL NULL fredh 1
fredi NULL NULL NULL 4
NULL fredj NULL NULL 3
NULL NULL fredk NULL 3
fredl NULL NULL NULL 5
NULL fredm NULL NULL 4
NULL NULL fredn NULL 4
NULL NULL fredo NULL 5
NULL NULL NULL fredp 2
NULL fredq NULL NULL 5
fredr NULL NULL NULL 6
The outer query acts on this, giving one row per used count (ie, the number of rows will be the same as the max number of a occupation, whichever occupation has the most), and the name of the person for that occupation for that count:-
Doctor Professor Singer Actor
fredb fredc fredh freda
frede fredg fredp fredd
fredj fredk NULL fredf
fredm fredn NULL fredi
fredq fredo NULL fredl
NULL NULL NULL fredr
Upvotes: 4