Raj Suvariya
Raj Suvariya

Reputation: 359

Not able to understand this SQL query

This is a solution of one of the SQL problem on Hacerrank but I am not able to understand how this query works as I am new to this field. Can anyone explain this or give me link for good tutorials where I can learn this type of complex queries.

SET @d=0,@a=0,@p=0,@s=0;
SELECT MIN(Doctor),MIN(Professor),MIN(SINGER),MIN(Actor)
FROM
(SELECT IF(OCCUPATION='Actor',NAME,NULL) AS Actor, 
        IF(OCCUPATION='Doctor',NAME,NULL) AS Doctor, 
        IF(OCCUPATION='Professor',NAME,NULL) AS Professor, 
        IF(OCCUPATION='Singer',NAME,NULL) AS SINGER,
    case OCCUPATION when 'ACTOR' THEN @a:=@a+1 
                    when 'Doctor' THEN @d:=@d+1 
                    when 'Professor' THEN @p:=@p+1 
                    when 'Singer' THEN @s:=@s+1 
    end
as idn FROM OCCUPATIONS ORDER BY NAME ) AS TMP GROUP BY TMP.idn ;

Upvotes: 0

Views: 599

Answers (1)

Kickstart
Kickstart

Reputation: 21513

Splitting this up into sections

SET @d=0,@a=0,@p=0,@s=0;

This sets up 4 user variables, initialising them each to 0.

SELECT IF(OCCUPATION='Actor',NAME,NULL) AS Actor, 
        IF(OCCUPATION='Doctor',NAME,NULL) AS Doctor, 
        IF(OCCUPATION='Professor',NAME,NULL) AS Professor, 
        IF(OCCUPATION='Singer',NAME,NULL) AS SINGER,
    case OCCUPATION when 'ACTOR' THEN @a:=@a+1 
                    when 'Doctor' THEN @d:=@d+1 
                    when 'Professor' THEN @p:=@p+1 
                    when 'Singer' THEN @s:=@s+1 
    end
as idn 
FROM OCCUPATIONS 
ORDER BY NAME

This is returning 5 columns. The first 4 columns return a name for that row if that row refers to a particular occupation. So if the occupation for that row is 'Doctor' then the 2nd column will contain the contents of the name field, while columns 1, 3 and 4 will be NULL.

The 5th column calculates the count of that occupation. This may or may not be done in the order of the names (where MySQL chooses to assign the values to the user variables is not defined) - it may well work now but maybe not in the future. But essentially if will contain a sequential count of the number of that occupation, with the first of that occupation containing 1 and the next 2, etc.

SELECT MIN(Doctor),MIN(Professor),MIN(SINGER),MIN(Actor)
FROM
(.....) AS TMP GROUP BY TMP.idn ;

This is taking the results of the earlier SELECT (used as a sub query) and GROUPing the result by the count. So for each count value (which will occur up to 4 times, once for each occupation) it will return the MIN (so first alphabetically) name for each occupation. The MIN aggregate function ignores NULL values normally, so will return the first used name.

Working that through with some dummy data. Using the following as an example:-

id  Name    Occupation
1   freda   Actor
2   fredb   Doctor
3   fredc   Professor
4   fredd   Actor
5   frede   Doctor
6   fredf   Actor
7   fredg   Professor
9   fredh   Singer
10  fredi   Actor
11  fredj   Doctor
12  fredk   Professor
13  fredl   Actor
14  fredm   Doctor
15  fredn   Professor
16  fredo   Professor
17  fredp   Singer
19  fredq   Doctor
20  fredr   Actor

The sub query gives:-

Actor   Doctor  Professor   Singer  idn 
freda   NULL    NULL        NULL    1
NULL    fredb   NULL        NULL    1
NULL    NULL    fredc       NULL    1
fredd   NULL    NULL        NULL    2
NULL    frede   NULL        NULL    2
fredf   NULL    NULL        NULL    3
NULL    NULL    fredg       NULL    2
NULL    NULL    NULL        fredh   1
fredi   NULL    NULL        NULL    4
NULL    fredj   NULL        NULL    3
NULL    NULL    fredk       NULL    3
fredl   NULL    NULL        NULL    5
NULL    fredm   NULL        NULL    4
NULL    NULL    fredn       NULL    4
NULL    NULL    fredo       NULL    5
NULL    NULL    NULL        fredp   2
NULL    fredq   NULL        NULL    5
fredr   NULL    NULL        NULL    6

The outer query acts on this, giving one row per used count (ie, the number of rows will be the same as the max number of a occupation, whichever occupation has the most), and the name of the person for that occupation for that count:-

Doctor  Professor   Singer  Actor
fredb   fredc       fredh   freda
frede   fredg       fredp   fredd
fredj   fredk       NULL    fredf
fredm   fredn       NULL    fredi
fredq   fredo       NULL    fredl
NULL    NULL        NULL    fredr

Upvotes: 4

Related Questions