Tito
Tito

Reputation: 671

Aggregating a set of results for each set of data

I have a list of nations. Each short abbreviation (short) corresponds to 1 country, however people like to input other variables besides the nation name which ends you with the list below

Output

short        nation     Students
   A           A           604
   A        Austria       6707
   A       Österreich     3400
   AFG        Afg           18
   AFG    Afghanistan     1991
   AGL       Angola         16
   AGN      Guinea           2
   AL         Al             5
   AL       Albanien        61
   ARM        Arm            6
   ARM      Armenien        87

So as you can see calculating the total number of students for each country can come out as 2 or 3 results. So obviously the question came to my mind if it is possible to calculate the sum of each shorttext grouped such as the following

Expected output

short        nation     Students
   A           A           604
   A        Austria       6707
   A       Österreich     3400
   A                     10711 
   AFG        Afg           18
   AFG    Afghanistan     1991
   AFG                    2009
   AGL       Angola         16
   AGN      Guinea           2
   AL         Al             5
   AL       Albanien        61
   AL                       66
   ARM        Arm            6
   ARM      Armenien        87
   ARM                      93

My code is as follows

Part 1

with natctf as (
SELECT  short, 
        nation, 
        cast(Studentcount as varchar(6)) as Studentcount
        FROM (
                SELECT  ctf.shorttext as short, ctf.longtext as nation,
                                count(distinct s.studentid) as Studentcount
                from students s
                    join pupil p on p.id = s.pupilid
                    join pupilnation pn on pn.pupilid = p.id
                    join country ctf on ctf.id = pn.coutnryid

                Group by ctf.shorttext,ctf.longtext
                Order by ctf.shorttext
) t )

Part 2

SELECT short, initcap(nation), Studentcount
FROM natctf

UNION ALL
SELECT null as short, 
       cast(count(nation) as varchar(3)) ||' Nations', 
       cast(SUM(cast(Studentcount as bigint)) as varchar(10)) ||' Students'
FROM natctf

Upvotes: 1

Views: 73

Answers (3)

S-Man
S-Man

Reputation: 23676

Disclaimer: This is the solution for PostgreSQL versions 9.0-9.4. For Postgres 9.5 or higher I would go with the GROUPING SETS solution of @LaurenzAlbe


demo: db<>fiddle

WITH count_nations AS (                                               -- A
    SELECT 
        *, 
        sum(students) OVER (PARTITION BY short) as total              -- B
    FROM nations
)
SELECT short, name, students FROM count_nations                       -- C            
UNION                                                                 -- E
SELECT short, NULL, total FROM count_nations                          -- D

ORDER BY 
    short, 
    name NULLS LAST,                                                  -- F
    students

A: WITH clause makes the query more readable as you do not need to write the same sub query twice.

B: The window function (https://www.postgresql.org/docs/current/static/tutorial-window.html) SUM sums up all values in a given frame (here the short column). So you are getting the total of the countries as separate column.

Result of the subquery:

short     name          students   total
A         A             604        10711
A         Austria       6707       10711
A         Österreich    3400       10711
AFG       Afg           18         2009
AFG       Afghanistan   1991       2009
AGL       Angola        16         16
AGN       Guinea        2          2
AL        Al            5          66
AL        Albanien      61         66
ARM       Arm           6          93
ARM       Armenien      87         93

C: Selecting the original columns...

D: Selecting the new column without a name...

E: UNION both results. UNION makes the result distinct so you are getting only one row per nation. (UNION ALL wouldn't do the distinct)

F: ordering the result. For the nations rows, the NULL values should be the last.

Result:

short   name          students
A       A             604
A       Austria       6707
A       Österreich    3400
A                     10711
AFG     Afg           18
AFG     Afghanistan   1991
AFG                   2009
AGL     Angola        16
AGL                   16
AGN     Guinea        2
AGN                   2
AL      Al            5
AL      Albanien      61
AL                    66
ARM     Arm           6
ARM     Armenien      87
ARM                   93

In your example you add the extra row only for those nations which have more than one row. For example for AGN you don't add the row. If this is your intention the db<>fiddle linked above shows you the solution:

  1. Adding a row counter per window frame into the WITH clause
  2. Filter the UNION subquery for all nations with row_count > 1

Upvotes: 2

Laurenz Albe
Laurenz Albe

Reputation: 246403

The best solution is to use grouping sets, an SQL standard feature that is just the right thing for your use case:

SELECT ctf.shorttext as short,
       ctf.longtext as nation,
       count(...)
FROM country AS ctf JOIN ...
GROUP BY GROUPING SETS ((ctf.shorttext, ctf.longtext), (ctf.shorttext))
ORDER BY ctf.shorttext, ctf.longtext

Upvotes: 2

sticky bit
sticky bit

Reputation: 37472

UNION ALL a query where you GROUP BY the short and the long name and another one where you only group by the short name.

SELECT x.short,
       x.nation,
       x.studentcount
       FROM (SELECT ctf.shorttext short,
                    ctf.longtext nation,
                    count(DISTINCT s.studentid) studentcount
                    FROM students s
                         INNER JOIN pupil p
                                    ON p.id = s.pupilid
                         INNER JOIN pupilnation pn
                                    ON pn.pupilid = p.id
                         INNER JOIN country ctf
                                    ON ctf.id = pn.coutnryid
                    GROUP BY ctf.shorttext,
                             ctf.longtext
             UNION ALL
             SELECT ctf.shorttext short,
                    NULL nation,
                    count(DISTINCT s.studentid) studentcount
                    FROM students s
                         INNER JOIN pupil p
                                    ON p.id = s.pupilid
                         INNER JOIN pupilnation pn
                                    ON pn.pupilid = p.id
                         INNER JOIN country ctf
                                    ON ctf.id = pn.coutnryid
                    GROUP BY ctf.shorttext) x
      ORDER BY x.short,
               x.nation NULLS LAST;

Note, that the count for the query grouped by the short name must not be the sum of the counts, the other query returns. That's because of the distinct count. If for a student there are several different long names with one short name, they get counted in each group of the long name but only once in the group for the short name.

Upvotes: 0

Related Questions