PostgreSQL 9.3: Invalid count within crosstab query

Question

I have the following data that I want to show as a pivot table.

The table Employee contains the employee details:

          Employee_Number              Employee_Role                Group_Name
          --------------------------------------------------------------------
           EMP101                      C# Developer                  Group_1               
           EMP102                      ASP Developer                 Group_1               
           EMP103                      SQL Developer                 Group_2               
           EMP104                      PLSQL Developer               Group_2               
           EMP101                      Java Developer                                           
           EMP102                      Web Developer                                            
           EMP101                      DBA                                             
           EMP105                      DBA                                             
           EMP106                      SQL Developer                 Group_3               
           EMP107                      Oracle Developer              Group_3               
           EMP101                      Oracle Developer              Group_3               
           EMP108                      JSP                           Group_4               
           EMP108                      JS                            Group_5               
           EMP101                      C# Developer                  Group_1               
           EMP101                      C# Developer                  Group_1               
           EMP101                      C# Developer                  Group_1               
           EMP101                      C# Developer                  Group_1

I want to show the output as a pivot table, like this:

Employee_Number     TotalRoles      TotalGroups       Available     Others     Group_1     Group_2      Group_3      Group_4      Group_5
------------------------------------------------------------------------------------------------------------------------------------------
   EMP101               8                5                2           6           5           0            1             0           0
   EMP102               2                5                1           1           1           0            0             0           0
   EMP103               1                5                1           0           0           1            0             0           0   
   EMP104               1                5                1           0           0           1            0             0           0
   .......
   .......

For this I am using the following script:

SELECT * FROM crosstab(
   $$SELECT grp.*, e.group_name
          , CASE WHEN e.employee_number IS NULL THEN 0 ELSE 1 END AS val
     FROM  (
        SELECT employee_number
             , count(employee_role)::int            AS total_roles
             , (SELECT count(DISTINCT group_name)::int
                FROM   employee
                WHERE  group_name <> '')            AS total_groups
             , count(group_name <> '' OR NULL)::int AS available
             , count(group_name =  '' OR NULL)::int AS others
        FROM   employee
        GROUP  BY employee_number
        ) grp
     LEFT   JOIN employee e ON e.employee_number = grp.employee_number
                           AND e.group_name <> ''
     ORDER  BY grp.employee_number, e.group_name$$
 , $$VALUES ('Group_1'),('Group_2'),('Group_3'),('Group_4'),('Group_5')$$
   ) AS ct (employee_number text
      , total_roles  int
      , total_groups int
      , available    int
      , others       int
      , Group_1    int
      , Group_2    int
      , Group_3    int
      , Group_4    int
      , Group_5    int);

But I am getting a wrong value for Available in the output. For EMP101 it has to be 2, because he is in Group_1 and Group_3.

mlinth · Accepted Answer

I see two problems:

1) count(group_name <> '' OR NULL) counts every row whose group_name isn't blank, so duplicates are counted more than once. EMP101 has five Group_1 rows and one Group_3 row, so it gets 6 instead of 2.

2) You are not grouping the group counts. The crosstab function won't aggregate for you, so you'll get incorrect group counts when an employee belongs to a group more than once.

The following query calculates available differently and groups the counts. You'll get nulls for missing values, but you got them before :-)

SELECT * FROM crosstab(
   $$SELECT grp.*, e.group_name, e.val
     FROM  (
        SELECT employee_number
             , count(employee_role)::int            AS total_roles
             , (SELECT count(DISTINCT group_name)::int
                FROM   employee
                WHERE  group_name <> '')            AS total_groups
               -- distinct non-blank groups: all distinct names, minus 1 if a blank name exists
             , (count(DISTINCT group_name)
                - count(DISTINCT group_name = '' OR NULL))::int AS available
             , count(group_name = '' OR NULL)::int  AS others
        FROM   employee
        GROUP  BY employee_number
        ) grp
     LEFT   JOIN (
        -- one row per employee and group, so crosstab receives pre-aggregated counts
        SELECT employee_number, group_name, count(*) AS val
        FROM   employee
        GROUP  BY employee_number, group_name
        ) e ON e.employee_number = grp.employee_number
     ORDER  BY grp.employee_number, e.group_name$$
 , $$VALUES ('Group_1'),('Group_2'),('Group_3'),('Group_4'),('Group_5')$$
   ) AS ct (employee_number text
      , total_roles  int
      , total_groups int
      , available    int
      , others       int
      , Group_1    int
      , Group_2    int
      , Group_3    int
      , Group_4    int
      , Group_5    int);
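
To see the double counting in isolation, here is a minimal sketch that runs without the employee table. The emp101 CTE is a hypothetical inline copy of the eight EMP101 rows from the question, and it assumes the blank group names are stored as empty strings ('') rather than NULL. It also uses count(DISTINCT NULLIF(group_name, '')), which is just an alternative way of writing the distinct count, not the expression used in the query above.

```sql
-- Hypothetical inline sample: the eight EMP101 rows from the question.
-- Assumption: blank group names are stored as '' (empty string).
WITH emp101 (employee_number, group_name) AS (
   VALUES ('EMP101', 'Group_1')
        , ('EMP101', '')
        , ('EMP101', '')
        , ('EMP101', 'Group_3')
        , ('EMP101', 'Group_1')
        , ('EMP101', 'Group_1')
        , ('EMP101', 'Group_1')
        , ('EMP101', 'Group_1')
)
SELECT employee_number
       -- counts every non-blank row, so the five Group_1 rows all count
     , count(group_name <> '' OR NULL)        AS available_per_row
       -- NULLIF turns '' into NULL, which count(DISTINCT ...) ignores
     , count(DISTINCT NULLIF(group_name, '')) AS available_distinct
FROM   emp101
GROUP  BY employee_number;
```

With this sample, available_per_row comes out as 6 and available_distinct as 2, which is the value the question expects for EMP101. If you would rather see 0 than NULL in the group columns, one option is to list the columns in the outer SELECT and wrap each one, e.g. COALESCE(Group_1, 0) AS Group_1.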
