Reputation: 189
I have 4 tables:
employee
: employee_id
(and several unrelated columns)calls
: employee_id
, date
, number_of_calls
(on that given date by employee)employee_skills
: employee_id
, skills_id
skills
: skills_id
, primary_skill
, secondary_skill
As a site note:
In employee_skills
combinations of primary_skill
and secondary_skill
can occur several times with a different skill_id
.
There can be two entries such as 234 | Python | Java
and 235 | Python | Java
.
I would like to have a SELECTS
-statement that returns:
employee_id
, number_of_total_calls
(of that employee), number_of_unique_secondary_skills
(of that employee)
The problem I am facing is regarding the number_of_total_calls_of_employee
.
When I only join the tables employee
and calls
, I get the correct number (sum) of total calls of that employee.
However, when I also join the tables employee_skills and skills, I get a multiple of the correct number (e.g. 6 times the correct number for employee_id
123).
I think that this is the case since when doing so the employee_id
occurs several times in the new table, not just once (the employee_id
123
occurs 6 times in the new table), since the employee_id
is several times in the table employee_skills
(e.g. the employee_id
123
occurs 6 times in employee_skills
).
The SELECT
-statement I am using looks as follows:
SELECT e.employee_id, SUM(c.number_of_calls), COUNT(DISTINCT s.secondary_skill)
FROM employees e
INNER JOIN calls c
ON e.employee_id = c.employee_id
INNER JOIN employee_skills es
ON e.employee_id = es.employee_id
INNER JOIN skills s
ON es.skills_id = s.skills_id
GROUP BY e.employee_id;
Upvotes: 0
Views: 39
Reputation: 1270401
First, you don't need the employees table, because employee_id
is in both other tables. I would then recommend aggregating along both dimensions before joining:
SELECT es.employee_id, c.number_of_calls, es.num_secondary_skills
FROM (SELECT c.employee_id, SUM(c.number_of_calls) as number_of_calls
FROM calls c
GROUP BY c.employe_id
) c JOIN
(SELECT es.employee_id, COUNT(DISTINCT s.secondary_skill) as num_secondary_skills
FROM employee_skills es JOIN
skills s
ON es.skills_id = s.skills_id
GROUP BY es.employee_id
) es
ON es.employee_id = c.employee_id;
The above only returns employees that both have calls and skills. If some employees are missing one or the other, then they are filtered out.
If you want all employees, then the employees
table is handy -- with outer joins:
SELECT e.employee_id, c.number_of_calls, es.num_secondary_skills
FROM employees e LEFT JOIN
(SELECT c.employee_id, SUM(c.number_of_calls) as number_of_calls
FROM calls c
GROUP BY c.employe_id
) c
ON c.employee_id = e.employee_id LEFT JOIN
(SELECT es.employee_id, COUNT(DISTINCT s.secondary_skill) as num_secondary_skills
FROM employee_skills es JOIN
skills s
ON es.skills_id = s.skills_id
GROUP BY es.employee_id
) es
ON es.employee_id = e.employee_id;
Upvotes: 1
Reputation: 317
Just take the sum of calls separately as an inner query.
SELECT e.employee_id,
( SELECT SUM(c.number_of_calls) FROM calls c WHERE c.employee_id = e.employee_id ) as sum_of_calls,
COUNT(DISTINCT s.secondary_skill)
FROM employees e
INNER JOIN calls c
ON e.employee_id = c.employee_id
INNER JOIN employee_skills es
ON e.employee_id = es.employee_id
INNER JOIN skills s
ON es.skills_id = s.skills_id
GROUP BY e.employee_id;
Upvotes: 0