Reputation: 19
I'm doing Intro to SQL and I'm currently doing this exercise, but I'm stuck.
Exercise:
Create a list of department names, the manager id, manager name( employee last name) of that department, and the average salary in each department.
Below is the schema of the exercise, just the tables that I'm using.
CREATE TABLE "EMPLOYEES"
("EMPLOYEE_ID" NUMBER(6,0),
"FIRST_NAME" VARCHAR2(20),
"LAST_NAME" VARCHAR2(25) CONSTRAINT "EMP_LAST_NAME_NN" NOT NULL ENABLE,
"EMAIL" VARCHAR2(25) CONSTRAINT "EMP_EMAIL_NN" NOT NULL ENABLE,
"PHONE_NUMBER" VARCHAR2(20),
"HIRE_DATE" DATE CONSTRAINT "EMP_HIRE_DATE_NN" NOT NULL ENABLE,
"JOB_ID" VARCHAR2(10) CONSTRAINT "EMP_JOB_NN" NOT NULL ENABLE,
"SALARY" NUMBER(8,2),
"COMMISSION_PCT" NUMBER(2,2),
"MANAGER_ID" NUMBER(6,0),
"DEPARTMENT_ID" NUMBER(4,0),
"BONUS" VARCHAR2(5),
CONSTRAINT "EMP_SALARY_MIN" CHECK (salary > 0) ENABLE,
CONSTRAINT "EMP_ID_PK" PRIMARY KEY ("EMPLOYEE_ID")
USING INDEX ENABLE,
CONSTRAINT "EMP_EMAIL_UK" UNIQUE ("EMAIL")
USING INDEX ENABLE
);
CREATE TABLE "DEPARTMENTS"
("DEPARTMENT_ID" NUMBER(4,0),
"DEPARTMENT_NAME" VARCHAR2(30) CONSTRAINT "DEPT_NAME_NN" NOT NULL ENABLE,
"MANAGER_ID" NUMBER(6,0),
"LOCATION_ID" NUMBER(4,0),
CONSTRAINT "DEPT_ID_PK" PRIMARY KEY ("DEPARTMENT_ID")
USING INDEX ENABLE
);
This is what I have now,
select d.department_name, d.manager_id,e.last_name as "manager_name" ,
avg(salary)
from departments d join employees e ON (e.employee_id=d.manager_id)
group by d.department_name,e.last_name, d.manager_id;
I was able to generate all the fields it asked for except for the average, I did get the answer by doing a separate query
select department_name, round(avg(salary)) from employees e join
departments d ON (e.department_id=d.department_id)
group by department_name;
I can't figure out for the life of me, how to make it so that the select statement that gives me the average salary works, I attempted HAVING clause which doesn't work because it limit what its already given. Hopefully someone can explain to me what I'm missing. Appreciate the responses!
Upvotes: 1
Views: 8559
Reputation: 21
WITH DINFO AS (SELECT departments.department_name,departments.department_id,departments.manager_id,ROUND(AVG(salary)) AS AVGSAL
FROM departments
INNER JOIN employees ON (employees.department_id = departments.department_id)
GROUP BY departments.department_name,departments.department_id,departments.manager_id
)
SELECT DINFO.department_name,DINFO.manager_id,last_name, DINFO.AVGSAL FROM employees e, DINFO
WHERE e.employee_id = DINFO.manager_id;
Upvotes: 0
Reputation: 74605
Because employees serves to cite both managers and non-managers, and departments have a manager, you have to join the employees table to departments TWICE in the same SQL. Youre looking for both these joins in the same sql:
This join will associate manager details with the department:
departments inner join employees mgrs on departments.manager_id = mgrs.employee_id
This join will associate employees with a department
departments inner join employees emps on emps.department_id = departments.department_id
So, your ultimate sql will have the following somewhere in it:
FROM
departments
INNER JOIN employees mgrs ON ....
INNER JOIN employees emps ON ....
WHERE
Confused? Consider another example relevant to you. You have an address where you live and an address where you work. If it were a database it would probably have tables persons
and addresses
. It wouldn't have persons
, home_addresses
and work_addresses
because a home/work address is just an address. Persons table might have a home_address_id
and a work_address_id
for you though and a different ID in each column, pointing to 2 different address rows in the addresses. This means there's no way to join addresses in just once and get both your work and your home address. Logically, it doesn't work: if there's only one set of address columns, which data is it showing - your work or your home? You can't show two values on top of each other in one cell, no.. You have to add more cells. We add more cells onto the right of an existing block of cells by... Doing another join! If you want to have columns in your select list of:
person.Name,
homeaddress.Street,
workaddress.Street
Then address has to be joined in twice.
Person table
------------
Name Home_Address_Id Work_Address_Id
JohnSmith 1 2
Address table
-------------
ID Building Street
1 261 Great Hope Street
2 Office2000 Commercial Way
Hopefully it's also clear to see that your addresses for work and home are different (you work in the city, and live in the country, ok? no home-working allowed ;) ), so you can't say:
person p INNER JOIN address a ON p.home_address_id = a.id AND p.work_address_id = a.id
Because when the database joins the two tables together, before it processes your ON conditions, it joins all rows to all rows (conceptually) then it filters
Persons-Joined-To-Addresses
---------------------------
Name Home_Address_Id Work_Address_Id ID Building Street
JohnSmith 1 2 1 261 Great Hope Street
JohnSmith 1 2 2 Office2000 Commercial Way
You can see that there isn't ANY row that satisfied both conditions, because the database scans the rows it has generated, testing for any that satisfy all the ON conditions:
Does 1=1 and 2=1? No. Do not include the 'JohnSmith 1 2 1 261 Great Hope Street' row
Does 1=2 AND 2=2? No. Do not include the 'JohnSmith 1 2 2 Office2000 Commercial Way' row
If you join the addresses in TWICE though, well that's different:
Persons-Joined-To-Addresses-Joined-To-Addresses-Again
-----------------------------------------------------
Name Home_Address_Id Work_Address_Id IH BuildingH StreetH IW BuildingW StreetW
JohnSmith 1 2 1 261 Great Hope Street 1 261 Great Hope Street
JohnSmith 1 2 1 261 Great Hope Street 2 Office2000 Commercial Way
JohnSmith 1 2 2 Office2000 Commercial Way 1 261 Great Hope Street
JohnSmith 1 2 2 Office2000 Commercial Way 2 Office2000 Commercial Way
Of these 4 rows, there is only one that satisfies the condition for the joins. Here's the list of test the database does:
Does 1=1 and 2=1? No. Don't include the 'JohnSmith/Great Hope Street/Great Hope Street' row
Does 1=1 and 2=2? YES. Include the 'JohnSmith/Great Hope Street/Commercial Way' row
Does 1=2 AND 2=1? No. Do not include the 'JohnSmith/Commercial Way/Great Hope Street' row
Does 1=2 AND 2=2? No. Do not include the 'JohnSmith/Commercial Way/Commercial Way' row
The two conditions are the same as before but I've renamed the ID columns as IH and IW. The conditions are home_address_id = IH and work_address_id = IW
. It's very important that you grasp the notion in the 'Does a=b AND x=y?' rows above that you appreciate that we're treating address IDs differently by attaching a meaning of "home" or "work" to them. Were saying:
Does person.home_address_id=id_from_addresses_aliased_as_home AND person.work_address_id=id_from_addresses_aliased_as_work?
After you get this into one sql, summing the salaries of employees will work. Make sur you alias the employees table differently for each join. I called mine mgrs and emps so I know whether the data is "manager level" or "employee level"
I haven't written the whole query for you because this is clearly a homework exercise, but if you need more help, post comments :)
Upvotes: 2
Reputation: 108
Is this what you are after?
select d.department_name, d.manager_id,e.last_name as "manager_name", avg(salary)
from departments d join employees e ON (e.employee_id=d.manager_id)
group by d.department_name,d.manager_id, e.last_name ;
Upvotes: 0