Reputation: 17
I have several tables that I have joined in my query : hr.employees,hr.departments, hr.locations, hr.countries: hr.employees:
name | salary | department_id (points to hr.departments) |
---|---|---|
... | ... | ... |
... | ... | ... |
hr.departments
department_id | location_id (points to hr.locations) |
---|---|
... | ... |
... | ... |
hr.locations
location_id | country_id (points to hr.countries) |
---|---|
... | ... |
... | ... |
hr.countries
country_id | country_name |
---|---|
... | ... |
... | ... |
For my query I have to display the salary of each employee with a higher or equal salary to that of the average salary of a country
The code looks as follows and gives me the correct result:
SELECT
e.first_name || ' ' || e.last_name AS name,
TRIM(CAST(con.country_name AS CHAR(25))) AS country,
e.salary AS salary,
(SELECT ROUND(AVG(e1.salary)) FROM hr.employees e1
JOIN hr.departments dep1 ON e1.department_id = dep1.department_id
JOIN hr.locations loc1 ON dep1.location_id = loc1.location_id
WHERE country_id = loc.country_id
)AS avg_sal_country
FROM hr.employees e
JOIN hr.departments dep ON e.department_id = dep.department_id
JOIN hr.locations loc ON dep.location_id = loc.location_id
JOIN HR.countries con ON loc.country_id = con.country_id
WHERE (salary >= (SELECT AVG(e1.salary) FROM hr.employees e1
JOIN hr.departments dep1 ON e1.department_id = dep1.department_id
JOIN hr.locations loc1 ON dep1.location_id = loc1.location_id
WHERE country_id = loc.country_id) )
;
According to the grading system my professor uses the cost of my statement is 27 while the cost of the query from the professor is 16, how can I rewrite this to be more efficent?
Edit: My result looks like this
name | country | salary | avg_sal_country |
---|---|---|---|
Bob | USA | 4000 | 3800 |
Suzy | UK | 3000 | 2000 |
Tom | USA | 5000 | 3800 |
Upvotes: 0
Views: 58
Reputation: 7836
Just for comparisson - your query generates explain plan starting like:
Plan hash value: 4172893912
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 729 | 25 (16)| 00:00:01 |
|* 1 | HASH JOIN OUTER | | 9 | 729 | 25 (16)| 00:00:01 |
... it is clear that you use two subqueries (in Select And in Where) and they use the same tables as the main query. If we slightly correct the sintax and get ridd just of 1st subquery so it would look like:
-- STEP 1 -- changed sintax and replaced subquery column with AVG() OVER() Analytic Function
Select e.FIRST_NAME || ' ' || e.LAST_NAME "NAME",
con.COUNTRY_NAME "COUNTRY",
e.SALARY "SALARY",
Ceil(Avg(e.SALARY) OVER(Partition By loc.COUNTRY_ID)) "AVG_SAL_COUNTRY" -- Analytic function instead of subquery for every row
From HR.EMPLOYEES e
Inner Join HR.DEPARTMENTS dep ON e.DEPARTMENT_ID = dep.DEPARTMENT_ID
Inner Join HR.LOCATIONS loc ON dep.LOCATION_ID = loc.LOCATION_ID
Inner Join HR.COUNTRIES con ON loc.COUNTRY_ID = con.COUNTRY_ID
Where ( SALARY >= ( Select Avg(e1.SALARY) -- jet another subquery for every row
From HR.EMPLOYEES e1
Inner Join HR.DEPARTMENTS dep1 ON (e1.DEPARTMENT_ID = dep1.DEPARTMENT_ID)
Inner Join HR.LOCATIONS loc1 ON dep1.LOCATION_ID = loc1.LOCATION_ID
Where loc1.COUNTRY_ID = loc.COUNTRY_ID
)
)
Plan hash value: 2085091834
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7 | 441 | 17 (18)| 00:00:01 |
| 1 | WINDOW SORT | | 7 | 441 | 17 (18)| 00:00:01 |
.. it looks better now - what would the result be like if we dump the other subquery from Where clause and transform it into joined subquery...
-- STEP 2 replaced where condition subquery and used it joined to the tables
Select e.FIRST_NAME || ' ' || e.LAST_NAME "NAME",
con.COUNTRY_NAME "COUNTRY",
e.SALARY "SALARY",
avgs.AVG_SAL_COUNTRY -- no need for analytic function any more
From HR.EMPLOYEES e
Inner Join HR.DEPARTMENTS dep ON e.DEPARTMENT_ID = dep.DEPARTMENT_ID
Inner Join HR.LOCATIONS loc ON dep.LOCATION_ID = loc.LOCATION_ID
Inner Join HR.COUNTRIES con ON loc.COUNTRY_ID = con.COUNTRY_ID
Inner Join ( Select loc1.COUNTRY_ID, CEIL(Avg(e1.SALARY)) "AVG_SAL_COUNTRY"
From HR.EMPLOYEES e1
Inner Join HR.DEPARTMENTS dep1 ON (dep1.DEPARTMENT_ID = e1.DEPARTMENT_ID)
Inner Join HR.LOCATIONS loc1 ON (loc1.LOCATION_ID = dep1.LOCATION_ID )
Group By loc1.COUNTRY_ID
) avgs ON(avgs.COUNTRY_ID = loc.COUNTRY_ID)
Where e.SALARY >= avgs.AVG_SAL_COUNTRY
Plan hash value: 21112100
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7 | 448 | 16 (13)| 00:00:01 |
| 1 | NESTED LOOPS | | 7 | 448 | 16 (13)| 00:00:01 |
... well, a bit better (16) but not much. So, let us adjust the query with analytic function ...
-- STEP 3 -- back to AVG() OVER() Analytic Function
SELECT NAME, COUNTRY, SALARY, AVG_SAL_COUNTRY
FROM (
Select e.FIRST_NAME || ' ' || e.LAST_NAME "NAME",
con.COUNTRY_NAME "COUNTRY",
e.SALARY "SALARY",
Ceil(Avg(e.SALARY) OVER(Partition By loc.COUNTRY_ID)) "AVG_SAL_COUNTRY" -- Analytic function instead of subquery for every row
From HR.EMPLOYEES e
Inner Join HR.DEPARTMENTS dep ON e.DEPARTMENT_ID = dep.DEPARTMENT_ID
Inner Join HR.LOCATIONS loc ON dep.LOCATION_ID = loc.LOCATION_ID
Inner Join HR.COUNTRIES con ON loc.COUNTRY_ID = con.COUNTRY_ID
)
WHERE SALARY >= AVG_SAL_COUNTRY
Plan hash value: 94384860
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 106 | 7738 | 8 (13)| 00:00:01 |
|* 1 | VIEW | | 106 | 7738 | 8 (13)| 00:00:01 |
And that would be it - looks better (8) than what we have started with (25)
Upvotes: 0
Reputation: 222612
I would recommend window functions (aka analytic functions in Oracle), which can efficiently replace the subquery in your code:
SELECT name, country, salary, ROUND(avg_sal_country) AS avg_sal_country
FROM (
SELECT
e.first_name || ' ' || e.last_name AS name,
TRIM(CAST(con.country_name AS CHAR(25))) AS country,
e.salary,
AVG(e.salary) OVER(PARTITION BY loc.country_id) AS avg_sal_country
FROM hr.employees e
JOIN hr.departments dep ON e.department_id = dep.department_id
JOIN hr.locations loc ON dep.location_id = loc.location_id
JOIN HR.countries con ON loc.country_id = con.country_id
) x
WHERE salary >= avg_sal_country
Upvotes: 2