Reputation: 67
return one row for each Job_ID with the following columns:
• Job ID
• Average female service length in that job in years rounded to the tenth of a year
• Average Female Salary for that job
• Average male service length in that job in years rounded to the tenth of a year
• Average Male Salary for that job
• The difference in average male and female salaries for that job such that a positive number means the average female salary is higher and negative number means the average male salary is higher.
TABLE
HR.EMPLOYEES
Name Null Type
-------------- -------- ------------
EMPLOYEE_ID NOT NULL NUMBER(6)
FIRST_NAME VARCHAR2(20)
LAST_NAME NOT NULL VARCHAR2(25)
EMAIL NOT NULL VARCHAR2(25)
PHONE_NUMBER VARCHAR2(20)
HIRE_DATE NOT NULL DATE
JOB_ID NOT NULL VARCHAR2(10)
SALARY NUMBER(8,2)
COMMISSION_PCT NUMBER(2,2)
MANAGER_ID NUMBER(6)
DEPARTMENT_ID NUMBER(4)
GENDER CHAR(1)
What I Have So Far but this is returning an M & F row for each job_id i need M & F columns
SELECT gender, job_id, ROUND(AVG(salary),0) as avg_job_salary,
(SELECT ROUND(AVG(salary),0)
FROM hr.employees
WHERE gender = 'M') AS avg_m_salary, (SELECT ROUND(AVG(salary),0)
FROM hr.employees
WHERE gender = 'F') AS avg_f_salary,
ROUND(AVG(days_of_svc/365),1) AS avg_years_svc
FROM (SELECT job_id, salary, gender, (SYSDATE-hire_date) AS days_of_svc
FROM hr.employees)
GROUP BY job_id,gender
ORDER BY job_id, gender;
or version 2
SELECT gender, job_id, ROUND(AVG(salary),0) as avg_job_salary,
((SELECT ROUND(AVG(salary),0)
FROM hr.employees
WHERE gender = 'F') - (SELECT ROUND(AVG(salary),0)
FROM hr.employees
WHERE gender = 'M')) as diff,
ROUND(AVG(days_of_svc/365),1) AS avg_years_svc
FROM (SELECT job_id, salary, gender, (SYSDATE-hire_date) AS days_of_svc
FROM hr.employees)
GROUP BY job_id,gender
ORDER BY job_id, gender;
sample expected results rows
JOB_ID F_AVG_LENGTH F_AVG_SAL M_AVG_LENGTH M_AVG_SAL DIFFERENCE
------ ------------ --------- ------------ --------- ----------
MAILCLERK 24.1 48000 23.4 47000 1000
CASHIER 4.6 12000 4.4 13500 -1500
Upvotes: 5
Views: 2527
Reputation:
On my machine I made a copy of HR.EMPLOYEES
in my schema, I named the clone HR_EMPLOYEES
. Then I added a column for GENDER
since on my copy of Oracle, the HR.EMPLOYEES
table does not have a GENDER
column. I populated the column with my best guesses, just for testing.
In Oracle 11 you can use the PIVOT operation, which makes the job easier. I divided average tenure by 365.25 so it is expressed in years rather than days. Note that there are many jobs that have either no males or no females working in them, so there are many NULL results. I assumed you want them shown too - otherwise they can be left out.
select job_id, round(F_AVG_TENURE_D/365.25, 1) as f_avg_length,
round(F_AVG_SALARY) as f_avg_salary,
round(M_AVG_TENURE_D/365.25, 1) as m_avg_length,
round(M_AVG_SALARY) as m_avg_salary,
round(F_AVG_SALARY - M_AVG_SALARY) as avg_sal_diff
from (
select job_id, gender, sysdate - hire_date as tenure, salary
from hr_employees
)
pivot (avg(tenure) as avg_tenure_d, avg(salary) as avg_salary
for gender in ('F' as F, 'M' as M))
order by avg_sal_diff desc nulls last, job_id -- ORDER BY is optional
;
Output:
JOB_ID F_AVG_LENGTH F_AVG_SALARY M_AVG_LENGTH M_AVG_SALARY AVG_SAL_DIFF
---------- ------------ ------------ ------------ ------------ ------------
SH_CLERK 11.2 3511 9.9 2973 538
ST_MAN 12.3 7467 10.3 7000 467
ST_CLERK 10.5 2883 10.8 2743 140
PU_CLERK 11.6 2833 10 2700 133
AD_VP 11.1 17000 15.8 17000 0
SA_REP 10.3 8244 10.6 8471 -228
SA_MAN 10.3 12000 10.9 12333 -333
IT_PROG 10.2 4500 10.5 6600 -2100
AC_ACCOUNT 14.4 8300
AC_MGR 14.4 12008
AD_ASST 13.1 4400
AD_PRES 13.4 24000
FI_ACCOUNT 11.2 7920
FI_MGR 14.2 12008
HR_REP 14.4 6500
MK_MAN 12.7 13000
MK_REP 11.2 6000
PR_REP 14.4 10000
PU_MAN 13.9 11000
19 rows selected.
Upvotes: 1
Reputation: 915
Your expected result does not require gender as output column.
So you need to remove it from your select and group by statements :
SELECT job_id, ROUND(AVG(salary),0) as avg_job_salary,
(SELECT ROUND(AVG(salary),0)
FROM hr.employees
WHERE gender = 'M') AS avg_m_salary, (SELECT ROUND(AVG(salary),0)
FROM hr.employees
WHERE gender = 'F') AS avg_f_salary,
ROUND(AVG(days_of_svc/365),1) AS avg_years_svc
FROM (SELECT job_id, salary, gender, (SYSDATE-hire_date) AS days_of_svc
FROM hr.employees)
GROUP BY job_id
ORDER BY job_id;
There are too many dips into the same table in this approach though, in subselect. Let's try to optimize it.
EDIT :
select job_id,ROUND(AVG(salary),0) avg_job_salary,
round(avg(case when gender='M' then Salary end),0) avg_m_salary,
round(avg(case when gender='F' then Salary end),0) avg_f_salary,
round(avg(case when gender='F' then Salary end),0) - round(avg(case when gender='M' then Salary end),0) diff_in_avg
ROUND(AVG((SYSDATE-HIRE_DATE)/365),1) AS avg_years_svc
from hr.employees group by JOB_ID
order by JOB_ID;
I am assuming your formula for calculating avg_years_svc is as expected already, and no bifurcation is required for male and female candidates.
EDIT 2 :
PIVOT function may be able to help. This can help you to bifurcate averages based on male and female count.
Select job_id,avg_m_salary,avg_f_salary, avg_f_salary - avg_m_salary diff_salary from(
select job_id,salary from hr.employees) PIVOT
(avg(salary) for gender in('M' as avg_m_salary,'F' as avg_f_salary));
Upvotes: 1
Reputation: 396
SELECT Q.*, Q.M_AVG_SAL-Q.F_AVG_SAL DIFFERENCE FROM (
SELECT JOB_ID
, SUM(CASE WHEN gender = 'M' THEN SALARY ELSE 0 END)/SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) M_AVG_SAL
, ROUND(SUM(CASE WHEN gender = 'M' THEN SVC_LEN ELSE 0 END)/SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END), 1) M_AVG_LENGTH
, SUM(CASE WHEN gender = 'F' THEN SALARY ELSE 0 END)/SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) F_AVG_SAL
, ROUND(SUM(CASE WHEN gender = 'F' THEN SVC_LEN ELSE 0 END)/SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END), 1) F_AVG_LENGTH
FROM (
SELECT JOB_ID, SALARY
, MONTHS_BETWEEN(SYSDATE, p.HIRE_DATE)/12 svc_len, GENDER
FROM EMPLOYEES p
)
GROUP BY JOB_ID
) Q;
Awwww, I forgot about division by zero, so:
SELECT JOB_ID
, DECODE(F_CNT, 0, 0, ROUND(F_LEN/F_CNT, 1)) F_AVG_LENGTH
, DECODE(F_CNT, 0, 0, F_SAL/F_CNT) F_AVG_SAL
, DECODE(M_CNT, 0, 0, ROUND(M_LEN/M_CNT, 1)) M_AVG_LENGTH
, DECODE(M_CNT, 0, 0, M_SAL/M_CNT) M_AVG_SAL
, DECODE(M_CNT, 0, 0, M_SAL/M_CNT) - DECODE(F_CNT, 0, 0, F_SAL/F_CNT)
DIFFERENCE FROM (
SELECT JOB_ID
, SUM(CASE WHEN gender = 'M' THEN SALARY ELSE 0 END) M_SAL
, SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) M_CNT
, SUM(CASE WHEN gender = 'M' THEN SVC_LEN ELSE 0 END) M_LEN
, SUM(CASE WHEN gender = 'F' THEN SALARY ELSE 0 END) F_SAL
, SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) F_CNT
, SUM(CASE WHEN gender = 'F' THEN SVC_LEN ELSE 0 END) F_LEN
FROM (
SELECT JOB_ID, SALARY, GENDER
, MONTHS_BETWEEN(SYSDATE, p.HIRE_DATE)/12 svc_len
FROM EMPLOYEES p
)
GROUP BY JOB_ID
) Q;
Upvotes: 0