GettingStarted
GettingStarted

Reputation: 7605

I don't understand why my group by is failing

SELECT ENAME, MAX(SAL), STORES.CITY 
FROM EMPLOYEES 
INNER JOIN STORES ON EMPLOYEES.STORE_ID = STORES.STORE_ID
GROUP BY EMPLOYEES.STORE_ID, STORES.CITY

DDL for STORES:

CREATE TABLE  "STORES" 
(   
    "STORE_ID" NUMBER NOT NULL ENABLE, 
    "CITY" VARCHAR2(50), 
     PRIMARY KEY ("STORE_ID")
     USING INDEX  ENABLE
)
/

DDL for EMPLOYEES:

CREATE TABLE  "EMPLOYEES" 
(   
    "EMPNO" NUMBER(4,0), 
    "ENAME" VARCHAR2(10), 
    "JOB" VARCHAR2(9), 
    "HIREDATE" DATE, 
    "SAL" NUMBER(7,2), 
    "COMM" NUMBER(7,2), 
    "STORE_ID" NUMBER
)
/

CREATE INDEX  "EMP_NAME_IDEX" ON  "EMPLOYEES" ("ENAME")
/

CREATE INDEX  "EMP_NAME_JOB_DATE_IDX" ON  "EMPLOYEES" ("ENAME", "JOB", "HIREDATE")
/

I am trying to get a new view where I can display the record for the highest paid employee in each store.

Desired output:

Like this:

ENAME   JOB      STORE_ID   MAX(SAL)    CITY
------------------------------------------------------
ALLEN   SALESMAN    2        1600       New York City
KING    PRESIDENT   3        5000       Chicago
SCOTT   ANALYST     4        3000       Philadelphia

Current output:

ENAME   JOB     STORE_ID    MAX(SAL)    CITY
------------------------------------------------------------
ALLEN   SALESMAN    2       1600    New York City
TURNER  SALESMAN    2       1500    New York City
WARD    SALESMAN    2       1250    New York City
MARTIN  SALESMAN    2       1250    New York City
KING    PRESIDENT   3       5000    Chicago
BLAKE   MANAGER     3       2850    Chicago
CLARK   MANAGER     3       2450    Chicago
SCOTT   ANALYST     4       3000    Philadelphia
FORD    ANALYST     4       3000    Philadelphia
JONES   MANAGER     4       2975    Philadelphia
MILLER  CLERK       4       1300    Philadelphia
ADAMS   CLERK       4       1100    Philadelphia
JAMES   CLERK       4        950    Philadelphia
SMITH   CLERK       4        800    Philadelphia

Upvotes: 2

Views: 77

Answers (5)

Hilarion
Hilarion

Reputation: 870

As others have already explained, "You can't have your cake and eat it".
You either group by something, and can return it as is, or have something aggregated. In your case you want to group by store, i.e. have a separate result row per store (that matches the inner join criteria), but not by employee. In that case the employee data has to be aggregated, which includes name.
A classic aggregation takes all entries within a group, so you would use all employee names in the aggregation, which is not what you want, as you want just the employee with the top salary.

Fortunately Oracle DB does have something for such cases (although possibly not all), and that is the KEEP modifier of aggregate functions, with its DENSE_RANK and LAST / FIRST.
This construct allows the aggregation to be performed on a subset of rows from the group, where what you KEEP in the aggregation is FIRST or LAST in a designated ranking (based on given ordering criteria). In your case, you'd need to order employees within the group (employees of a store) based on salary criterion and that way have the aggregation limited to the best paid employees. Assuming, that salaries are unique values (which they are usually not), this will give you a sub-group of one employee, so you can apply any aggregation you want (often MAX or MIN), and still get that one name. If we'll assume, that salary values are not unique, then there may be more than one employee with the top salary - there's a tie on the first place. In that case you either introduce a tie-breaker into the ranking order (e.g. the secondary criteria, after salary, can be the job name) or resolve the issue at the aggregation level (e.g. by choosing the MAX or MIN employee name).

All in all we get something like this:

SELECT MAX(e.ename)
         KEEP (DENSE_RANK FIRST ORDER BY e.sal DESC NULLS LAST, e.job) AS name,
       MIN(e.job)
         KEEP (DENSE_RANK FIRST ORDER BY e.sal DESC NULLS LAST, e.job) AS job,
       e.store_id,
       MAX(e.sal) AS salary,
       s.city
  FROM employees e
 INNER JOIN stores s ON e.store_id = s.store_id
 GROUP BY e.store_id, s.sity

Upvotes: 1

walid
walid

Reputation: 21

So while there are many answers that contain corrections to your code, none of them answer the question of why your GROUP BY is failing.

The answer is as follows: The GROUP BY clause can only be used on columns that are explicitly mentioned in the query after the SELECT clause. So you SELECT the columns that you want returned to you and then you group them by one or more of those selected columns. If GROUP BY contains columns that are not selected in SELECT then it doesn't have those columns to group the data by. A common misconception is to believe that the query is going to treat the whole database as data that is present to all the clauses of a SQL statement. The reality is that GROUP BY only operates on the result of the SELECT.

The following works because all columns used in GROUP BY are queried by SELECT:

SELECT COL1, COL2, COL3 FROM MYTABLE GROUP BY COL2, COL3

but the following fails because COL4 in GROUP BY is not queried by SELECT:

SELECT COL1, COL2, COL3 FROM MYTABLE GROUP BY COL2, COL4

Upvotes: 0

Zaynul Abadin Tuhin
Zaynul Abadin Tuhin

Reputation: 31993

In group by number of selection list must need to put in group by clause

SELECT ENAME, MAX(SAL), STORES.CITY FROM EMPLOYEES 
INNER JOIN STORES 
ON EMPLOYEES.STORE_ID = STORES.STORE_ID
GROUP BY EMPLOYEES.STORE_ID,ENAME,STORES.CITY

As you select ENAME but not included that on group by as a result it throws error

As you change output so below will work for your change output

  with cte as
       ( select e.ENAME,SAL,CITY
       , row_number() over(partition by e.STORE_ID order by e.sal desc) rn
       FROM EMPLOYEES e
        INNER JOIN STORES  s  ON e.STORE_ID = s.STORE_ID
      ) select ENAME,SAL,CITY from cte where rn=1

Upvotes: 1

Barbaros Özhan
Barbaros Özhan

Reputation: 65278

You can convert it to :

SELECT ENAME, MAX(SAL), STORES.CITY FROM EMPLOYEES 
INNER JOIN STORES 
ON EMPLOYEES.STORE_ID = STORES.STORE_ID
GROUP BY ENAME, STORES.CITY

or

SELECT EMPLOYEES.STORE_ID, MAX(SAL), STORES.CITY FROM EMPLOYEES 
INNER JOIN STORES 
ON EMPLOYEES.STORE_ID = STORES.STORE_ID
GROUP BY EMPLOYEES.STORE_ID, STORES.CITY

since in your case SQL Select statement has non-grouped and non-aggregated columns in the select list, that's a violation of rules. And for your target might prefer the following one :

SELECT ENAME, MAX(SAL), STORES.STORE_ID, STORES.CITY FROM EMPLOYEES 
INNER JOIN STORES 
ON EMPLOYEES.STORE_ID = STORES.STORE_ID
GROUP BY ENAME, STORES.STORE_ID, STORES.CITY

Upvotes: 2

flyingfox
flyingfox

Reputation: 13506

You need to remove ENAME or add ENAME to GROUP BY

SELECT MAX(SAL), STORES.CITY FROM EMPLOYEES 
INNER JOIN STORES 
ON EMPLOYEES.STORE_ID = STORES.STORE_ID
GROUP BY EMPLOYEES.STORE_ID, STORES.CITY

or

SELECT ENAME, MAX(SAL), STORES.CITY FROM EMPLOYEES 
INNER JOIN STORES 
ON EMPLOYEES.STORE_ID = STORES.STORE_ID
GROUP BY EMPLOYEES.STORE_ID,ENAME, STORES.CITY

Upvotes: 1

Related Questions