Honza
Honza

Reputation: 1008

Oracle - group by of joined tables

I tried to look for an answer and I found more advices, but not anyone of them was helpful, so I'm trying to ask now.

I have two tables, one with distributors (columns: distributorid, name) and the second one with delivered products (columns: distributorid, productid, corruptcount, date) - the column corruptcount contains the number of corrupted deliveries. I need to select the first five distributors with the most corrupted deliveries in last two months. I need to select distributorid, name and sum of corruptcount, here is my query:

SELECT del.distributorid, d.name, SUM(del.corruptcount) AS corrupt FROM distributor d, delivery del WHERE d.distributorid = del.distributorid AND d.distributorid IN (SELECT distributorid FROM (SELECT distributorid, SUM(corruptcount) AS corrupt FROM delivery WHERE storeid = 1 AND "date" BETWEEN ADD_MONTHS(SYSDATE, -2) AND SYSDATE AND ROWNUM <= 5 GROUP BY distributorid ORDER BY corrupt DESC)) GROUP BY del.distributorid

But Oracle returns error message: "not a GROUP BY expression".

And when I edit my query to this:

SELECT del.distributorid, d.name, del.corruptcount-- , SUM(del.corruptcount) AS corrupt FROM distributor d, delivery del WHERE d.distributorid = del.distributorid AND d.distributorid IN (SELECT distributorid FROM (SELECT distributorid, SUM(corruptcount) AS corrupt FROM delivery WHERE storeid = 1 AND "date" BETWEEN ADD_MONTHS(SYSDATE, -2) AND SYSDATE AND ROWNUM <= 5 GROUP BY distributorid ORDER BY corrupt DESC)) --GROUP BY del.distributorid

It's working as you expect and returns correct data:
1 IBM 10 2 DELL 0 2 DELL 1 2 DELL 6 3 HP 3 8 ACER 2 9 ASUS 1

I'd like to group this data. Where and why is my query wrong? Can you help please? Thank you very, very much.

Upvotes: 0

Views: 89

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269583

I think the problem is just the d.name in the select list; you need to include it in the group by clause as well. Try this:

SELECT del.distributorid, d.name, SUM(del.corruptcount) AS corrupt
FROM distributor d join
     delivery del
     on d.distributorid = del.distributorid
WHERE d.distributorid IN
             (SELECT distributorid
              FROM delivery
              WHERE storeid = 1 AND
                    "date" BETWEEN ADD_MONTHS(SYSDATE, -2) AND SYSDATE AND
                    ROWNUM <= 5
              GROUP BY distributorid
              ORDER BY SUM(corruptcount) DESC
             ) 
GROUP BY del.distributorid, d.name;

I also switched the query to using explicit join syntax with an on clause, instead of the outdated implicit join syntax using a condition in the where.

I also removed the additional layer of subquery. It is not really necessary.

EDIT:

"Why does d.name have to be included in the group by?" The easy answer is that SQL requires it because it does not know which value to include from the group. You could instead use min(d.name) in the select, for instance, and there would be no need to change the group by clause.

The real answer is a wee bit more complicated. The ANSI standard does actually permit the query as you wrote it. This is because id is (presumably) declared as a primary key on the table. When you group by a primary key (or unique key), then you can use other columns from the same table just as you did. Although ANSI supports this, most databases do not yet. So, the real reason is that Oracle doesn't support the ANSI standard functionality that would allow your query to work.

Upvotes: 1

Related Questions