Space Ostrich
Space Ostrich

Reputation: 423

How does `GROUP BY` determine which aggregation method to use?

I'd been having trouble with the "Column 'examplecolumn' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause." error for a while and, much to my exasperation, the eventual fix was to add every offending column to a GROUP BY clause. After doing some research I still don't entirely get why this throws an error instead of giving you a very ugly table, but that's not the important part.

What is important is that according to my research, GROUP BY merges identical rows based on the column that you pass to it. Choosing an appropriate method to add the data together. Out of curiosity, I duplicated my existing query and made one small change. If my understanding of how GROUP BY works was correct it should have given me the same output:

The original:

SELECT [ProductName]
   ,[CompanyName]
   ,[CategoryID] AS 'CategoryName'
   ,SUM(Quantity) AS 'Total Ordered'
FROM [Order Details] AS t1
INNER JOIN Products AS t2
   ON t1.ProductID = t2.ProductID
INNER JOIN Suppliers AS t3
   ON t2.SupplierID = t3.SupplierID
GROUP BY ProductName, CategoryID, CompanyName

The altered version:

SELECT [ProductName]
   ,[CompanyName]
   ,[CategoryID] AS 'CategoryName'
   ,Quantity AS 'Total Ordered'
FROM [Order Details] AS t1
INNER JOIN Products AS t2
   ON t1.ProductID = t2.ProductID
INNER JOIN Suppliers AS t3
   ON t2.SupplierID = t3.SupplierID
GROUP BY ProductName, CategoryID, CompanyName, Quantity

I had believed that the GROUP BY clause would use SUM(Quantity) as it's aggregation method. It did not. In fact, other than remove the error message, I'm not sure what it did.

Here's a screenshot of the output:output of the two queries

It's obvious my understanding of GROUP BY is wrong. I had believed that the aggregation method was determined by the type of data found in the column that was listed. Since that is not the case, how DOES GROUP BY work?

Upvotes: 0

Views: 61

Answers (2)

Damien_The_Unbeliever
Damien_The_Unbeliever

Reputation: 239664

GROUP BY indicates the columns that will not be aggregated. As such, every unique combination of values in those columns will produce one row of output.

All of the remaining columns you want in your output will need to be in aggregates - but that's what the error message tells you - it wants you to tell the server which aggregate is appropriate.

Upvotes: 3

Akshey Bhat
Akshey Bhat

Reputation: 8545

The aggregation method is not decided by GROUP BY clause. It should be given in SQL. The second SQL query just gives you unique combinations which can be formed by values of columns GROUP BY clause.

Upvotes: 0

Related Questions