Reputation: 147
I have a problem with the following query:
SELECT
ee.id
ee.column2
ee.column3,
ee.column4,
SUM(ee.column5)
FROM
table1 ee
LEFT JOIN table2 epc ON ee.id = epc.id
WHERE
ee.id (6050)
GROUP BY ee.id
WHERE column id is the primary key. On version 8.4, the query returns an error saying that column2, column3 and column4 don't exist in the group by clause.
This same query executes successfully on version 9.3.
Does anybody know why?
Upvotes: 8
Views: 6612
Reputation: 4017
This does not hold for views however.
So this works fine:
CREATE TABLE table1(
id int NOT null PRIMARY KEY,
column2 int,
column3 int,
column4 int,
column5 int
);
SELECT id, column2, column3, column4, SUM(column5)
FROM table1
GROUP BY id;
If you instead replace querying against the table with a view that's simply a select of all of the columns in the table:
CREATE VIEW view1 AS
SELECT id, column2, column3, column4, column5
FROM table1;
SELECT id, column2, column3, column4, SUM(column5)
FROM view1
GROUP BY id;
It'll fail with (PostgreSQL 11.18):
ERROR: column "view1.column2" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT id, column2, column3, column4, SUM(column5)
^
SQL state: 42803
Character: 12
Upvotes: 0
Reputation: 109263
The SQL-99 standard introduced the concept of functionally dependent columns. A column is functionally dependent on another column when that other column (or set of columns) already uniquely defines it. So if you have a table with a primary key, then all other columns in that table are functionally dependent on that primary key.
So when using a GROUP BY
, and you include the primary key of a table, then you do not need to include the other columns of that same table in the GROUP BY
-clause as they have already been uniquely identified by the primary key.
This is also documented in GROUP BY
Clause:
When
GROUP BY
is present, or any aggregate functions are present, it is not valid for theSELECT
list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.
(emphasis mine)
Upvotes: 2
Reputation:
This was introduced in 9.1
Quote from the release notes:
Allow non-GROUP BY columns in the query target list when the primary key is specified in the GROUP BY clause (Peter Eisentraut)
The SQL standard allows this behavior, and because of the primary key, the result is unambiguous.
It is also explained with examples in the chapter about group by
:
In this example, the columns product_id, p.name, and p.price must be in the GROUP BY clause since they are referenced in the query select list (but see below). The column s.units does not have to be in the GROUP BY list since it is only used in an aggregate expression (sum(...)), which represents the sales of a product. For each product, the query returns a summary row about all sales of the product.
In a nutshell: if the group by
clause contains a column that uniquely identifies the rows, it is sufficient to include that column only.
Upvotes: 6