Reputation: 14906
I am trying to join 2 tables into one table book
(they have one-to-one relation) and then nest a one-to-many relation ebooks
If I try to nest the one-to-many results it works fine:
select r.*, array_agg(e) e_books from gardner_record r
left join gardner_e_book_record e on r.ean_number = e.physical_edition_ean
group by r.id
This works as expected. I get the results back but there is a few more columns I need to add to gardner_record
from another table with one-to-one relationship called gardner_inventory_item
This is what I have tried:
select book.*, array_agg(e) e_books from (
select r.*,
i.price,
i.discount,
i.free_stock,
i.report,
i.report_date
from gardner_record r
inner join gardner_inventory_item i on r.ean_number = i.ean
) book
left join gardner_e_book_record e on book.ean_number = e.physical_edition_ean
group by book.id
But it doesn't work. The query gives error that modified_date
(a column on gardner_record
) needs to be in group by or aggregated.
Why is this? I make a single table in subquery. Then I left join the nested records as ebooks
and use the array_agg()
in select
. Shouldn't that mean it is all aggregated? It works in first query.
What am I missing conceptually here?
Upvotes: 0
Views: 115
Reputation: 658102
The first query does not produce the same error because (I assume since you did not disclose) id
is the PRIMARY KEY
of gardner_record
, and the PK covers all columns of the same table in the GROUP BY
clause since Postgres 9.1. Related:
In the second query, you have formed the derived table book
before you aggregate. In the outer SELECT
, id
no longer has that property (the derived table book has no PK) and it only covers itself in the GROUP BY
list - and there are also additional columns from gardner_inventory_item
which are not covered at all in the GROUP BY
list.
This would work:
SELECT r.*
, i.price
, i.discount
, i.free_stock
, i.report
, i.report_date
, array_agg(e) AS e_books
FROM gardner_record r
JOIN gardner_inventory_item i ON r.ean_number = i.ean
LEFT JOIN gardner_e_book_record e
GROUP BY r.id, i.ean; -- Assuming these are PKs of each table
Or, assuming we can aggregate on physical_edition_ean
just as well (since you mentioned that 1:1 relationship):
SELECT r.*
, i.price
, i.discount
, i.free_stock
, i.report
, i.report_date
, array_agg(e) AS e_books
FROM gardner_record r
JOIN gardner_inventory_item i ON r.ean_number = i.ean
LEFT JOIN (
SELECT physical_edition_ean, array_agg(e) AS e_books
FROM gardner_e_book_record e
GROUP BY 1
) e ON e.physical_edition_ean = r.ean_number
Simpler and faster.
Upvotes: 1
Reputation: 15691
In general any column that does not appear in an aggregate function must be included in the GROUP BY
clause, however if the selected table has a primary key, then simply including the primary key in the GROUP BY
clause allows all columns of that table to be selected without aggregation, which you make use of in the first query. However this approach does not work when the primary key is wrapped in a subquery since the Postgres optimizer cannot tell that the column was originally a primary key anymore.
To get around this, you could either list out all the columns of the book
subquery explicitly (which might be cumbersome).
Alternatively, you could join and group the gardner_record
and gardner_e_book_record
tables in a subquery and then join the gardner_inventory_item
afterwards. Basically just putting your first query as the subquery and then joining on the gardner_inventory_item
table. Example:
select book_e_ebooks, i.* from (
select r.*, array_agg(e) e_books from gardner_record r
left join gardner_e_book_record e on r.ean_number = e.physical_edition_ean
group by r.id
) book_e_ebooks
left join gardner_inventory_item i on book_e_ebooks.ean_number = i.ean;
But you really don't need a subquery at all:
select r.*, i.*, array_agg(e) e_books from gardner_record r
left join gardner_inventory_item i on r.ean_number = i.ean
left join gardner_e_book_record e on r.ean_number = e.physical_edition_ean
group by r.id, i.id
Upvotes: 1