Reputation: 606
I'm making a query with having multiple non aggregated columns with group by clause but Postgres is throwing an error that I have to add non aggregated columns in group by or use any aggregate function on that column this is the query that I'm trying to run.
select
tb1.pipeline as pipeline_id,
tb3.pipeline_name as pipeline_name,
tb2."name" as integration_name,
cast(tb1.integration_id as VARCHAR) as integration_id,
tb1.created_at as created_at,
cast(tb1.id as VARCHAR) as batch_id,
sum(tb1.row_select) as row_select,
sum(tb1.row_insert) as row_insert,
from
table1 tb1
join
table2 tb2 on tb1.integration_id = tb2.id
join
table3 tb3 on tb1.pipeline = tb3.id
where
tb1.pipeline is not null
and tb1.is_super_parent = false
group by
tb1.pipeline
and I found one solution/hack for this error that is I added max function in all other non aggregated columns this solves my problem.
select
tb1.pipeline as pipeline_id,
max(tb3.pipeline_name) as pipeline_name,
max(tb2."name") as integration_name,
max(cast(tb1.integration_id as VARCHAR)) as integration_id,
max(tb1.created_at) as created_at,
max(cast(tb1.id as VARCHAR)) as batch_id,
sum(tb1.row_select) as row_select,
sum(tb1.row_insert) as row_insert,
from
table1 tb1
join
table2 tb2 on tb1.integration_id = tb2.id
join
table3 tb3 on tb1.pipeline = tb3.id
where
tb1.pipeline is not null
and tb1.is_super_parent = false
group by
tb1.pipeline
But I don't want to add max functions when there is no need for that second thing is that applying max to all other column query will be expensive so any other better approach that I can do to solve the above issue, thanks in advance.
Upvotes: 0
Views: 1013
Reputation: 14861
Well the first thing you need is to learn to format your queries in so as to get an idea of their flow at a glance. Note due to the extra comma in row_insert, from
your query will give a syntax error. With that said; How do you solve your issue?
You cannot avoid the additional aggregates or the expanded group by as long as the exist in the scope same query. You need to separate the aggregation from selection of additional columns. You basically have 2 choices:
with sums (pipeline_id, row_select, row_insert) as
( select tb1.pipeline
, sum(tb1.row_select) as row_select
, sum(tb1.row_insert) as row_insert
table1 tb1
where tb1.pipeline is not null
and tb1.is_super_parent = false
group by tb1.pipeline
)
select s.pipeline_id
, tbl3.pipeline_name
, tb2."name" integration_name
, s.row_select
, s.row_insert
from sums s
join table2 tbl2 on (s.pipeline_id = tb2.id)
join table3 tbl3 on (s.pipeline_id = tb3.id);
select s.pipeline_id
, tbl3.pipeline_name
, tb2."name" integration_name
, s.row_select
, s.row_insert
from ( select tb1.pipeline
, sum(tb1.row_select) as row_select
, sum(tb1.row_insert) as row_insert
table1 tb1
where tb1.pipeline is not null
and tb1.is_super_parent = false
group by tb1.pipeline
) s
join table2 tbl2 on (s.pipeline_id = tb2.id)
join table3 tbl3 on (s.pipeline_id = tb3.id);
NOTE: Not tested as no sample data supplied.
Upvotes: 1