jhohman
jhohman

Reputation: 37

Postgres SQL GROUP BY without limiting rows

I am running the following SQL query in Postgres:

SELECT a.id, a.name, array_to_string(array_agg(c.name), ',') AS tags_string, 
CASE d.is_reminder WHEN 't' then 'overdue' ELSE '' END AS overdue
FROM contacts AS a
LEFT OUTER JOIN taggings b ON b.contact_id=a.id
LEFT OUTER JOIN tags c ON b.tag_id=c.id
LEFT OUTER JOIN tasks d ON a.id=d.contact_id
GROUP BY a.id, d.is_reminder
ORDER BY a.id;

Which returns the following records from my db:

  id   |        name        |         tags_string          | overdue 
-------+--------------------+------------------------------+---------
 24471 | Austin Wang        |                              | 
 24472 | Chris Rothstein    | Seller                       | overdue
 24473 | Josh Hohman        | Seller                       | overdue
 24474 | Jay Pritchett      | Friends & Family             | 
 24475 | Claire Dunphy      | Past Client,Friends & Family | 
 24475 | Claire Dunphy      | Past Client,Friends & Family | overdue
 24476 | Haley Dunphy       | Buyer                        | overdue
 24477 | Cameron Tucker     | Friends & Family             | overdue
 24478 | Gloria Pritchett   | Friends & Family             | overdue
 24479 | Mitchell Pritchett | Buyer                        | overdue

I only want to return 1 row per id. In the results above, Claire Dunphy id:24475 appears twice: once with "overdue" and once without. If an id has an "overdue" then I want this record to appear.

If the contact does not have an overdue task, then I still want to display record. I just want to eliminate showing the non-overdue tasks (if any) for contacts that also have an overdue task.

In the results above, this means that I would show the second Claire Dunphy record, but not the first.

Any help is greatly appreciated!

Upvotes: 1

Views: 109

Answers (2)

peterm
peterm

Reputation: 92805

Try

SELECT a.id, 
       a.name, 
       array_to_string(array_agg(c.name), ',') tags_string, 
       CASE WHEN strpos(array_to_string(array_agg(d.is_reminder), ','), 't') > 0 THEN 'overdue' ELSE '' END overdue
  FROM contacts a LEFT JOIN taggings b 
    ON b.contact_id = a.id LEFT JOIN tags c 
    ON b.tag_id = c.id LEFT JOIN tasks d 
    ON a.id = d.contact_id
 GROUP BY a.id 

Here is SQLFiddle demo that shows the idea.

It obviously based on preaggregated sample data that you posted, but nevertheless shows how to get detect existence of overdue row for aggregated column.

Upvotes: 2

Hart CO
Hart CO

Reputation: 34784

You need an aggregate, MAX() will pick something over nothing, so it works in this case:

SELECT a.id, a.name, array_to_string(array_agg(c.name), ',') AS tags_string, 
MAX(CASE d.is_reminder WHEN 't' then 'overdue' ELSE '' END) AS overdue
FROM contacts AS a
LEFT OUTER JOIN taggings b ON b.contact_id=a.id
LEFT OUTER JOIN tags c ON b.tag_id=c.id
LEFT OUTER JOIN tasks d ON a.id=d.contact_id
GROUP BY a.id, a.name, array_to_string(array_agg(c.name), ',')
ORDER BY a.id

Upvotes: 0

Related Questions