sameers
sameers

Reputation: 5095

Postgres SQL group by query that joins and filters on grouping column

The underlying problem is that I have vendor invoices, where each vendor belongs to an organization, and each organization has a set of "excluded vendor roles."

I want to sum and group the invoice charges by vendor, WITHOUT including the charges of vendors whose roles are "excluded."

So if we have the following tables:

excluded roles

|role_id|org_id|
|      1|     1|
|      2|     2|

vendors

|vendor_id|org_id|
|     v1.1|     1|
|     v1.2|     1|
|     v2.1|     2|
|     v2.2|     2|

and this table for charges:

|vendor_id|charge|role|
|     v1.1|     3|   1|
|     v1.1|     9|   2|
|     v1.2|     4|   1|
|     v1.2|    10|   2|
|     v2.1|     5|   1|
|     v2.2|     6|   2|
|     v2.1|     7|   1|

The result should be

| group_by(vendor_id)| sum(charges) |
|                v1.1|            9 |
|                v1.2|           10 |
|                v2.1|           12 |

Can I do this in Postgres without writing a function?

Upvotes: 0

Views: 52

Answers (2)

Mafor
Mafor

Reputation: 10661

Here is another approach, without subquery:

SELECT v.vendor_id, sum(c.charge)
FROM charges c
LEFT JOIN vendors v ON v.vendor_id = c.vendor_id
LEFT JOIN excluded_roles r ON r.role_id = c.role AND r.org_id = v.org_id
WHERE r.role_id IS NULL
GROUP BY v.vendor_id;

Step by step:

-- Get vendors with their roles and org_ids
SELECT *
FROM charges c
LEFT JOIN vendors v ON v.vendor_id = c.vendor_id;
|vendor_id|charge|role|vendor_id|org_id|
|---------|------|----|---------|------|
|v1_1     |3     |1   |v1_1     |1     |
|v1_1     |9     |2   |v1_1     |1     |
|v1_2     |4     |1   |v1_2     |1     |
|v1_2     |10    |2   |v1_2     |1     |
|v2_1     |5     |1   |v2_1     |2     |
|v2_1     |7     |1   |v2_1     |2     |
|v2_2     |6     |2   |v2_2     |2     |
-- Join with the excluded_roles
SELECT v.vendor_id, sum(c.charge)
FROM charges c
LEFT JOIN vendors v ON v.vendor_id = c.vendor_id
LEFT JOIN excluded_roles r ON r.role_id = c.role AND r.org_id = v.org_id;
|vendor_id|charge|role|vendor_id|org_id|role_id|org_id|
|---------|------|----|---------|------|-------|------|
|v1_2     |4     |1   |v1_2     |1     |1      |1     |
|v1_1     |3     |1   |v1_1     |1     |1      |1     |
|v2_1     |5     |1   |v2_1     |2     |       |      |
|v2_1     |7     |1   |v2_1     |2     |       |      |
|v1_2     |10    |2   |v1_2     |1     |       |      |
|v1_1     |9     |2   |v1_1     |1     |       |      |
|v2_2     |6     |2   |v2_2     |2     |2      |2     |
-- Remove the roles existing in the excluded_roles
SELECT *
FROM charges c
LEFT JOIN vendors v ON v.vendor_id = c.vendor_id
LEFT JOIN excluded_roles r ON r.role_id = c.role AND r.org_id = v.org_id
WHERE r.role_id IS NULL;
|vendor_id|charge|role|vendor_id|org_id|role_id|org_id|
|---------|------|----|---------|------|-------|------|
|v2_1     |5     |1   |v2_1     |2     |       |      |
|v2_1     |7     |1   |v2_1     |2     |       |      |
|v1_2     |10    |2   |v1_2     |1     |       |      |
|v1_1     |9     |2   |v1_1     |1     |       |      |
-- Aggregate
SELECT v.vendor_id, sum(c.charge)
FROM charges c
LEFT JOIN vendors v ON v.vendor_id = c.vendor_id
LEFT JOIN excluded_roles r ON r.role_id = c.role AND r.org_id = v.org_id
WHERE r.role_id IS NULL
GROUP BY v.vendor_id;
|vendor_id|sum|
|---------|---|
|v2_1     |12 |
|v1_2     |10 |
|v1_1     |9  |

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269483

If I understand correctly, you basically want not exists:

select vendor_id, sum(c.charges)
from charges c join
     vendors v
     using (vendor_id)
where not exists (select 1
                  from excluded e
                  where e.role_id = c.role and e.org_id = v.org_id
                 )
group by vendor_id;

In your data, vendors can only be in one org. If vendors can be in multiple orgs, then moving the join to the subquery is better:

select vendor_id, sum(c.charges)
from charges c
where not exists (select 1
                  from excluded e join
                       vendors v
                       using (ord_id)
                  where e.role_id = c.role and
                        v.vendor_id = c.vendor_id
                 )
group by vendor_id;

Upvotes: 1

Related Questions