Reputation: 121
I have two tables as follows:
TABLE A
| id | col_a | col_b | user_id |
--------------------------------
| 1 | false | true | 1 |
| 2 | false | true | 2 |
| 3 | true | true | 2 |
| 4 | true | true | 3 |
| 5 | true | false | 1 |
TABLE B
| id | name |
--------------
| 1 | Bob |
| 2 | Jim |
| 3 | Helen |
| 4 | Michael|
| 5 | Jen |
I want to get the sum of two counts, which are the number of true
values in col_a
and number of true
values in col_b
. I want to group that data by user_id
. I also want to join Table B and get the name of each user. The result would look like this:
|user_id|total (col_a + col_b)|name
------------------------------------
| 1 | 2 | Bob
| 2 | 3 | Jim
| 3 | 2 | Helen
So far I got the total sum with the following query:
SELECT
(SELECT COUNT(*) FROM "TABLE_A" WHERE "col_a" is true)+
(SELECT COUNT(*) FROM "TABLE_A" WHERE "col_b" is true)
as total
However, I'm not sure how to proceed with grouping these counts by user_id.
Upvotes: 1
Views: 1049
Reputation: 656391
Something like this is typically fastest:
SELECT *
FROM "TABLE_B" b
JOIN (
SELECT user_id AS id
, count(*) FILTER (WHERE col_a)
+ count(*) FILTER (WHERE col_b) AS total
FROM "TABLE_A"
GROUP BY 1
) a USING (id);
While fetching all rows, aggregate first, join later. That's cheaper. See:
The aggregate FILTER
clause is typically fastest. See:
Often, you want to keep total counts of 0 in the result. You did say:
get the name of each user.
SELECT b.id AS user_id, b.name, COALESCE(a.total, 0) AS total
FROM "TABLE_B" b
LEFT JOIN (
SELECT user_id AS id
, count(col_a OR NULL)
+ count(col_b OR NULL) AS total
FROM "TABLE_A"
GROUP BY 1
) a USING (id);
...
count(col_a OR NULL)
is an equivalent alternative, shortest, and still fast. (Use the FILTER
clause from above for best performance.)
The LEFT JOIN
keeps all rows from "TABLE_B"
in the result.
COALESCE()
return 0
instead of NULL
for the total count.
If col_a
and col_b
have only few true
values, this is typically (much) faster - basically what you had already:
SELECT b.*, COALESCE(aa.ct, 0) + COALESCE(ab.ct, 0) AS total
FROM "TABLE_B" b
LEFT JOIN (
SELECT user_id AS id, count(*) AS ct
FROM "TABLE_A"
WHERE col_a
GROUP BY 1
) aa USING (id)
LEFT JOIN (
SELECT user_id AS id, count(*) AS ct
FROM "TABLE_A"
WHERE col_b
GROUP BY 1
) ab USING (id);
Especially with (small in this case!) partial indexes like:
CREATE INDEX a_true_idx on "TABLE_A" (user_id) WHERE col_a;
CREATE INDEX b_true_idx on "TABLE_A" (user_id) WHERE col_b;
Aside: use legal, lower-case unquoted names in Postgres to make your like simpler.
Upvotes: 1
Reputation: 7403
You are double counting JIM, if that is not supposed since it only shows up in two rows and not three, maybe you can do the following:
with cte_A as (
select col_a as col, user_id
from A
where col_a=true
union -- ALL -- (if you want to double count Jim)
select col_b as col, user_id
from A
where col_b=true
)
select B.user_id, sum(*) as total, B.name
from cte_A
join B
on cte_A.user_id = B.user_id
group by B.user_id
If you want to actually double count then use the UNION ALL instead of UNION
Upvotes: 0
Reputation: 24568
select user_id,name
, count(case when col_a = true then 1 end)
+ count(case when col_b = true then 1 end) total
from tableA a
join TableB b on a.user_id= b.id
group by user_id,name
Upvotes: 1