jackhammer013
jackhammer013

Reputation: 2297

Postgres COUNT number of column values with INNER JOIN

I am creating a report in Postgres 9.3. This is my SQL Fiddle.
Basically I have two tables, responses and questions, the structure is:

responses
->id
->question_id
->response

questions
->id
->question
->costperlead

for the column response there can only be 3 values, Yes/No/Possbily, and my report should have the columns:

  question_id
, # of Yes Responses
, # of No Responses
, # of Possbily Responses
, Revenue

Then:

# of Yes Responses - count of all Yes values in the response column
# of No Responses - count of all No values in the response column
# of Possbily Responses - count of all 'Possbily' values in the response column

Revenue is the costperlead * (Number of Yes Responses + Number of Possibly Responses).

I don't know how to construct the query, I'm new plus I came from MySQL so some things are different for postgres. In my SQL Fiddle sample most responses are Yes and Null, it's ok eventually, there will be Possibly and No.

So far I have only:

SELECT a.question_id
FROM responses a
INNER JOIN questions b ON a.question_id = b.id
WHERE a.created_at = '2015-07-17'
GROUP BY a.question_id;

Upvotes: 4

Views: 3023

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656804

Since the only predicate filters rows from table responses, it would be most efficient to aggregate responses first, then join to questions:

SELECT *, q.costperlead * (r.ct_yes + r.ct_maybe) AS revenue
FROM  (
   SELECT question_id
        , count(*) FILTER (WHERE response = 'Yes')      AS ct_yes
        , count(*) FILTER (WHERE response = 'No')       AS ct_no
        , count(*) FILTER (WHERE response = 'Possibly') AS ct_maybe
   FROM   responses
   WHERE  created_at = '2015-07-17'
   GROUP  BY 1
   ) r
JOIN   questions q ON q.id = r.question_id;

db<>fiddle here

This uses the aggregate FILTER clause (in Postgres 9.4 or later). See:

Aside: consider implementing response as boolean type with true/false/null.

For Postgres 9.3:

SELECT *, q.costperlead * (r.ct_yes + r.ct_maybe) AS revenue
FROM  (
   SELECT question_id
        , count(response = 'Yes' OR NULL)      AS ct_yes
        , count(response = 'No' OR NULL)       AS ct_no
        , count(response = 'Possibly' OR NULL) AS ct_maybe
   FROM   responses
   WHERE  created_at = '2015-07-17'
   GROUP  BY 1
   ) r
JOIN   questions q ON q.id = r.question_id;

Old sqlfiddle

Comprehensive comparison of techniques:

Upvotes: 3

Renzo
Renzo

Reputation: 27424

You should try:

SELECT a.question_id, 
       SUM(CASE WHEN a.response = 'Yes' THEN 1 ELSE 0 END) AS NumsOfYes, 
       SUM(CASE WHEN a.response = 'No' THEN 1 ELSE 0 END) AS NumsOfNo,
       SUM(CASE WHEN a.response = 'Possibly' THEN 1 ELSE 0 END) AS NumOfPossibly,
       costperlead  * SUM(CASE WHEN a.response = 'Yes' THEN 1 ELSE 0 END) + SUM(CASE WHEN a.response = 'Possibly' THEN 1 ELSE 0 END) AS revenue
FROM responses a 
     INNER JOIN questions b ON a.question_id = b.id 
GROUP BY a.question_id, b.costperlead

Upvotes: 4

Related Questions