Stretch0
Stretch0

Reputation: 9251

Query a percentage of users based on multiple select statements

I want to be able to select all who have answered 'Crunchy' and also answered 'Straight' and return that as a percentage of all users who have answered 'Crunchy'.

The end game is to be able to produce a fact such as '70% of users who answered Crunchy also answered Straight'.

|id |question_id | user_id | answer   |
|-------------------------------------|
|1  |   1        |  81     | Crunchy  |
|2  |   1        |  82     | Crunchy  |
|3  |   1        |  83     | Smooth   |
|4  |   2        |  81     | Straight |
|5  |   2        |  82     | Diagonal |
|6  |   2        |  83     | Diagonal |

Pseudo code would be something like the following:

((Select all unique users where answer === Crunchy && where answer === Straight)

/

(Select all unique users where answer === Crunchy)) 

* 100 = %

So far I have a query to select all users who answered Crunchy and also answered Straight:

SELECT *
    FROM
    ((SELECT *      
        FROM answers
        WHERE (question_id = 1 AND answer = 'Crunchy')) A

    JOIN

    (SELECT *       
        FROM answers
        WHERE (question_id = 3 AND answer = 'Straight')) B

    ON A.user_id = B.user_id ) 
    GROUP BY A.user_id

This will select all users who have selected both answers but I need to have it as a sum so I can divide it by a second query which will be the same as the first but without the join, so essentially this;

SELECT *      
        FROM answers
        WHERE (question_id = 1 AND answer = 'Crunchy'

And then multiply the answer of that by 100.

How do I complete the rest of this SQL query or is there a more simple way to go about this?

Upvotes: 3

Views: 49

Answers (1)

Barmar
Barmar

Reputation: 781340

You can use a LEFT JOIN to correlate them.

SELECT IF(COUNT(*) = 0, 0, COUNT(a2.user_id)/COUNT(*)*100) AS percentage
FROM answers AS a1
LEFT JOIN answers AS a2 ON a1.user_id = a2.user_id 
    AND a2.question_id = 2 AND a2.answer = 'Straight'
WHERE a1.question_id = 1 AND a1.answer = 'Crunchy'

The LEFT JOIN will return NULL for the a2 columns when there's no matching row with answer = 'Straight'. And COUNT(a2.user_id) only counts the non-null values, so this counts all the users who answered Straight. Meanwhile, COUNT(*) counts all the rows, which is all the Crunchy answers.

The IF() protects against division by 0 if there are no Crunchy answers.

Upvotes: 2

Related Questions