Reputation: 439
First thing, I'm very new to databases so this is probably a very simple question, but I didn't know how to google it or how to give it a good title. I'm using postgres from python, but the problem is to put together the right query.
The scenario is the following. I hava a table with columns: ID, Trial, Subject, Invalid. It comes from a behavioral experiment where many subjects perform a task that is composed of several trials. Their responses can be invalid for different reasons, and depending on the reason there is a different invalidation code (an integer). A valid response has code 0.
------------------------------ ID | SUBJECT | TRIAL | INVALID ------------------------------ 1 Peter 1 0 2 Peter 2 0 3 Peter 3 1 4 Peter 4 3 5 Mary 1 3 6 Mary 2 2 7 Mary 3 0 8 Mary 4 2
I would like to do two things (which I'm not sure how to do in an elegant way).
a) For each subject, I would like to know how many responses are in total and how many are valid. Now I'm making a query for each subjects, with the condition, e.g., WHERE Subject='Peter', but I can imagine that there is a more elegant solution.
Sample answer:
Subject Valids Total Peter 2 4 Mary 1 4
b) For each subjects, I would like to know how many responses were invalid for each of the invalidation codes. Ideally I would get a table like:
Subject Invalid Count Peter 0 2 Peter 1 1 Peter 2 0 Peter 3 1 Mary 0 1 Mary 1 0 Mary 2 2 Mary 3 1
Upvotes: 0
Views: 695
Reputation: 94939
Query #1: You want one result row per subject, so you group by subject. Use COUNT to count all records for a subject and COUNT in combination with CASE to count conditionally (all valid ones).
select
subject,
count(*) as all_responses,
count(case when invalid = 0 then 1 end) as valid_responses
from mytable
group by subject;
Query #2: Here you want one result row per subject and code, so you group by these two. Then count with COUNT.
select
subject,
invalid,
count(*) as responses
from mytable
group by subject, invalid;
UPDATE: In your updated request you want query #2 to show all subject/code combinations even if they have a count of 0. In order to do this, you'd have to create the set of all valid combinations first and then outer join your response table:
select
s.subject,
c.code,
count(m.invalid) as responses
from subjects s
cross join codes c
left join mytable m on (m.subject = subjects.subject and m.invalid = codes.code)
group by s.subject, c.code;
If you don't have tables for subjects and code (which you should), you can get them from your responses table instead:
select
s.subject,
c.code,
count(m.invalid) as responses
from (select distinct subject from mytable) s
cross join (select distinct invalid as code from mytable) c
left join mytable m on (m.subject = subjects.subject and m.invalid = codes.code)
group by s.subject, c.code;
Upvotes: 2