eaponte
eaponte

Reputation: 439

How to count entries in sql "like in a loop"

First thing, I'm very new to databases so this is probably a very simple question, but I didn't know how to google it or how to give it a good title. I'm using postgres from python, but the problem is to put together the right query.

The scenario is the following. I hava a table with columns: ID, Trial, Subject, Invalid. It comes from a behavioral experiment where many subjects perform a task that is composed of several trials. Their responses can be invalid for different reasons, and depending on the reason there is a different invalidation code (an integer). A valid response has code 0.

------------------------------
ID | SUBJECT | TRIAL | INVALID  
------------------------------
1    Peter     1       0  
2    Peter     2       0  
3    Peter     3       1  
4    Peter     4       3
5    Mary      1       3  
6    Mary      2       2  
7    Mary      3       0  
8    Mary      4       2

I would like to do two things (which I'm not sure how to do in an elegant way).

a) For each subject, I would like to know how many responses are in total and how many are valid. Now I'm making a query for each subjects, with the condition, e.g., WHERE Subject='Peter', but I can imagine that there is a more elegant solution.

Sample answer:

Subject Valids Total
Peter   2      4
Mary    1      4

b) For each subjects, I would like to know how many responses were invalid for each of the invalidation codes. Ideally I would get a table like:

Subject Invalid Count
Peter   0       2
Peter   1       1
Peter   2       0
Peter   3       1
Mary    0       1
Mary    1       0
Mary    2       2
Mary    3       1

Upvotes: 0

Views: 695

Answers (1)

Thorsten Kettner
Thorsten Kettner

Reputation: 94939

Query #1: You want one result row per subject, so you group by subject. Use COUNT to count all records for a subject and COUNT in combination with CASE to count conditionally (all valid ones).

select 
  subject, 
  count(*) as all_responses, 
  count(case when invalid = 0 then 1 end) as valid_responses
from mytable
group by subject;

Query #2: Here you want one result row per subject and code, so you group by these two. Then count with COUNT.

select 
  subject, 
  invalid,
  count(*) as responses
from mytable
group by subject, invalid;

UPDATE: In your updated request you want query #2 to show all subject/code combinations even if they have a count of 0. In order to do this, you'd have to create the set of all valid combinations first and then outer join your response table:

select 
  s.subject, 
  c.code,
  count(m.invalid) as responses
from subjects s
cross join codes c
left join mytable m on (m.subject = subjects.subject and m.invalid = codes.code)
group by s.subject, c.code;

If you don't have tables for subjects and code (which you should), you can get them from your responses table instead:

select 
  s.subject, 
  c.code,
  count(m.invalid) as responses
from (select distinct subject from mytable) s
cross join (select distinct invalid as code from mytable) c
left join mytable m on (m.subject = subjects.subject and m.invalid = codes.code)
group by s.subject, c.code;

Upvotes: 2

Related Questions