Reputation: 4336
One Quiz
can have many Submissions
. I want to fetch all Quizzes
that have at least one associated Submission
with submissions.correct = t
and at least one associated Submission
with submissions.correct = f
.
How can I fix the following query and the WHERE statement in particular to make that happen:
SELECT quizzes.*,
Count(submissions.id) AS submissions_count
FROM "quizzes"
INNER JOIN "submissions"
ON "submissions"."quiz_id" = "quizzes"."id"
WHERE ( submissions.correct = 'f' )
AND ( submissions.correct = 't' )
GROUP BY quizzes.id
ORDER BY submissions_count ASC
Update:
Here is the missing information:
I need all row data from Quizzes. I only need the count for ordering within the query (the quizzes with the least amount of submissions first).
k-voc_development=# \d quizzes;
Table "public.quizzes"
Column | Type | Modifiers
------------+-----------------------------+------------------------------------------------------
id | integer | not null default nextval('quizzes_id_seq'::regclass)
question | character varying | not null
created_at | timestamp without time zone | not null
updated_at | timestamp without time zone | not null
Indexes:
"quizzes_pkey" PRIMARY KEY, btree (id)
Referenced by:
TABLE "submissions" CONSTRAINT "fk_rails_04e433a811" FOREIGN KEY (quiz_id) REFERENCES quizzes(id)
TABLE "answers" CONSTRAINT "fk_rails_431b8a33a3" FOREIGN KEY (quiz_id) REFERENCES quizzes(id)
k-voc_development=# \d submissions;
Table "public.submissions"
Column | Type | Modifiers
------------+-----------------------------+----------------------------------------------------------
id | integer | not null default nextval('submissions_id_seq'::regclass)
quiz_id | integer | not null
correct | boolean | not null
created_at | timestamp without time zone | not null
updated_at | timestamp without time zone | not null
Indexes:
"submissions_pkey" PRIMARY KEY, btree (id)
"index_submissions_on_quiz_id" btree (quiz_id)
Foreign-key constraints:
"fk_rails_04e433a811" FOREIGN KEY (quiz_id) REFERENCES quizzes(id)
k-voc_development=#
Upvotes: 2
Views: 1911
Reputation: 658092
The best solution depends on the details of your implementation, data distribution and requirements.
If you have a typical installation with referential integrity (FK constraint) and defined submissions.correct
as boolean NOT NULL
and just want the quiz_id
along with a count of total submissions, then you don't need to join in quizzes
at all and this should be fastest:
SELECT quiz_id, count(*) AS ct
FROM submissions
-- WHERE correct IS NOT NULL -- only relevant if correct can be NULL
GROUP BY 1
HAVING bool_or(correct)
AND bool_or(NOT correct);
The dedicated aggregate function bool_or()
is particularly useful for your test with boolean values. Simpler and faster than a CASE
expression or similar constructs.
There are many other techniques, the best solution depends on the missing information.
I need all row data from
quizzes
. I only need the count for ordering within the query (the quizzes with the least amount of submissions first).
This should be fastest if lots of quizzes qualify (high percentage of total).
SELECT q.*
FROM (
SELECT quiz_id, count(*) AS ct
FROM submissions
GROUP BY 1
HAVING count(*) > count(correct OR NULL)
) s
JOIN quizzes q ON q.id = s.quiz_id
ORDER BY s.ct;
count(*) > count(correct OR NULL)
works because correct
is boolean NOT NULL
. Should be slightly faster than the variant above for few submissions per quiz.
Upvotes: 2
Reputation: 44250
-- I want to fetch all Quizzes
SELECT * FROM quizzes q
WHERE EXISTS ( -- that have at least one associated Submission with submissions.correct = t
SELECT * FROM submissions s
WHERE s.quiz_id = q.id AND s.correct = 't'
)
AND EXISTS ( -- and at least one associated Submission with submissions.correct = f.
SELECT * FROM submissions s
WHERE s.quiz_id = q.id AND s.correct = 'f'
);
Upvotes: 2
Reputation: 93754
Move your where
clause to Having
Clause with Conditional Count
aggregate
SELECT quizzes.*,
Count(submissions.id) AS submissions_count
FROM "quizzes"
INNER JOIN "submissions"
ON "submissions"."quiz_id" = "quizzes"."id"
GROUP BY quizzes.id
HAVING Count(CASE WHEN submissions.correct = 'f' THEN 1 END) >= 1
and Count(CASE WHEN submissions.correct = 't' THEN 1 END) >= 1
ORDER BY submissions_count ASC
Upvotes: 1
Reputation: 44796
If there are no other submissions.correct values than t and f, then this will work:
SELECT quizzes.*,
Count(submissions.id) AS submissions_count
FROM "quizzes"
INNER JOIN "submissions"
ON "submissions"."quiz_id" = "quizzes"."id"
GROUP BY quizzes.id
HAVING COUNT(DISTINCT submissions.correct) >= 2
ORDER BY submissions_count ASC
Upvotes: 1