GeekedOut
GeekedOut

Reputation: 17185

Count query stops working correctly after I add a where clause

I have 3 tables: problems, categories, and a joining table for problem_categories

I am trying to get and count all the categories associated with a problem, and count how many problems are associated with each problem.

Here is my query which doesn't return correct results:

SELECT 
  category_name , 
  categories.category_id , 
  problems.problem_id, 
  COUNT(problems.problem_id) as num_problems 
FROM 
  problem_categories 
left JOIN 
  categories 
on 
  problem_categories.category_id = categories.category_id 
left join 
  problems 
on
  problem_categories.problem_id = problems.problem_id 
WHERE 
  problem_categories.problem_id = 266
GROUP BY 
  problems.problem_id , category_name

but as soon as I take out the where clause, it returns all the categories and correct counts of problems. But the trick is that I need to query this for a particular problem_id

Any way to adjust this query to get it right? What am I doing incorrectly here?

Thanks!!

Upvotes: 1

Views: 183

Answers (4)

Gareth
Gareth

Reputation: 936

Your question is somewhat unclear. How can you count problems if you are selecting a specific one? If you want to count categories associated with a particular problem you do this (I am assuming you have foreign keys and no duplicates in the problem_categories table):

SELECT problem_id, COUNT(category_id) as num_categories
FROM problem_categories
WHERE problem_id = 266
GROUP BY problem_id

then join to problem table if you need to print out more problem information (for efficiency, always get the detail last). If you are wanting to count problem -> category -> problem then do:

select p1.problem_id, (COUNT(DISTINCT(p2.problem_id)) - 1) as num_problems
FROM
(select problem_id, category_id FROM problem_categories where problem_id = 266) as p1
inner join 
(select problem_id, category_id FROM problem_categories) as p2
ON p1.category_id = p2.category_id
GROUP by p1.problem_id

Again if you want to show the problem details, join to the problem table to get them. If you want the num_problems and num_categories together join the first query to the second one before joining to the problem table to get the problem detail. i.e.:

SELECT p1.problem_id, p1.num_categories, p2.num_problems 
FROM
(SELECT problem_id, COUNT(category_id) as num_categories
FROM problem_categories
WHERE problem_id = 266
GROUP BY problem_id) as p1
INNER JOIN
(SELECT p1.problem_id, (COUNT(DISTINCT(p2.problem_id)) - 1) as num_problems
FROM
(SELECT problem_id, category_id FROM problem_categories where problem_id = 266) as p1
INNER JOIN 
(SELECT problem_id, category_id FROM problem_categories) as p2
ON p1.category_id = p2.category_id
GROUP by p1.problem_id) as p2
ON p1.problem_id = p2.problem_id

Upvotes: 1

rauschen
rauschen

Reputation: 3996

SELECT p1.problem_id, problem_categories.category_id , category_name , 
    COUNT(*) as num_problems 
FROM problems p1
LEFT JOIN problem_categories  on
   p1.problem_id = problem_categories.problem_id 
LEFT JOIN categories on 
   problem_categories.category_id = categories.category_id 
LEFT JOIN problems p2 ON
   p2.problem_id = problem_categories.problem_id 
WHERE p1.problem_id = 266
GROUP BY categories.category_id,p2.problem_id
ORDER BY categories.category_id

Thats it

Upvotes: 1

user890904
user890904

Reputation:

you can prepare the categories and the count of problems in them. after that you can use it as part of any query to supply info.

SELECT categories.category_name ,
       categories.category_id,
       (select COUNT(*)
        from problem_categories
        where problem_categories.category_id =categories.category_id
       ) as num_problems
FROM categories 

that can be used in a join (and if you wish stored in a temporary table) to supplement info about the categories:

select problem_categories.problem_id,
       temp_tab.category_name ,
       temp_tab.category_id,
       temp_tab.num_problems            
from problem_categories, 
(
SELECT categories.category_name ,
       categories.category_id,
       (select COUNT(*)
        from problem_categories
        where problem_categories.category_id =categories.category_id
       ) as num_problems
FROM categories 
) temp_tab
where problem_categories.problem_id =266
and problem_categories.category_id = temp_tab.category_id

if you will use this with more than one id or without the id condition, it is better to store the info from the category in a temporary table and join it in order not to repeat the calculations.

Upvotes: 1

Luis
Luis

Reputation: 1294

I think the following should work:

SELECT category_name , categories.category_id, COUNT(*) as num_problems 
FROM problem_categories src JOIN CATEGORIES ON src.category_id = categories.category_id 
    JOIN problem_categories dest ON categories.category_id = dest.category_id 
WHERE problems.problem_id = 266
GROUP BY categories.category_id

What you are doing is for the problem id you want you look up the categories ids and from there the names, and from categories id you look how many (if any) problems each has. The non usage of the problem table is intentional. Also, for this to perform you need to index the problem_categories in BOTH orders.

UPDATE: The Left join can be an inner join, because each category related to problem 266 has at least one problem (problem 266).

Upvotes: 1

Related Questions