Reputation: 13
I'm having trouble with my results of my count function, i think its due to the count(a1_journal.publisher_id) >=3 however instead of removing the results that are <3 from display it just gives them a null value.
Is there a way to either remove the null values from display (i tried a where col IS NOT NULL but it didn't work properly) or stop them from being selected in the count?
SELECT publisher_name, (select count(a1_journal.publisher_id)
FROM a1_journal
WHERE a1_journal.publisher_id=a1_publisher.publisher_id
HAVING count(a1_journal.publisher_id) >=3)
AS Number_of_Journals
FROM a1_publisher
ORDER by Number_of_Journals DESC;
Thanks
Upvotes: 1
Views: 2883
Reputation: 8239
The problem is that you return all rows in table a1_publisher. Try this instead.
select j.publisher_id, count(j.publisher_id)
FROM a1_journal j inner join a1_publisher p ON j.publisher_id=p.publisher_id
GROUP BY j.publisher_id
HAVING count(j.publisher_id) >=3
ORDER BY count(j.publisher_id) DESC
UPDATE:
To select publisher's name there're 2 ways.
If publisher's name is unique you can add the column to group by like this
select j.publisher_id,p.publisher_name, count(j.publisher_id)
FROM a1_journal j
inner join a1_publisher p ON j.publisher_id=p.publisher_id
GROUP BY j.publisher_id, p.publisher_name
HAVING count(j.publisher_id) >=3
ORDER BY count(j.publisher_id) DESC
If it's not unique, you should have another join with a1_publisher like this.
SELECT aj.publisher_id, aj.numberOfJournals, ap.publisher_name
FROM a1_publisher ap
INNER JOIN (
SELECT j.publisher_id, count(j.publisher_id) numberOfJournals
FROM a1_journal j
inner join a1_publisher p ON j.publisher_id=p.publisher_id
GROUP BY j.publisher_id
HAVING count(j.publisher_id) >=3 ) aj
ON ap.publisher_id = ap.publisher_id
ORDER BY count(j.publisher_id) DESC
Upvotes: 3
Reputation: 3957
Your query, in the way you've made it doesn't care about the COUNT inside the subselect. It'll be processed for every publisher_name no matter what. It is also inefficient in terms of how it's processing. Performance if the data sets are very large would most likely be very noticeable.
You have several ways to fix this. One way is to rewrite to a join and use the COUNT() function to aggregate the number of journals, like this:
SELECT
a1_publisher.publisher_name,
COUNT(a1_journal.publisher_id) AS Number_of_Journals
FROM
a1_publisher
INNER JOIN a1_publisher
ON a1_journal.publisher_id = a1_publisher.publisher_id
GROUP BY
a1_publisher.publisher_name
HAVING
COUNT(a1_journal.publisher_id) >= 3 -- Here is your filter.
ORDER BY
COUNT(a1_journal.publisher_id) DESC, a1_publisher.publisher_name ;
Now, if you want to find the publishers without any journals, you can change it slightly:
SELECT
a1_publisher.publisher_name
FROM
a1_publisher
LEFT JOIN a1_publisher
ON a1_journal.publisher_id = a1_publisher.publisher_id
WHERE
a1_journal.pubhslier_id IS NULL
ORDER BY
a1_publisher.publisher_name
For performance reasons, have an index on a1_journal.publisher_id, and one on a1_publisher.publisher_id (though i suspect this is already the primary key)
Upvotes: 0