Reputation: 883
I have an article views table (article_views) which holds the number of article views for each day. A new record is created to hold the count for each separate day for each article.
The query below gets the article id and total views for the five most-viewed articles of all time:
SELECT article_id,
SUM(article_count) as cnt
FROM article_views
GROUP BY article_id
ORDER BY cnt DESC
LIMIT 5
I also have a separate articles table which holds all the article fields. I want to amend the query above to join to the articles table and get two fields for each article id. I have tried to do this below, but the count is coming back incorrectly:
SELECT article_views.article_id, SUM( article_views.article_count ) AS cnt, articles.article_title, articles.artcile_url
FROM article_views
INNER JOIN articles ON articles.article_id = article_views.article_id
GROUP BY article_views.article_id
ORDER BY cnt DESC
LIMIT 5
I'm not sure exactly what I'm doing wrong. Do I need to use a subquery?
Upvotes: 10
Views: 39569
Reputation: 108400
Your query looks basically right to me...
But the value returned for cnt is going to depend on the article_id column being UNIQUE in the articles table. (We'd assume that it's the primary key, but absent a schema definition, that's only an assumption.)
Also, we're likely to assume there's a foreign key between the tables, that is, that there are no values of article_id in the article_views table which don't match a value of article_id on a row from the articles table.
To check for "orphan" article_id values, run a query like:
SELECT v.article_id
FROM article_views v
LEFT JOIN articles a
ON a.article_id = v.article_id
WHERE a.article_id IS NULL
To check for "duplicate" article_id values in articles, run a query like:
SELECT a.article_id
FROM articles a
GROUP BY a.article_id
HAVING COUNT(1) > 1
If either of those queries returns rows, that could be an explanation for the behavior you observe.
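To illustrate why either condition would distort the result, here is a minimal sketch that runs the join against an in-memory SQLite database from Python. The sample rows are invented for illustration, and SQLite stands in for MySQL, so treat it as a demonstration of the join arithmetic rather than of MySQL-specific behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article_views (article_id INTEGER, article_count INTEGER);
    CREATE TABLE articles (article_id INTEGER, article_title TEXT, artcile_url TEXT);
    -- Hypothetical data: article 1 has 10 + 5 views, article 2 has 7,
    -- and article 3 has views but no row in articles (an "orphan").
    INSERT INTO article_views VALUES (1, 10), (1, 5), (2, 7), (3, 4);
    -- article_id 1 appears twice in articles (a "duplicate").
    INSERT INTO articles VALUES (1, 'A', '/a'), (1, 'A', '/a'), (2, 'B', '/b');
""")

# Totals straight from article_views, with no join involved.
plain_totals = dict(conn.execute(
    "SELECT article_id, SUM(article_count) FROM article_views GROUP BY article_id"
))

# Totals after the inner join: each view row is repeated once per matching
# articles row, and view rows without a match are dropped.
joined_totals = dict(conn.execute("""
    SELECT v.article_id, SUM(v.article_count)
    FROM article_views v
    INNER JOIN articles a ON a.article_id = v.article_id
    GROUP BY v.article_id
"""))

print("without join:", plain_totals)
print("with join:   ", joined_totals)
```

In this sample the total for article 1 doubles once the join is added, because every view row matches two articles rows, and article 3 vanishes because the inner join finds no match for it.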
Upvotes: 0
Reputation: 79929
Add articles.article_title and articles.artcile_url to the GROUP BY clause:
SELECT
article_views.article_id,
articles.article_title,
articles.artcile_url,
SUM( article_views.article_count ) AS cnt
FROM article_views
INNER JOIN articles ON articles.article_id = article_views.article_id
GROUP BY article_views.article_id,
articles.article_title,
articles.artcile_url
ORDER BY cnt DESC
LIMIT 5;
The reason you were not getting the correct result set is that when you select columns that are neither listed in the GROUP BY clause nor wrapped in an aggregate function, MySQL picks an arbitrary value for them from each group.
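As a quick check of the fully grouped query, the sketch below runs it against an in-memory SQLite database from Python. The rows are made up, and it assumes article_id is unique in articles; SQLite is used only because it ships with Python, so this does not reproduce MySQL's handling of ungrouped columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumption: article_id is the primary key of articles.
    CREATE TABLE articles (article_id INTEGER PRIMARY KEY, article_title TEXT, artcile_url TEXT);
    CREATE TABLE article_views (article_id INTEGER, article_count INTEGER);
    INSERT INTO articles VALUES (1, 'First', '/first'), (2, 'Second', '/second');
    INSERT INTO article_views VALUES (1, 10), (1, 5), (2, 20);
""")

# Every non-aggregated column in the SELECT list also appears in GROUP BY,
# so each output row is fully determined by its group.
rows = conn.execute("""
    SELECT v.article_id, a.article_title, a.artcile_url, SUM(v.article_count) AS cnt
    FROM article_views v
    INNER JOIN articles a ON a.article_id = v.article_id
    GROUP BY v.article_id, a.article_title, a.artcile_url
    ORDER BY cnt DESC
    LIMIT 5
""").fetchall()

for row in rows:
    print(row)
```

With unique article_id values, grouping by the extra columns does not split any group, so the totals match those from article_views alone.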
Upvotes: 16
Reputation: 1269743
You are using a MySQL (mis)feature called Hidden Columns, because the article title is not in the GROUP BY. However, this may or may not be causing your problem.
If the counts are wrong, then I think you have duplicate article_id values in the articles table. You can check this by doing:
select article_id, count(*) as cnt
from articles
group by article_id
having cnt > 1
If any appear, then that is your problem. If they all have different titles, then grouping by the title (as suggested by Mahmoud) would fix the problem.
If not, one way to fix it is the following:
SELECT article_views.article_id, SUM( article_views.article_count ) AS cnt, articles.article_title, articles.artcile_url
FROM article_views INNER JOIN
(select a.* from articles a group by a.article_id) articles
ON articles.article_id = article_views.article_id
GROUP BY article_views.article_id
ORDER BY cnt DESC
LIMIT 5
This chooses an arbitrary title for the article.
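A variation on the same derived-table idea, sketched here in Python against an in-memory SQLite database: instead of relying on hidden columns, the subquery uses MIN() to reduce each article_id to a single row before joining. This is a swapped-in alternative, not the query above, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (article_id INTEGER, article_title TEXT, artcile_url TEXT);
    CREATE TABLE article_views (article_id INTEGER, article_count INTEGER);
    -- Hypothetical data: article_id 1 is duplicated with a different title and URL.
    INSERT INTO articles VALUES
        (1, 'First', '/first'),
        (1, 'First (copy)', '/first-copy'),
        (2, 'Second', '/second');
    INSERT INTO article_views VALUES (1, 10), (1, 5), (2, 7);
""")

# The derived table collapses articles to one row per article_id, so each
# view row matches exactly one article and the SUM is no longer inflated.
# Note: MIN() is applied per column, so the title and URL it keeps are not
# guaranteed to come from the same original row.
deduped_rows = conn.execute("""
    SELECT v.article_id, SUM(v.article_count) AS cnt, a.article_title, a.artcile_url
    FROM article_views v
    INNER JOIN (
        SELECT article_id,
               MIN(article_title) AS article_title,
               MIN(artcile_url)   AS artcile_url
        FROM articles
        GROUP BY article_id
    ) a ON a.article_id = v.article_id
    GROUP BY v.article_id, a.article_title, a.artcile_url
    ORDER BY cnt DESC
    LIMIT 5
""").fetchall()

for row in deduped_rows:
    print(row)
```

If duplicates do exist, removing them from articles and adding a unique or primary key on article_id is probably the more durable fix; the derived table only works around them at query time.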
Upvotes: 3