Reputation: 16017
I have 2 tables - comments and ratings. The comments table contains a column reply
which indicates whether a comment is a reply to another comment. The ratings table contains ratings for the comments in the form of comment_id, user_id, rating
When I am selecting comments to display it is a bit complex so I'll try to simplify as much as I can
SELECT
COALESCE(SUM(cr.vote), 0) AS rating,
COUNT(r.id) AS replies
FROM comments c
LEFT JOIN comments_ratings cr ON c.id = cr.comment
LEFT JOIN comments r ON c.id = r.reply
WHERE c.id = 1
GROUP BY c.id;
Here is the testing setup
CREATE TABLE `comments` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`text` text NOT NULL,
`author` int(10) unsigned NOT NULL,
`time` datetime DEFAULT NULL,
`reply` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `reply` (`reply`),
CONSTRAINT `comments_ibfk_1` FOREIGN KEY (`reply`) REFERENCES `comments` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8
CREATE TABLE `comments_ratings` (
`comment` int(10) unsigned NOT NULL,
`user` int(10) unsigned NOT NULL,
`vote` tinyint(4) NOT NULL,
PRIMARY KEY (`comment`,`user`),
KEY `user` (`user`),
CONSTRAINT `comments_ratings_ibfk_1` FOREIGN KEY (`comment`) REFERENCES `comments` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION,
-- CONSTRAINT `comments_ratings_ibfk_2` FOREIGN KEY (`user`) REFERENCES `users` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8
INSERT INTO comments (id, reply, text, author) VALUES (1, null, '', 0), (null, 1, '', 0),(null, 1, '', 0),(null, 1, '', 0);
INSERT INTO comments_ratings (comment, user, vote) VALUES (1, 1, 1);
Now if you execute the select statement you will see rating
becomes 3 even though there is only 1 record in comments_ratings
with value 1. If I add another reply it will become 4. If you add another comments_ratings
record with value 1 it will double and become 8. That is because each row from the joins is copying information in fields it doesn't have.
Can you help me set up the join on r
so that it doesn't double the rating and replies.
Upvotes: 0
Views: 86
Reputation: 16017
Update: Even though both answers were correct, I have currently been testing this set up with large volumes of data and the performance is next to awful. After a short investigation I determined the reason - basically the suggested solutions create a temporary table in memory in which all data from the table is filled on each query, as the amount of data increases this time also increases and on a fairly weak server that I'm running I get a query time of above 5 seconds for a couple thousand rows.
I have come up with a solution for that problem, it still uses temporary tables but instead of copying the entire table into the temp one it only copies the range of records that are being selected, here it is:
SELECT
c.*,
COUNT(r.id) AS replies
FROM
(
SELECT
c.id,
c.text,
c.time,
c.author AS author_id,
SUM(cr.vote) AS rating,
crv.vote AS voted
FROM
comments c
LEFT JOIN users u ON u.id = c.author
LEFT JOIN comments_ratings cr ON cr. COMMENT = c.id
LEFT JOIN comments_ratings crv ON crv. COMMENT = c.id
AND crv. USER = ?
WHERE
c.item = ?
AND c.type = ?
AND c.id < ?
GROUP BY
c.id
ORDER BY
c.id DESC
LIMIT 0,
100
) AS c
LEFT JOIN comments r ON c.id = r.reply
GROUP BY
c.id
ORDER BY
c.id DESC
I tested this method with 4+ million records in the table and the queries were executed in less than 10 milliseconds on a pretty weak server machine.
Upvotes: 0
Reputation: 16958
When you have some LEFT JOIN
s from some sub-tables to one super-table you should remember that rows of your super-table will be repeated by both of sub-tables, So you should change your query to something like this:
SELECT
COALESCE(SUM(cr.vote), 0) AS rating,
COALESCE(SUM(r.cnt), 0) AS replies
FROM
comments c
LEFT JOIN
(SELECT
cri.comment,
SUM(cri.vote) As vote
FROM
comments_ratings cri
GROUP BY
cri.comment
)cr ON c.id = cr.comment
LEFT JOIN
(SELECT
ci.reply,
COUNT(ci.id) cnt
FROM
comments ci
GROUP BY
ci.reply
) AS r ON c.id = r.reply
WHERE
c.id = 1
GROUP BY
c.id;
Upvotes: 1
Reputation: 1269503
One method is to pre-aggregate the data before the join. Something like this:
FROM comments c LEFT JOIN
(SELECT cr.comment, SUM(cr.vote) as vote
FROM comments_ratings cr
GROUP BY cr.comment
) cr
ON c.id = cr.comment LEFT JOIN
comments r
ON c.id = r.reply
You might also want to include filtering conditions in the subquery, for performance reasons.
Upvotes: 1