Rens Tillmann
Rens Tillmann

Reputation: 474

Not returning all rows with LEFT JOINS and HAVING

I am building a simple forum. What I need to display is a list with the latest Topics and near each topic it's latest reply/post. Something like:

Topic name:         | Last reply:
Topic name 1 here   | 2013-02-01 by username1
Topic name 2 here   | 2013-01-01 by username2
etc.

The thing is that Topics and Replies are both in the same table called 'pages'. This table has a field template which contains of course an ID for the table 'templates'.

So I simply join templates and filter by the Template.name for the record 'Forum topic'

This will return all topics. Because I also needed the latest reply for each topic I decided to do a subquery with a filter on the same template but now for the record 'Forum reply' and ordered by Reply.date DESC and with a LIMIT 1 which returns the latest reply for this topic. Now it is time to show my current query which works fine for Topics that contain at least 1 reply:

/*----some fields to return----*/
SELECT TopicContent.title, ReplyContent.title, Reply.date, Reply.id, 
(
  /*----subquery to return latest reply id used inside HAVING later on----*/
  SELECT Reply.id
  FROM pages AS Reply
  INNER JOIN templates AS Template ON Reply.template = Template.id
  WHERE Reply.parent_id = Topic.id
  AND Template.name =  'Forum reply'
  ORDER BY Reply.date DESC 
  LIMIT 1
) AS reply_id
FROM pages AS Topic
INNER JOIN templates AS Template ON Topic.template = Template.id
INNER JOIN page_content AS TopicContent ON Topic.id = TopicContent.page
/*----left join used because topic could have zero replies if new----*/
LEFT JOIN pages AS Reply ON Topic.id = Reply.parent_id
LEFT JOIN page_content AS ReplyContent ON Reply.id = ReplyContent.page
WHERE Template.name =  'Forum topic'
HAVING Reply.id = reply_id
/*--------------------------------------------------------------------------------*/
/*----HAVING            | returns not all topics but with correct latest reply----*/
/*----GROUP BY Topic.id | returns topics correctly but incorrect latest reply-----*/

Most important now is that if a new Topic is submitted it does not contain any replies yet.

So when I use HAVING it returns only Topics that also contain at least one reply. And when using GROUP BY it does not return the latest replies for the topics that contain one or more replies. And for the topics that do not contain replies it returns NULL which is good.

Do I need to change JOINS? Any ideas to resolve this. Thanks a lot!

Upvotes: 1

Views: 142

Answers (2)

Rick
Rick

Reputation: 171

I'd move the MaxId subquery to the from clause as the way you're using GROUP BY won't work for what you're trying to do. Here you just join your resultset to a subquery that retrieves your MaxId field.

SELECT
*
FROM 
   pages
   .
   .
   .
   LEFT OUTER JOIN 
   (
   SELECT 
      ReplyId As [MaxId], 
      p.Template
   FROM 
      Pages p INNER JOIN Templates t ON p.Template = t.Id
   WHERE
      TemplateName = 'Forum Reply'
   ORDER BY ReplyDate LIMIT 1
   ) a ON a.Template = pages.Template
WHERE 
   Template.name =  'Forum topic'

Alternatively (this one could give you more than one reply if multiple first replies have the exact same time in which case you could limit the results)

SELECT
   *
FROM
   pages
   .
   .
   .
   (SELECT
      ReplyId As [MaxiId],
      Template
   FROM
      Pages p INNER JOIN
      Templates t ON p.Template = t.id LEFT OUTER JOIN
      (SELECT 
         p.Template, 
         MIN(ReplyDate) AS [FirstReplyDate]
      FROM      
         Pages p INNER JOIN
         Templates t ON p.Template = t.id
      WHERE
         TemplateName = 'Forum Reply'
      GROUP BY
         p.Template) b ON b.Template = t.id
   WHERE 
      p.ReplyDate = b.FirstReplyDate
   ) a ON a.Template = pages.Template

Either way you should be moving that first reply subquery to the From clause

Upvotes: 1

invisal
invisal

Reputation: 11171

This is not the answer yet, but rather, it is to ask for more information from the poster, but because it is too long for comment, I will post here.

From your query, I can conclude that your tables look something like this

templates(id, name)
pages(id, parent_id, date, template)
page_contents(page, title)

pages stores the structure of the post, page_contents stores the content of the post, and template determine what type of post. Am I correct?

Question:

  • What is the value of parent_id, if it is topic, not a reply. Since reply is under topic, so the parent_id should be the ID of the topic. However, topic does not under any topic, so basically, it should be either 0 or NULL, is it right?
  • Do you limit yourself to this database design, from my perspective, this database design does not fit for forum because to retrieve something as simple as this, you have to INNER JOIN a bunch of tables.

Upvotes: 0

Related Questions