taz

Reputation: 31

SQL selection from one-to-many table

I have three tables with the columns below:

Topics:
[TopicID] [TopicName]
Messages:
[MessageID] [MessageText]
MessageTopicRelations:
[EntryID] [MessageID] [TopicID]

Messages can be about more than one topic. The question is: given a couple of topics, I need to get the messages that are about ALL of these topics, no fewer, though they can be about other topics too. A message that is about only SOME of the given topics should not be included. I hope I explained my request well; otherwise I can provide sample data. Thanks.

Upvotes: 3

Views: 219

Answers (4)

OMG Ponies

Reputation: 332681

The following queries use x, y, and z to stand in for topic IDs, since none were provided as examples.

Using JOINs:

SELECT m.*
  FROM MESSAGES m
  -- one join to the relation table per required topic
  JOIN MESSAGETOPICRELATIONS mx ON mx.messageid = m.messageid
                               AND mx.topicid = x
  JOIN MESSAGETOPICRELATIONS my ON my.messageid = m.messageid
                               AND my.topicid = y
  JOIN MESSAGETOPICRELATIONS mz ON mz.messageid = m.messageid
                               AND mz.topicid = z

Using GROUP BY/HAVING COUNT(*):

  SELECT m.*
    FROM MESSAGES m
    JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
    JOIN TOPICS t ON t.topicid = mtr.topicid
   WHERE t.topicid IN (x, y, z)
GROUP BY m.messageid, m.messagetext
  HAVING COUNT(*) = 3

Of the two, the JOIN approach is safer.

The GROUP BY/HAVING approach relies on MESSAGETOPICRELATIONS.TOPICID being part of the primary key or of a unique key constraint (together with MESSAGEID), so that there are no duplicates. Otherwise, you could have 2+ instances of the same topic associated with a message, which would produce a false positive. Using HAVING COUNT(DISTINCT ...) would clear up any false positives, but support depends on the database: MySQL supports it in 5.1+, but not in 4.1. Oracle might; I have to wait till Monday to test on SQL Server.
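To make the duplicate problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows and the topic IDs 1, 2, 3 are made up for illustration (the question gave none), and SQLite is only a stand-in for whatever database is actually in use. Message 20 is linked to topic 1 twice and never to topic 3.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Messages (MessageID INTEGER PRIMARY KEY, MessageText TEXT);
CREATE TABLE MessageTopicRelations (
    EntryID INTEGER PRIMARY KEY, MessageID INTEGER, TopicID INTEGER
);
INSERT INTO Messages VALUES
    (10, 'about 1, 2 and 3'),
    (20, 'about 1 (twice) and 2');
INSERT INTO MessageTopicRelations (MessageID, TopicID) VALUES
    (10, 1), (10, 2), (10, 3),
    (20, 1), (20, 1), (20, 2);
""")

QUERY = """
  SELECT m.MessageID
    FROM Messages m
    JOIN MessageTopicRelations mtr ON mtr.MessageID = m.MessageID
   WHERE mtr.TopicID IN (1, 2, 3)
GROUP BY m.MessageID
  HAVING {count_expr} = 3
ORDER BY m.MessageID
"""

def matching_ids(count_expr):
    """Run the GROUP BY/HAVING query with the given counting expression."""
    return [row[0] for row in conn.execute(QUERY.format(count_expr=count_expr))]

count_star = matching_ids("COUNT(*)")
count_distinct = matching_ids("COUNT(DISTINCT mtr.TopicID)")
print(count_star)      # includes message 20, a false positive
print(count_distinct)  # only message 10
```

With this data, COUNT(*) counts the duplicated row for message 20 and lets it through, while COUNT(DISTINCT ...) does not.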

I looked into Bill's comment about not needing the join to the TOPICS table:

SELECT m.*
  FROM MESSAGES m
  JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
                                AND mtr.topicid IN (x, y, z)

...will return false positives - rows that match at least one of the values defined in the IN clause. And:

SELECT m.*
  FROM MESSAGES m
  JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
                                AND mtr.topicid = x
                                AND mtr.topicid = y
                                AND mtr.topicid = z

...won't return anything at all, because the topicid can never be all of the values at once.
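Both behaviours can be checked with a small sketch in Python's sqlite3 module. The rows and the topic IDs 1, 2, 3 are hypothetical sample data, and SQLite stands in for the real database. Message 10 covers all three topics, message 20 covers only topic 1.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Messages (MessageID INTEGER PRIMARY KEY, MessageText TEXT);
CREATE TABLE MessageTopicRelations (
    EntryID INTEGER PRIMARY KEY, MessageID INTEGER, TopicID INTEGER
);
INSERT INTO Messages VALUES (10, 'about 1, 2 and 3'), (20, 'about 1 only');
INSERT INTO MessageTopicRelations (MessageID, TopicID) VALUES
    (10, 1), (10, 2), (10, 3), (20, 1);
""")

# Single join filtered with IN: one row per matching relation row.
in_only = [row[0] for row in conn.execute("""
    SELECT m.MessageID
      FROM Messages m
      JOIN MessageTopicRelations mtr ON mtr.MessageID = m.MessageID
                                    AND mtr.TopicID IN (1, 2, 3)
  ORDER BY m.MessageID
""")]

# Single join with chained equality tests: no single row can satisfy all three.
and_chain = [row[0] for row in conn.execute("""
    SELECT m.MessageID
      FROM Messages m
      JOIN MessageTopicRelations mtr ON mtr.MessageID = m.MessageID
                                    AND mtr.TopicID = 1
                                    AND mtr.TopicID = 2
                                    AND mtr.TopicID = 3
""")]

print(in_only)    # message 20 appears even though it lacks topics 2 and 3
print(and_chain)  # empty
```

The IN version also repeats message 10 once per matching topic, which is why it needs either grouping or one join per topic to be useful.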

Upvotes: 5

Bill Karwin

Reputation: 562691

Re: the answer by OMG Ponies, you don't need to join to the TOPICS table. And the HAVING COUNT(DISTINCT) clause works fine in MySQL 5.1. I just tested it.

This is what I mean:

Using GROUP BY/HAVING COUNT(DISTINCT):

  SELECT m.*
    FROM MESSAGES m
    JOIN MESSAGETOPICRELATIONS mtr ON mtr.messageid = m.messageid
   WHERE mtr.topicid IN (x, y, z)
GROUP BY m.messageid
  HAVING COUNT(DISTINCT mtr.topicid) = 3

The reason that I suggest COUNT(DISTINCT) is that if the columns (messageid,topicid) don't have a unique constraint, you could get duplicates, which would result in a count of 3 in the group, even with fewer than three distinct values.
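Since the question asks about an arbitrary set of topics, the same query can be built for any number of IDs. The following is only a sketch under assumptions: `messages_about_all` is a hypothetical helper name, the sample rows are invented, and it uses Python's sqlite3 module with `?` placeholders (other drivers use different placeholder styles).

```python
import sqlite3

def messages_about_all(conn, topic_ids):
    """Return (MessageID, MessageText) rows linked to every id in topic_ids."""
    wanted = sorted(set(topic_ids))  # ignore repeated ids in the input
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT m.MessageID, m.MessageText
          FROM Messages m
          JOIN MessageTopicRelations mtr ON mtr.MessageID = m.MessageID
         WHERE mtr.TopicID IN ({placeholders})
      GROUP BY m.MessageID, m.MessageText
        HAVING COUNT(DISTINCT mtr.TopicID) = ?
      ORDER BY m.MessageID
    """
    # The last parameter is the number of distinct topics required.
    return conn.execute(sql, (*wanted, len(wanted))).fetchall()

# Hypothetical sample data: message 20 has a duplicated link to topic 1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Messages (MessageID INTEGER PRIMARY KEY, MessageText TEXT);
CREATE TABLE MessageTopicRelations (
    EntryID INTEGER PRIMARY KEY, MessageID INTEGER, TopicID INTEGER
);
INSERT INTO Messages VALUES
    (10, 'all three'), (20, 'two topics'), (30, 'all three plus another');
INSERT INTO MessageTopicRelations (MessageID, TopicID) VALUES
    (10, 1), (10, 2), (10, 3),
    (20, 1), (20, 1), (20, 2),
    (30, 1), (30, 2), (30, 3), (30, 4);
""")

all_three = messages_about_all(conn, [1, 2, 3])
first_two = messages_about_all(conn, [2, 1, 1])
print(all_three)
print(first_two)
```

Only the placeholders are interpolated into the SQL text; the topic IDs themselves are passed as bound parameters.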

Upvotes: 1

mechanical_meat

Reputation: 169434

Edit: thanks to @Paul Creasey and @OMG Ponies for finding the flaws in my approach.
The correct way to do this is with a self-join for each topic, as shown in the leading answer.


Another profoundly inelegant entry:

select m.MessageText
       , t.TopicName
  from Messages m
       inner join MessageTopicRelations mtr
       on mtr.MessageID = m.MessageID
       inner join Topics t
       on t.TopicID = mtr.TopicID
   and
       t.TopicName = 'topic1'

UNION 

select m.MessageText
       , t.TopicName
  from Messages m
       inner join MessageTopicRelations mtr
       on mtr.MessageID = m.MessageID
       inner join Topics t
       on t.TopicID = mtr.TopicID
   and
       t.TopicName = 'topic2'
...
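The flaw mentioned in the edit note can be reproduced with a small sketch in Python's sqlite3 module. The sample rows are invented, and the query is reduced to MessageID so the results are easy to compare. INTERSECT is swapped in here purely as a contrast, not as part of the original answer, and not every database supports it.

```python
import sqlite3

# In-memory database with hypothetical sample data:
# message 10 is about both topics, message 20 only about topic1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Topics (TopicID INTEGER PRIMARY KEY, TopicName TEXT);
CREATE TABLE Messages (MessageID INTEGER PRIMARY KEY, MessageText TEXT);
CREATE TABLE MessageTopicRelations (
    EntryID INTEGER PRIMARY KEY, MessageID INTEGER, TopicID INTEGER
);
INSERT INTO Topics VALUES (1, 'topic1'), (2, 'topic2');
INSERT INTO Messages VALUES (10, 'both topics'), (20, 'topic1 only');
INSERT INTO MessageTopicRelations (MessageID, TopicID) VALUES
    (10, 1), (10, 2), (20, 1);
""")

# One branch of the compound query, parameterised by topic name.
PER_TOPIC = """
    SELECT m.MessageID
      FROM Messages m
      JOIN MessageTopicRelations mtr ON mtr.MessageID = m.MessageID
      JOIN Topics t ON t.TopicID = mtr.TopicID
                   AND t.TopicName = '{name}'
"""

def combine(set_operator):
    """Combine the topic1 and topic2 branches with UNION or INTERSECT."""
    sql = (PER_TOPIC.format(name="topic1")
           + set_operator
           + PER_TOPIC.format(name="topic2"))
    return sorted(row[0] for row in conn.execute(sql))

union_ids = combine("UNION")
intersect_ids = combine("INTERSECT")
print(union_ids)      # messages about EITHER topic
print(intersect_ids)  # messages about BOTH topics
```

UNION keeps message 20 even though it is not about topic2, which is the false positive the other answers point out.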

Upvotes: 1

Paul Creasey

Reputation: 28864

Here's a profoundly inelegant solution:

SELECT
     m.MessageID
    ,m.MessageText
FROM
    Messages m
WHERE
    m.MessageID IN (
    SELECT
        mt.MessageID
    FROM
        MessageTopicRelations mt
    WHERE
        mt.TopicID IN (1,4,5) -- list of topic IDs
    GROUP BY
        mt.MessageID
    HAVING
        count(*) = 3 -- number of topics
    )

Upvotes: 1
