codeNinja
codeNinja

Reputation: 1462

SQL distinct for two column

So i have a SQL table like below. This table is a bunch of sms messages between people. I want to get all the thread that exists. This basically means the last message between two people. How do i do this?

-------------------------------------------------------
| sender_id  |  receiver_id  | message   | time         |
-------------------------------------------------------
| 123        |  456          | hi        | 4/17/2013    | 
--------------------------------------------------------
| 123        |  111          | hi        | 4/18/2013    | 
--------------------------------------------------------
| 123        |  555          | hi        | 4/19/2013    | 
-------------------------------------------------------- 
| 555        |  123          | hi        | 4/20/2013    | 
--------------------------------------------------------
| 444        |  333          | hi        | 4/21/2013    | 
--------------------------------------------------------
| 123        |  555          | hi        | 4/22/2013    | 
--------------------------------------------------------
| 777        |  123          | hi        | 4/23/2013    | 
--------------------------------------------------------

I would like to get the response rows as below for user= 123. Notice how the sender_id and receiver_id as a whole are unique. Meaning a message that joe sent to bob is in the same thread as one that bob sent to joe.

-------------------------------------------------------
| sender_id  |  receiver_id  | message   | time         |
-------------------------------------------------------
| 123        |  456          | hi        | 4/17/2013    | 
--------------------------------------------------------
| 123        |  111          | hi        | 4/18/2013    | 
--------------------------------------------------------
| 123        |  555          | hi        | 4/22/2013    | 
--------------------------------------------------------
| 777        |  123          | hi        | 4/23/2013    | 
--------------------------------------------------------

Upvotes: 0

Views: 152

Answers (5)

witkacy26
witkacy26

Reputation: 101

the user 123 was only the example, I think the more general query is needed here. This solution avoids time consuming joins, there is only assumption of maximum 10000 users (easily extendable)

SELECT sender_id, receiver_id, message, MAX(time), 
IF(sender_id<receiver_id, sender_id*10000+receiver_id, receiver_id*10000+sender_id) as thread_id 
FROM messages
GROUP BY thread_id
ORDER BY MAX(time) DESC

http://sqlfiddle.com/#!2/c65d3/30

UPDATE: This version is free of number of users limitation:

SELECT sender_id, receiver_id, message, MAX(time), 
IF(sender_id<receiver_id, CONCAT(sender_id,receiver_id), CONCAT(receiver_id,sender_id)) as thread_id 
FROM messages
GROUP BY thread_id
ORDER BY MAX(time) DESC

http://sqlfiddle.com/#!2/c65d3/31

Upvotes: 0

Branko Dimitrijevic
Branko Dimitrijevic

Reputation: 52107

Haven't actually tested it, but the following query should return the newest1 message for every distinct sender_id/receiver_id combination (in both "directions"):

SELECT *
FROM message m1
WHERE time = (
    SELECT MAX(time)
    FROM message m2
    WHERE
        (m1.sender_id = m2.sender_id AND m1.receiver_id = m2.receiver_id)
        OR (m1.sender_id = m2.receiver_id AND m1.receiver_id = m2.sender_id)
)        

In plain English: select every message such that it has maximal time of all the messages for the same sender/receiver or receiver/sender.

If desired, you can easily restrict that to a given user by amending the outer WHERE clause, for example:

--- || ---
AND (sender_id = 123 OR receiver_id = 123)

1 Note that there may be multiple "newest" messages if time is not unique.

Upvotes: 0

Esoteric Screen Name
Esoteric Screen Name

Reputation: 6112

An easier to read version, which correctly handles date ordering (as shown in the question) and utilizes indexes:

SELECT sender_id, receiver_id, message, time FROM
(
SELECT sender_id, receiver_id, message, time
FROM myTable 
WHERE sender_id = 123 OR receiver_id = 123
ORDER BY time DESC
) a
GROUP BY (CASE WHEN sender_id = 123 THEN receiver_id
   ELSE sender_id END);

SQL fiddle.

Upvotes: 4

georgepsarakis
georgepsarakis

Reputation: 1957

This should do it:

SELECT * FROM (
  SELECT sender_id, receiver_id, 
    IF(sender_id > receiver_id, CONCAT(sender_id, '_', receiver_id), 
   CONCAT(receiver_id, '_', sender_id)
    ) AS conversation_key, 
    message, `time` 
FROM sms_messages WHERE sender_id = 123
UNION ALL
SELECT sender_id, receiver_id, 
    IF(
      sender_id > receiver_id, 
      CONCAT(sender_id, '_', receiver_id), 
  CONCAT(receiver_id, '_', sender_id)
    ) AS conversation_key, message, `time` 
FROM sms_messages WHERE receiver_id = 123
) t 
GROUP BY conversation_key ORDER BY NULL

Upvotes: 0

witkacy26
witkacy26

Reputation: 101

This should work for you:

SELECT sender_id, receiver_id, message, time FROM your_table GROUP BY sender_id, receiver_id ORDER BY time DESC;

Upvotes: 0

Related Questions