MJ13
MJ13

Reputation: 195

Confused with an SQL query

Let's assume we have a database for a chat application:

CREATE TABLE Users (uid int PRIMARY KEY, name text, phone text );

CREATE TABLE Messages (recipient int REFERENCES Users(uid), sender int 
REFERENCES Users(uid), time timestamp NOT NULL, message text NOT NULL, 
PRIMARY KEY (recipient, sender, time));

I want to find all the messages that have been sent between at least two different pairs of users. For instance, if message "Hello" has been sent from User 1 to User 2 and from User 75 to User 83 as well then it must be shown to the result. However, if it has been sent only between User 1 and User 2 then it shouldn't be shown to the result.

I consider grouping all the messages that appear at least two times as follows:

SELECT message 
FROM Messages 
GROUP BY message 
HAVING COUNT(*) > 1

However, this won't help since the same pair of users might have sent the same message over and over again. Can I group by sender, recipient and message at the same time? If yes, what is this gonna give me as a result?

In addition, is there a way that I could easily fill these two tables with random values and practise queries by myself in order to check them?

Thanks in advance!

Upvotes: 0

Views: 587

Answers (4)

thewallrus
thewallrus

Reputation: 725

Yea first start by grouping user1, user2, & message. This gives you a unique message for every pair:

SELECT case when recipient > sender then recipient else sender end user1,
       case when recipient > sender then sender else recipient end user2,
       message 
FROM Messages 
GROUP BY user1, user2, message

Then from that result group by Message and return only count greater than 1. You can use a nested query to do this:

SELECT message, COUNT(message) 
FROM (SELECT case when recipient > sender then recipient else sender end user1,
            case when recipient > sender then sender else recipient end user2,
            message 
     FROM Messages 
     GROUP BY user1, user2, message) PairMessages 
GROUP BY message 
HAVING COUNT(message) > 1

Maybe start with this as a test:

INSERT INTO Users VALUES (1,'john',1111111111)
INSERT INTO Users VALUES (2,'paul',2222222222)
INSERT INTO Users VALUES (75,'george',7575757575)
INSERT INTO Users VALUES (83,'ringo',8383838383)
INSERT INTO Messages VALUES (2,1,GETDATE(),'Yesterday')
INSERT INTO Messages VALUES (1,2,GETDATE(),'hello')
INSERT INTO Messages VALUES (75,83,GETDATE(),'yellow')
INSERT INTO Messages VALUES (75,83,GETDATE(),'hello')

You should be able to get hello as your message sent between more than 1 pair of users.

Edit: I updated the above with the correct answer to show that each pair of users is unique for each message. Also, it may be a good idea to create a groupID for every pair of users. Then you can add as many users as you want to that groupID. See here for an idea: http://sqlfiddle.com/#!9/fbc2e2/3

Upvotes: 1

Michele La Ferla
Michele La Ferla

Reputation: 6884

The easiest way to achieve this is by using the following query:

select message, count(*)
from Messages
group by message
HAVING COUNT(distinct least(recipient, sender), greatest(recipient, sender)) > 1

This improves on forpas's query by adding the number of times that message was used.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269803

You can use EXISTS for the filtering:

SELECT m.message 
FROM Messages m
WHERE EXISTS (SELECT 1
              FROM Messages m2
              WHERE m2.message = m.message AND
                    m2.recipient NOT IN (m.recipient, m.sender) AND
                    m2.sender NOT IN (m.recipient, m.sender)
             )       
GROUP BY m.message ;

The EXISTS checks that another row with the same message has a different pair of users.

Upvotes: 0

forpas
forpas

Reputation: 164099

This will work for MySql:

SELECT message 
FROM Messages 
GROUP BY message 
HAVING 
  COUNT(distinct least(recipient, sender), greatest(recipient, sender)) > 1

See the demo.

Upvotes: 2

Related Questions