Dor1000
Dor1000

Reputation: 351

Don't grab duplicates (MYSQL DISTINCT GROUP BY)

I have a conversation-based message system. (Messages viewed as a forum thread. Akin to gmail.) Tables:

convos (PRIMARY cid)
messages (PRIMARY mid)
users (PRIMARY uid)

Each row in convos is a thread. Each row in messages is a reply, including cid it belongs to. I load the conversation by grabbing x most recent messages with that cid. I also join users table to messages ON uid. (Each message has uid of the sender.) From users i get user_name, avatar. To illustrate, the basic query will look like this:

SELECT
messages.body,
users.avatar
FROM messages
JOIN users ON (users.uid=messages.uid)
WHERE messages.cid=1

If same user makes many replies i'm grabbing her avatar several times, which is VARCHAR (a link to avatar image). For performance I want to grab each avatar only once. I'm thinking about a sub query with GROUP BY or DISTINCT

UPDATE

I'm going with Robby C's Idea. I have public chats where any user can reply and the replies grow. I will show 20 replies per page. To get only avatars needed for those 20 replies I do this:

    SELECT
    u.avatar
    FROM( 
        SELECT uid,cid 
        FROM messages  
        WHERE cid=1234
        ORDER BY date DESC LIMIT 20
    ) AS m
    JOIN users AS u ON (u.uid=m.uid)
    GROUP BY u.uid  

Subquery the 20 recent replies, get user data for them with JOIN, then GROUP BY users.uid cuts the duplicates. Seems to work well.

Upvotes: 2

Views: 112

Answers (2)

Robby Cornelissen
Robby Cornelissen

Reputation: 97120

I don't think the subquery you're considering will improve performance. It will probably be easier and more performant to get two resultsets and match them up in your code using the uid.

The query to get the message contents:

SELECT
    messages.body, messages.uid
FROM messages
WHERE messages.cid=1;

And the query to get the avatars:

SELECT DISTINCT
    users.uid, users.avatar
FROM messages
JOIN users ON (users.uid = messages.uid)
WHERE messages.cid=1;

In terms of query execution time, I don't think that you will see an improvement compared to the single query you originally had. You might however be able to reduce IO and/or network usage.

If you find yourself frequently getting the same user information, it's worthwhile to consider caching the entire user table in memory.

Upvotes: 2

kiks73
kiks73

Reputation: 3758

I think that you may have to change something in your workflow:

Do not store your avatar directly in db, but store only the path to the avatar file. In this way you will not to grab the avatar each time, but only the path to the file.

Upvotes: 0

Related Questions