Arian

Reputation: 146

Big MySQL database takes time fetching data

I have a large messages table with about 2.4 million rows:

Showing rows 0 - 24 (2455455 total, Query took 0.0006 seconds.)

I need the list of conversations to load faster. For users with fewer conversations it loads quickly (this user has about 3.2k conversations):

 Showing rows 0 - 24 (3266 total, Query took 0.0345 seconds.) [id: 5009666... - 4375619...]

For users with a large number of conversations it loads much more slowly (this user has about 40k conversations):

 Showing rows 0 - 24 (40296 total, Query took 5.1763 seconds.) [id: 5021561... - 5015545...]

I have indexes on these columns:

id, to_id, from_id, time, seen

Database Table:

CREATE TABLE `messages` (
  `id` int(255) NOT NULL,
  `to_id` int(20) NOT NULL,
  `from_id` int(20) NOT NULL,
  `message` longtext NOT NULL,
  `time` double NOT NULL,
  `seen` int(2) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;



INSERT INTO `messages` (`id`, `to_id`, `from_id`, `message`, `time`, `seen`) VALUES
(2, 6001, 2, 'Hi there', 1587581995.5222, 1);


ALTER TABLE `messages`
  ADD PRIMARY KEY (`id`),
  ADD KEY `time_idx` (`time`),
  ADD KEY `from_idx` (`from_id`),
  ADD KEY `to_idx` (`to_id`),
  ADD KEY `seenx` (`seen`),
  ADD KEY `idx` (`id`);


ALTER TABLE `messages`
  MODIFY `id` int(255) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=5021570;
COMMIT;

I'm using this query:

SELECT
  *
FROM
  messages,
  (
    SELECT
      MAX(id) as lastid
    FROM
      messages
    WHERE
      (
        messages.to_id = '1' -- ID to compare with (logged-in user's ID)
        OR messages.from_id = '1' -- ID to compare with (logged-in user's ID)
      )
    GROUP BY
      CONCAT(
        LEAST(messages.to_id, messages.from_id),
        '.',
        GREATEST(messages.to_id, messages.from_id)
      )
  ) as conversations
WHERE
  id = conversations.lastid
ORDER BY
  messages.id DESC

I don't know how to make it faster for users who have a lot of conversations. Should I redesign the database structure?

Upvotes: 0

Views: 490

Answers (3)

Rick James

Reputation: 142366

Notes:

  • Use UNION instead of OR (see below)
  • There are redundant keys. The PRIMARY KEY is a key, so toss KEY(id)
  • Don't blindly index every column; instead use the queries to determine what indexes, especially composite indexes, would actually be useful.
  • CONCAT is unnecessary and possibly counterproductive in GROUP BY and ORDER BY.
  • Length fields on INT are ignored. What you have is limited to 2 billion values. (This is overkill for seen, which I assume is 0 or 1?)
  • Use the new syntax: JOIN..ON.
  • If seen is just true/false, then toss the index on it. (Or show me the query that you think will benefit from it.)
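
One way to check which indexes a query actually uses, in the spirit of the third note, is to prefix it with EXPLAIN. This is only a sketch against the inner query from the question; user ID 1 is the example value used there, and the GROUP BY is written without CONCAT to illustrate the fourth note.

```sql
-- Sketch: ask the optimizer how it would run the inner query.
-- Grouping on the two expressions directly avoids building a string with CONCAT.
EXPLAIN
SELECT MAX(id) AS lastid
FROM messages
WHERE to_id = 1 OR from_id = 1
GROUP BY LEAST(to_id, from_id), GREATEST(to_id, from_id);
```

In the output, the `key` column names the index chosen (if any) and `rows` is the optimizer's estimate of rows examined. The `Extra` column may mention a temporary table or filesort for the GROUP BY.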

CONCAT-LEAST-GREATEST -- This is to construct a "friends_id"? Perhaps you really wanted a "conversation_id"? Currently, two users can never have more than one "conversation", correct?

Make a new column for conversation_id if it is really needed. (Currently, the GROUP BY is inefficient.) The code below obviates the need for such an id.

( SELECT lastid FROM (
    ( SELECT from_id, MAX(id) AS lastid FROM messages
           WHERE to_id = ? GROUP BY from_id )
    UNION DISTINCT
    ( SELECT to_id,   MAX(id) AS lastid FROM messages 
           WHERE from_id = ? GROUP BY to_id )
                     ) AS x
) AS conversations

And have these 'covering' and 'composite' indexes:

INDEX(to_id, from_id, id)
INDEX(from_id, to_id, id)

and toss KEY(to_id) and KEY(from_id), since my new indexes handle anything else that those were meant to do.
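
A sketch of those index changes as one statement. The names `to_from_id` and `from_to_id` are made up for illustration; `idx`, `to_idx` and `from_idx` are the index names from the question's schema.

```sql
-- Sketch only: index names to_from_id / from_to_id are hypothetical.
ALTER TABLE messages
  DROP INDEX idx,        -- duplicates the PRIMARY KEY on id
  DROP INDEX to_idx,     -- replaced by the composite index starting with to_id
  DROP INDEX from_idx,   -- replaced by the composite index starting with from_id
  ADD INDEX to_from_id (to_id, from_id, id),
  ADD INDEX from_to_id (from_id, to_id, id);
```

On a table of this size the statement may take some time, so it is probably worth trying on a copy first.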

I think this has the same effect but will run much faster.

Putting it together:

SELECT  *
    FROM (
            ( SELECT from_id AS other_id,
                     MAX(id) AS lastid
                  FROM messages
                  WHERE to_id = ? GROUP BY from_id )
            UNION ALL
            ( SELECT to_id AS other_id,
                     MAX(id) AS lastid
                  FROM messages 
                  WHERE from_id = ? GROUP BY to_id )
         ) AS latest
    JOIN  messages  ON messages.id = latest.lastid
    ORDER BY  messages.id DESC

(plus the two indexes)

More

I was thinking (incorrectly) that UNION DISTINCT would replace the need for a conversation_id. But it won't. Offhand I see some solutions:

  • Add a conversation_id and dedup using it. (Meanwhile, I changed UNION DISTINCT to UNION ALL, making the query a little faster without changing the results.)
  • Put the output of my query into a temp table with (from_id, to_id, latestid); then do your CONCAT-LEAST-GREATEST trick to dedup conversations; finally do the JOIN back to messages to get the rest of the columns.
  • That temp table technique makes it easier to write and debug. My 3rd suggestion is simply cramming the pieces together to do it in a single (hard-to-read) query with SELECTs nested 3 levels deep.
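
A rough sketch of the temp table approach, under one simplifying assumption: since one side of every matching row is the logged-in user, the temp table keeps only the other user's id (`other_id`) instead of (from_id, to_id), and dedups on that rather than on CONCAT-LEAST-GREATEST. The table name `latest_per_direction` and the variable `@me` are made up for illustration.

```sql
SET @me = 1;  -- logged-in user's ID (example value from the question)

-- Step 1: latest message id per other user, one row per direction.
CREATE TEMPORARY TABLE latest_per_direction (
    other_id INT NOT NULL,
    latestid INT NOT NULL
);

INSERT INTO latest_per_direction (other_id, latestid)
    SELECT from_id, MAX(id) FROM messages WHERE to_id = @me GROUP BY from_id;

INSERT INTO latest_per_direction (other_id, latestid)
    SELECT to_id, MAX(id) FROM messages WHERE from_id = @me GROUP BY to_id;

-- Step 2: collapse the two directions into one row per conversation,
-- then JOIN back to messages for the rest of the columns.
SELECT m.*
FROM ( SELECT other_id, MAX(latestid) AS lastid
       FROM latest_per_direction
       GROUP BY other_id ) AS latest
JOIN messages AS m ON m.id = latest.lastid
ORDER BY m.id DESC;

DROP TEMPORARY TABLE latest_per_direction;
```

Each step can be run and inspected on its own, which is the debugging advantage mentioned above.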

Upvotes: 2

Valeriu Ciuca

Reputation: 2094

You can also partition the messages table, for example by time.

Partitioning is a way in which a database (MySQL in this case) splits its actual data down into separate tables, but still gets treated as a single table by the SQL layer. When partitioning in MySQL, it's a good idea to find a natural partition key.

https://www.percona.com/blog/2017/07/27/what-is-mysql-partitioning/#:~:text=So%2C%20What%20is%20MySQL%20Partitioning,find%20a%20natural%20partition%20key.

Upvotes: 0

Denilson Anachury

Reputation: 345

Hmm, maybe you can try adding indexes to your table: https://www.drupal.org/docs/7/guidelines-for-sql/the-benefits-of-indexing-large-mysql-tables#:~:text=Creating%20Indexes&text=The%20statement%20to%20create%20index,the%20index%20must%20be%20distinct. Make sure to add composite indexes on the columns you are querying.

If that doesn't improve your query time, then the query should be improved.

Upvotes: 1
