HeNiNnG
HeNiNnG

Reputation: 13

How do I select the max(timestamp) from a relational mysql table fast

We are developing a ticket system and for the dashboard we want to show the tickets with it's latest status. We have two tables. The first one for the ticket itself and a second table for the individual edits.

The system is running already, but the performance for the dashboard is very bad (6 seconds for ~1300 tickets). At first we used a statemant which selected 'where timestamp = (select max(Timestamp))' for every ticket. In the second step we created a view which only includes the latest timestamp for every ticket, but we are not able to also include the correct status into this view.

So the main Problem might be, that we can't build a table in which for every ticket the lastest ins_date and also the latest status is selected.

Simplyfied database looks like:

CREATE TABLE `ticket` (
  `id` int(10) NOT NULL,
  `betreff` varchar(100) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `ticket_relation` (
  `id` int(11) NOT NULL,
  `ticket` int(10) NOT NULL,
  `info` varchar(10000) DEFAULT NULL,
  `status` int(1) NOT NULL DEFAULT '0',
  `ins_date` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `ins_user` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


INSERT INTO `ticket` (`id`, `betreff`) VALUES
(1, 'Technische Frage'),
(2, 'Ticket 2'),
(3, 'Weitere Fragen');

INSERT INTO `ticket_relation` (`id`, `ticket`, `info`, `status`, `ins_date`, `ins_user`) VALUES
(1, 1, 'Betreff 1', 0, '2019-05-28 11:02:18', 123),
(2, 1, 'Betreff 2', 3, '2019-05-28 12:07:36', 123),
(3, 2, 'Betreff 3', 0, '2019-05-29 06:49:32', 123),
(4, 3, 'Betreff 4', 1, '2019-05-29 07:44:07', 123),
(5, 2, 'Betreff 5', 1, '2019-05-29 07:49:32', 123),
(6, 2, 'Betreff 6', 3, '2019-05-29 08:49:32', 123),
(7, 3, 'Betreff 7', 2, '2019-05-29 09:49:32', 123),
(8, 2, 'Betreff 8', 1, '2019-05-29 10:49:32', 123),
(9, 3, 'Betreff 9', 2, '2019-05-29 11:49:32', 123),
(10, 3, 'Betreff 10', 3, '2019-05-29 12:49:32', 123);

I have created a SQL Fiddle: http://sqlfiddle.com/#!9/a873b6/3 The first three Statements are attempts that won't work correct or way too slow. The last one is the key I think, but I don't understand, why this gets the status wrong.

The attempt to create the table with latest ins_date AND status for each ticket:

SELECT
  ticket, status, MAX(ins_date) as max_date 
FROM 
  ticket_relation 
GROUP BY 
  ticket
ORDER BY 
  ins_date DESC;

This query gets the correct (latest) ins_date for every ticket, but not the latest status:

+--------+--------+----------------------+
| ticket | status | max_date             |
+--------+--------+----------------------+
| 3      | 1      | 2019-05-29T12:49:32Z |
+--------+--------+----------------------+
| 2      | 0      | 2019-05-29T10:49:32Z |
+--------+--------+----------------------+
| 1      | 0      | 2019-05-28T12:07:36Z |
+--------+--------+----------------------+

Expected output would be this:

+--------+--------+----------------------+
| ticket | status | max_date             |
+--------+--------+----------------------+
| 3      | 3      | 2019-05-29T12:49:32Z |
+--------+--------+----------------------+
| 2      | 1      | 2019-05-29T10:49:32Z |
+--------+--------+----------------------+
| 1      | 3      | 2019-05-28T12:07:36Z |
+--------+--------+----------------------+

Is there a efficient way to select the latest timestamp and status for every ticket in the tiket-table?

Upvotes: 1

Views: 310

Answers (3)

Raymond Nijland
Raymond Nijland

Reputation: 11602

Other approach is to think filtering not GROUPing..

Query

SELECT 
   ticket_relation_1.ticket
 , ticket_relation_1.status 
 , ticket_relation_1.ins_date
FROM 
 ticket_relation AS ticket_relation_1
LEFT JOIN
 ticket_relation AS ticket_relation_2
ON
   ticket_relation_1.ticket = ticket_relation_2.ticket
 AND
   ticket_relation_1.ins_date < ticket_relation_2.ins_date
 WHERE 
  ticket_relation_2.id IS NULL
ORDER BY 
 ticket_relation_1.id DESC

Result

| ticket | status | ins_date            |
| ------ | ------ | ------------------- |
| 3      | 3      | 2019-05-29 12:49:32 |
| 2      | 1      | 2019-05-29 10:49:32 |
| 1      | 3      | 2019-05-28 12:07:36 |

see demo

This query would require a index KEY(ticket, ins_date, id) to get max performance..

Upvotes: 1

Fahmi
Fahmi

Reputation: 37473

You can try below query -

SELECT
  ticket, status, ins_date as max_date 
FROM ticket_relation a
where ins_date in (select max(ins_date) from ticket_relation b where a.ticket=b.ticket)

Upvotes: 1

GMB
GMB

Reputation: 222582

One solution would be to use a subquery to compute the latest insert date for each ticket, and then to join the results with the original table, like:

SELECT t.ticket, t.status, t.ins_date
FROM ticket_relation  t
INNER JOIN (
    SELECT ticket, max(ins_date) max_ins_date
    FROM ticket_relation 
    GROUP BY ticket
) x ON t.ticket = x.ticket AND t.ins_date = x.max_ins_date

For better performance with this query, you want an index on (ticket, ins_date).

Anoter option would be to use a NOT EXISTS condition to ensure that only the latest record is selected, like:

SELECT t.ticket, t.status, t.ins_date
FROM ticket_relation  t
WHERE NOT EXISTS (
    SELECT 1 
    FROM ticket_relation t1 
    WHERE t1.ticket = t.ticket AND t1.ins_date > t.ins_date)
)

NB: when dealing with GROUP BY, all non-aggregated columns must appear in the GROUP BY clause. Else, you will get either an error or unprectictable results (depending on whether server option ONLY_FULL_GROUP_BY is, respectively, enabled or disabled).


If you are able to upgrade to a recent version of mysql (8.0), then window functions can be used to simplify the query and possibly increase its performance, like:

SELECT ticket, status, ins_date
FROM (
    SELECT 
        ticket, 
        status, 
        ins_date, 
        row_number() over(partition by ticket order by ins_date desc) rn
    FROM ticket_relation  
) x WHERE rn = 1

Upvotes: 1

Related Questions