Reputation: 24895
I have two databases with identical schemas. The one database is from production, the other is a test database. I'm doing a query against a single table from the database. On the production table the query takes around 4.3 seconds, while on the test database it takes about 130 ms. . However, the production table has less then 50.000 records, while I've seeded the test table with more than 100.000. I've compared the two tables and both have the same indexes. To me, it seems that the problem is in the data. While seeding I tried to generate as random data as possible, so that I can simulate production conditions, but still I couldn't reproduce the slow query.
I looked the the results from EXPLAIN
for the two queries. They have significant differences in the last two columns.
Production:
+-------+-------------------------+
| rows | Extra |
+-------+-------------------------+
| 24459 | Using where |
| 46 | Using where; Not exists |
+-------+-------------------------+
Test:
+------+------------------------------------+
| rows | Extra |
+------+------------------------------------+
| 3158 | Using index condition; Using where |
| 20 | Using where; Not exists |
+------+------------------------------------+
The create statement for the table on production is:
CREATE TABLE `usage_logs` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`operation` varchar(30) COLLATE utf8_unicode_ci NOT NULL,
`check_time` datetime NOT NULL,
`check_in_log_id` int(11) DEFAULT NULL,
`daily_usage_id` int(11) DEFAULT NULL,
`duration_units` decimal(11,2) DEFAULT NULL,
`is_deleted` tinyint(1) NOT NULL DEFAULT '0',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`facility_id` int(11) NOT NULL,
`notes` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`mac_address` varchar(20) COLLATE utf8_unicode_ci NOT NULL DEFAULT '00:00:00:00:00:00',
`login` varchar(40) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_usage_logs_on_user_id` (`user_id`),
KEY `index_usage_logs_on_check_in_log_id` (`check_in_log_id`),
KEY `index_usage_logs_on_facility_id` (`facility_id`),
KEY `index_usage_logs_on_check_time` (`check_time`),
KEY `index_usage_logs_on_mac_address` (`mac_address`),
KEY `index_usage_logs_on_operation` (`operation`)
) ENGINE=InnoDB AUTO_INCREMENT=145147 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
while the same in the test database is:
CREATE TABLE `usage_logs` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`operation` varchar(30) COLLATE utf8_unicode_ci NOT NULL,
`check_time` datetime NOT NULL,
`check_in_log_id` int(11) DEFAULT NULL,
`daily_usage_id` int(11) DEFAULT NULL,
`duration_units` decimal(11,2) DEFAULT NULL,
`is_deleted` tinyint(1) NOT NULL DEFAULT '0',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`facility_id` int(11) NOT NULL,
`notes` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`mac_address` varchar(20) COLLATE utf8_unicode_ci NOT NULL DEFAULT '00:00:00:00:00:00',
`login` varchar(40) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_usage_logs_on_check_in_log_id` (`check_in_log_id`),
KEY `index_usage_logs_on_check_time` (`check_time`),
KEY `index_usage_logs_on_facility_id` (`facility_id`),
KEY `index_usage_logs_on_mac_address` (`mac_address`),
KEY `index_usage_logs_on_operation` (`operation`),
KEY `index_usage_logs_on_user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=104001 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
The full query is:
SELECT `usage_logs`.*
FROM `usage_logs`
LEFT OUTER JOIN usage_logs AS usage_logs_latest ON usage_logs.facility_id = usage_logs_latest.facility_id
AND usage_logs.user_id = usage_logs_latest.user_id
AND usage_logs.mac_address = usage_logs_latest.mac_address
AND usage_logs.check_time < usage_logs_latest.check_time
WHERE `usage_logs`.`facility_id` = 5
AND `usage_logs`.`operation` = 'checkIn'
AND (usage_logs.check_time >= '2018-06-08 00:00:00')
AND (usage_logs.check_time <= '2018-06-08 11:23:05')
AND (usage_logs_latest.id IS NULL)
I execute the query on the same machine against two different databases, so I don't think that other processes are interfering in the result.
What does this result mean and what further steps can I take in order to find out the reason for the big difference in the execution time?
Upvotes: 0
Views: 74
Reputation: 142298
What MySQL version(s) are you using?
There are many factors that lead to the decision by the Optimizer as to
Some of the factors:
These can also lead to different numbers (estimates) in the EXPLAIN
, which may lead to different query plans.
Also other activity in the server can interfere with the availability of CPU/IO/etc. In particular caching of the data can easily show a 10x difference. Did you run each query twice? Is the Query cache turned off? Is innodb_buffer_pool_size
the same? Is RAM size the same?
I see Using index condition
and no "composite" indexes. Often performance can be improved by providing a suitable composite index. More
I gotta see the query!
Seeding
Random, or not-so-random, rows can influence the Optimizer's choice of which index (etc) to use. This may have led to picking a better way to run the query on 'test'.
We need to see EXPLAIN SELECT ...
to discuss this angle further.
Composite indexes
These are likely to help on both servers:
INDEX(facility_id, operation, -- either order
check_time) -- last
INDEX(facility_id, user_id, max_address, check_time, -- any order
id) -- last
There is a quick improvement. Instead of finding all the later rows, but not use the contents of them, use a 'semi-join' which asks of the non-existence of any such rows:
SELECT `usage_logs`.*
FROM `usage_logs`
WHERE `usage_logs`.`facility_id` = 5
AND `usage_logs`.`operation` = 'checkIn'
AND (usage_logs.check_time >= '2018-06-08 00:00:00')
AND (usage_logs.check_time <= '2018-06-08 11:23:05')
AND NOT EXISTS ( SELECT 1 FROM usage_logs AS latest
WHERE usage_logs.facility_id = latest.facility_id
AND usage_logs.user_id = latest.user_id
AND usage_logs.mac_address = latest.mac_address
AND usage_logs.check_time < latest.check_time )
(The same indexes will be fine.)
The query seems to be getting "all but the latest"; is that what you wanted?
Upvotes: 2