Reputation: 6780
I have the following query:
SELECT
Impressions.id AS `Impressions__id`,
Impressions.timestamp AS `Impressions__timestamp`,
Impressions.name AS `Impressions__name`,
Impressions.lat AS `Impressions__lat`,
Impressions.lng AS `Impressions__lng`,
Impressions.personas_count AS `Impressions__personas_count`,
Impressions.modified AS `Impressions__modified`,
Beacons.id AS `Beacons__id`,
Beacons.uuid AS `Beacons__uuid`,
Beacons.major AS `Beacons__major`,
Beacons.minor_dec AS `Beacons__minor_dec`,
Beacons.minor_hex AS `Beacons__minor_hex`,
Beacons.impressions_count AS `Beacons__impressions_count`,
Beacons.created AS `Beacons__created`,
Beacons.modified AS `Beacons__modified`,
Zones.id AS `Zones__id`,
Zones.location_id AS `Zones__location_id`,
Zones.beacon_id AS `Zones__beacon_id`,
Zones.fixture_no AS `Zones__fixture_no`,
Zones.placement AS `Zones__placement`,
Zones.floor AS `Zones__floor`,
Zones.impressions_count AS `Zones__impressions_count`,
Zones.ignore_further_incidents AS `Zones__ignore_further_incidents`,
Zones.is_reviewed AS `Zones__is_reviewed`,
Zones.review_date AS `Zones__review_date`,
Zones.created AS `Zones__created`,
Zones.modified AS `Zones__modified`,
Locations.id AS `Locations__id`,
Locations.retailer_id AS `Locations__retailer_id`,
Locations.google_place_id AS `Locations__google_place_id`,
Locations.regional_name AS `Locations__regional_name`,
Locations.location AS `Locations__location`,
Locations.store_no AS `Locations__store_no`,
Locations.lat AS `Locations__lat`,
Locations.lng AS `Locations__lng`,
Locations.address1 AS `Locations__address1`,
Locations.address2 AS `Locations__address2`,
Locations.address3 AS `Locations__address3`,
Locations.city AS `Locations__city`,
Locations.state AS `Locations__state`,
Locations.postal_code AS `Locations__postal_code`,
Locations.region_id AS `Locations__region_id`,
Locations.country_id AS `Locations__country_id`,
Locations.zones_count AS `Locations__zones_count`,
Locations.contacts_count AS `Locations__contacts_count`,
Locations.created AS `Locations__created`,
Locations.modified AS `Locations__modified`,
Devices.id AS `Devices__id`,
Devices.os AS `Devices__os`,
Devices.bluetooth_enabled AS `Devices__bluetooth_enabled`,
Devices.impressions_count AS `Devices__impressions_count`,
Devices.modified AS `Devices__modified`,
Regions.id AS `Regions__id`,
Regions.country_name AS `Regions__country_name`,
Regions.subdiv AS `Regions__subdiv`,
Regions.subdiv_name AS `Regions__subdiv_name`,
Regions.level_name AS `Regions__level_name`,
Regions.alt_names AS `Regions__alt_names`,
Regions.subdiv_star AS `Regions__subdiv_star`,
Regions.subdiv_id AS `Regions__subdiv_id`,
Regions.country_id AS `Regions__country_id`,
Regions.country_code_2 AS `Regions__country_code_2`,
Regions.country_code_3 AS `Regions__country_code_3`,
Countries.id AS `Countries__id`,
Countries.country_name AS `Countries__country_name`,
Countries.alt_names AS `Countries__alt_names`,
Countries.code2 AS `Countries__code2`,
Countries.code3 AS `Countries__code3`,
Countries.iso_cc AS `Countries__iso_cc`,
Countries.fips_code AS `Countries__fips_code`,
Countries.fips_country_name AS `Countries__fips_country_name`,
Countries.un_region AS `Countries__un_region`,
Countries.un_subregion AS `Countries__un_subregion`,
Countries.comments AS `Countries__comments`,
Countries.created AS `Countries__created`,
Countries.modified AS `Countries__modified`
FROM
impressions Impressions
inner join beacons Beacons ON Beacons.id = (Impressions.beacon_id)
inner JOIN zones Zones ON Zones.id = (Impressions.zone_id)
inner JOIN devices Devices ON Devices.id = (Impressions.device_id)
INNER JOIN locations Locations ON Locations.id = (Zones.location_id)
LEFT JOIN regions Regions ON Regions.id = (Locations.region_id)
LEFT JOIN countries Countries ON Countries.id = (Locations.country_id)
ORDER BY
Impressions.timestamp desc
LIMIT
15 OFFSET 15
This query takes about 6 seconds to run. The EXPLAIN
output is as follows:
+----+-------------+-------------+--------+---------------------------------------+----------------+---------+---------------------------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+--------+---------------------------------------+----------------+---------+---------------------------------+-------+---------------------------------+
| 1 | SIMPLE | Devices | ALL | PRIMARY | NULL | NULL | NULL | 43274 | Using temporary; Using filesort |
| 1 | SIMPLE | Impressions | ref | zone_idx,device_id_idx2,beacon_id_idx | device_id_idx2 | 8 | gen1_d2go.Devices.id | 3 | NULL |
| 1 | SIMPLE | Zones | eq_ref | PRIMARY,fk_location_idx,comp | PRIMARY | 8 | gen1_d2go.Impressions.zone_id | 1 | NULL |
| 1 | SIMPLE | Beacons | eq_ref | PRIMARY | PRIMARY | 8 | gen1_d2go.Impressions.beacon_id | 1 | NULL |
| 1 | SIMPLE | Locations | eq_ref | PRIMARY | PRIMARY | 8 | gen1_d2go.Zones.location_id | 1 | NULL |
| 1 | SIMPLE | Regions | eq_ref | PRIMARY | PRIMARY | 4 | gen1_d2go.Locations.region_id | 1 | NULL |
| 1 | SIMPLE | Countries | eq_ref | PRIMARY | PRIMARY | 4 | gen1_d2go.Locations.country_id | 1 | NULL |
+----+-------------+-------------+--------+---------------------------------------+----------------+---------+---------------------------------+-------+---------------------------------+
7 rows in set (0.00 sec)
I do not understand why it is favoring the full scan of the Devices
table. The tables are all indexed, and the CREATE
statements for Impressions
and Devices
is as follows:
Impressions
CREATE TABLE `impressions` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`device_id` bigint(20) unsigned NOT NULL,
`beacon_id` bigint(20) unsigned NOT NULL,
`zone_id` bigint(20) unsigned NOT NULL,
`timestamp` datetime NOT NULL,
`google_place_id` bigint(20) unsigned DEFAULT NULL,
`name` varchar(60) DEFAULT NULL,
`lat` decimal(12,7) DEFAULT NULL,
`lng` decimal(12,7) DEFAULT NULL,
`personas_count` int(10) unsigned DEFAULT '0',
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
PRIMARY KEY (`id`,`timestamp`),
KEY `zone_idx` (`zone_id`),
KEY `device_id_idx2` (`device_id`),
KEY `beacon_id_idx` (`beacon_id`),
KEY `timestamp_idx` (`id`,`timestamp`),
KEY `ALL` (`id`,`timestamp`,`name`,`lat`,`lng`,`personas_count`,`modified`),
CONSTRAINT `beacon_id` FOREIGN KEY (`beacon_id`) REFERENCES `beacons` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `device2` FOREIGN KEY (`device_id`) REFERENCES `devices` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `zone_FK` FOREIGN KEY (`zone_id`) REFERENCES `zones` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=303907 DEFAULT CHARSET=utf8;
DEVICES
CREATE TABLE `devices` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`device_id` bigint(20) unsigned NOT NULL,
`advertiser_id` char(36) NOT NULL,
`os` varchar(80) DEFAULT NULL,
`bluetooth_enabled` tinyint(1) DEFAULT NULL,
`impressions_count` int(10) unsigned DEFAULT '0',
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `advertiser_idx` (`advertiser_id`),
KEY `ad_dev` (`device_id`,`advertiser_id`),
KEY `device_id` (`device_id`)
) ENGINE=InnoDB AUTO_INCREMENT=53628 DEFAULT CHARSET=utf8;
The kicker is:
When i use FORCE INDEX (timestamp_idx)
after the FROM impressions, it works great. It uses that index, and runs in about 0.078s. I do not know why it is trying to avoid using that index, or doing a select from that table first at all.
UPDATED
Including EXPLAIN with FORCE INDEX
Current database: gen1_d2go
+----+-------------+-------------+--------+------------------------------+---------------+---------+---------------------------------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+--------+------------------------------+---------------+---------+---------------------------------+------+-------+
| 1 | SIMPLE | Impressions | index | NULL | timestamp_idx | 5 | NULL | 15 | NULL |
| 1 | SIMPLE | Zones | eq_ref | PRIMARY,fk_location_idx,comp | PRIMARY | 8 | gen1_d2go.Impressions.zone_id | 1 | NULL |
| 1 | SIMPLE | Beacons | eq_ref | PRIMARY | PRIMARY | 8 | gen1_d2go.Impressions.beacon_id | 1 | NULL |
| 1 | SIMPLE | Locations | eq_ref | PRIMARY | PRIMARY | 8 | gen1_d2go.Zones.location_id | 1 | NULL |
| 1 | SIMPLE | Regions | eq_ref | PRIMARY | PRIMARY | 4 | gen1_d2go.Locations.region_id | 1 | NULL |
| 1 | SIMPLE | Countries | eq_ref | PRIMARY | PRIMARY | 4 | gen1_d2go.Locations.country_id | 1 | NULL |
| 1 | SIMPLE | Devices | eq_ref | PRIMARY | PRIMARY | 8 | gen1_d2go.Impressions.device_id | 1 | NULL |
+----+-------------+-------------+--------+------------------------------+---------------+---------+---------------------------------+------+-------+
7 rows in set (0.01 sec)
Upvotes: 2
Views: 1261
Reputation: 262
regarding the full table scan you can't optimize it even more. refer here for more information https://cryptkcoding.com/blog/2012/04/06/how-to-optimize-mysql-join-queries-through-indexing/
Upvotes: 0
Reputation: 142433
Impressions
needs an index starting with timestamp
. That way, the optimizer will hopefully decide to scan through Impressions
in timestamp
order, thereby avoiding the sort, etc.
A side lesson... You have 3 indexes starting with id, timestamp
. One is the PRIMARY KEY
. That means that the other two are unnecessary.
An additional speedup can be had thus:
ALTER TABLE Impressions
DROP INDEX timestamp_idx, -- as already mentioned
DROP INDEX ALL, -- ditto
DROP PRIMARY KEY, -- to rearrange it
ADD PRIMARY KEY(timestamp, id), -- thus
ADD INDEX(id); -- and keep AUTO_INCREMENT happy
Why? By having the PK start with timestamp, the query can scan the data instead of hopping between some index and the data. This will speed up the query in question. Caveat: It may hurt other queries.
Other notes...
CHAR(36)
smells like a UUID, correct? But with utf8, it takes exactly 108 bytes!. Change to CHAR(36) CHARACTER SET ascii NOT NULL
so that it will take only 36 bytes. (Or you could convert to `BINARY(16) for more savings; but that is another story, and more code.)
Unless you have billions of rows, BIGINT
(at 8 bytes) is overkill for ids. INT UNSIGNED
is only 4 bytes.
Smaller translates into faster in various ways.
Upvotes: 1