Reputation: 10408
I have a MySQL table running on the InnoDB engine called `squares`
that has roughly 2,250,000 rows with the following table structure:
CREATE TABLE `squares` (
`square_id` int(7) unsigned NOT NULL,
`ref_coord_lat` double(8,6) NOT NULL,
`ref_coord_long` double(9,6) NOT NULL,
PRIMARY KEY (`square_id`),
KEY `ref_coord_lat` (`ref_coord_lat`),
KEY `ref_coord_long` (`ref_coord_long`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The first column, `square_id`, holds a simple incrementing value from 0 to 2.25M, while `ref_coord_lat` and `ref_coord_long` hold a point's latitude and longitude coordinates in decimal degrees, respectively.
This is a read-only table. No additional rows will be added, and the only query which needs to be run against it is the following:
SELECT * FROM `squares` WHERE
`ref_coord_lat` BETWEEN :southLat AND :northLat AND
`ref_coord_long` BETWEEN :westLong AND :eastLong
...where the values following the colons are PHP PDO placeholders. Essentially, the goal of this query is to fetch all coordinate points in the table that are currently in the viewport of a Google Maps window which is bounded by the 4 coordinates in the query.
I've limited the zoom level where this query is run with the Google Maps API, so that the maximum amount of rows that can be fetched is ~5600. As the zoom level increases, the resultant fetch total decreases significantly.
Running such an example query directly in phpMyAdmin takes 1.40-1.45 seconds. This is far too long. I'm already running standard indices on `ref_coord_lat` and `ref_coord_long`, which brought the query time down from ~5 seconds, but this is still much too slow for a map where an end user expects a timely response.
My question is simply: How can I further optimize this table/query to increase the speed at which results are fetched?
Upvotes: 6
Views: 1581
Reputation: 56809
The content here is originally written by OP (Antilogical) as an edit to the question. I moved the answer portion from the question to here and made this a community wiki. @Antilogical, please ping me with comment if you want to post your own answer to gain reputation.
I managed to reduce my query time from an initial 5 seconds down to 0.6-0.7ms. I stumbled upon the question 'How to further optimize this MySQL table for a single Query', which led me to switch my table from InnoDB to MyISAM and to use the geospatial extensions to represent my coordinate points.
Firstly, I switched from InnoDB to MyISAM, which is better suited to the MySQL spatial extensions.
ALTER TABLE `squares` ENGINE=MyISAM;
Then, I created a geospatial column called `coordinate`, which holds a POINT object (simply a concatenation of `ref_coord_lat` and `ref_coord_long`):
UPDATE `squares` SET `coordinate` = GeomFromText(CONCAT('POINT(', `ref_coord_lat`,' ', `ref_coord_long`, ')'));
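The answer doesn't show the DDL that creates the `coordinate` column or the spatial index it goes on to mention; a minimal sketch, assuming MySQL 5.x-era syntax (where spatial indexes required a MyISAM table and a NOT NULL geometry column):

```sql
-- The POINT column must exist before the UPDATE above can populate it
ALTER TABLE `squares` ADD COLUMN `coordinate` POINT NOT NULL;

-- Once the column is populated, add the spatial (R-tree) index
ALTER TABLE `squares` ADD SPATIAL INDEX (`coordinate`);
```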
I then added a spatial index to `coordinate`, which dramatically increased the query performance. Initially, before using the geospatial extensions, my query to select fields from my database was:
SELECT * FROM `squares` WHERE `ref_coord_lat` BETWEEN *somecoordinate* AND *somecoordinate* AND `ref_coord_long` BETWEEN *somecoordinate* AND *somecoordinate*
What this query essentially does is imitate a bounding box by putting two limits/conditions on each axis (latitude and longitude). Note that `*` at this point meant all three fields of my database, since I hadn't created `coordinate` yet. When I switched to the MySQL spatial extensions, I could check against my new `coordinate` column using a minimum bounding rectangle with the function `MBRContains()`, which is part of the MySQL geospatial extension set.
SELECT * FROM `squares` WHERE MBRContains(GeomFromText('POLYGON((*my four bounds now go here in lat,lon pairs*))'), `coordinate`);
Notice how I'm still selecting all fields with `*`? You don't need to do that. The `coordinate` column is merely acting as an index to look values up against, so I now select everything but that column via the query below, which gives a significant speed increase over the one directly above.
SELECT `square_id`, `ref_coord_lat`, `ref_coord_long` FROM `squares` WHERE MBRContains(GeomFromText('POLYGON((*my four bounds now go here in lat,lon pairs*))'), `coordinate`);
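The polygon bounds are elided above; purely as an illustration (these coordinates are made-up example values, following the answer's lat-lon ordering), a viewport spanning latitudes 50-51 and longitudes 4-5 would look like this. Note that a WKT polygon ring must be closed, i.e. the first vertex is repeated as the last:

```sql
SELECT `square_id`, `ref_coord_lat`, `ref_coord_long`
FROM `squares`
WHERE MBRContains(
    -- closed ring: five vertices, first == last
    GeomFromText('POLYGON((50 4, 51 4, 51 5, 50 5, 50 4))'),
    `coordinate`
);
```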
There's been a multiple order of magnitude increase in speed:
- Original `squares` table (InnoDB, no indices): ~5 seconds
- With standard indices on `ref_coord_lat` & `ref_coord_long`: ~1.40-1.45 seconds
- After increasing `innodb_buffer_pool_size` from 16M to 256M
- After switching to MyISAM and adding a spatial index on the `coordinate` column: 0.6-0.7ms
Database optimization? Done.
Upvotes: 3
Reputation: 116078
Creating a compound index on (lat, long) should help a lot.
However, the right solution is to take a look at the MySQL spatial extensions. Spatial support was specifically created to deal with two-dimensional data and queries against such data. If you create appropriate spatial indexes, your typical query performance should easily exceed that of a compound index on (lat, long).
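A minimal sketch of the suggested compound index (the index name `idx_lat_long` is an arbitrary choice, not from the answer):

```sql
-- Composite B-tree index covering both range conditions of the viewport query
ALTER TABLE `squares` ADD INDEX `idx_lat_long` (`ref_coord_lat`, `ref_coord_long`);
```

Note that with two simultaneous range predicates, MySQL can only range-scan the leading column of a composite index and must filter the second condition row by row, which is one reason a spatial index tends to win here.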
Upvotes: 3
Reputation: 1596
Your structure seems quite OK. 2.25M rows is not that much. Your rows are small, and the comparisons you do are only on double values. It should be faster.
Try running the `ANALYZE`, `OPTIMIZE`, `CHECK`, and `REPAIR` commands on your table to make sure your indexes are correctly constructed.
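Concretely, for the table from the question those maintenance commands would look like the following (note that `REPAIR TABLE` only applies to MyISAM-style engines; on InnoDB, `OPTIMIZE TABLE` is mapped to a table rebuild):

```sql
ANALYZE TABLE `squares`;   -- refresh the index statistics the optimizer relies on
CHECK TABLE `squares`;     -- verify table and index integrity
OPTIMIZE TABLE `squares`;  -- defragment / rebuild the table
REPAIR TABLE `squares`;    -- MyISAM-only: fix a possibly corrupted table
```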
Once this is done, you should investigate deeper into the system. What is slowing down the query?
Use monitoring to gather data about your SQL cache, memory usage, etc. It will help you diagnose the issue.
Good luck with your project.
Upvotes: 3
Reputation: 21728
While not very elegant, splitting into multiple tables (say one per 30 degrees of latitude) usually helps. Your query is such that it is obvious which tables contain the required points.
Also, use EXPLAIN to investigate the problem.
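For the query from the question, that would be (placeholders as in the original):

```sql
-- EXPLAIN reports which index the optimizer chooses and how many rows it expects to examine
EXPLAIN SELECT * FROM `squares`
WHERE `ref_coord_lat`  BETWEEN :southLat AND :northLat
  AND `ref_coord_long` BETWEEN :westLong AND :eastLong;
```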
Upvotes: 0