TPoy
TPoy

Reputation: 962

Attempting to create index for MySQL query

I don't have a lot of experience yet with MySQL and with databases in general, though I'm going head on into the development of a large-scale web app anyway. The following is the search query for my app that allows users to search for other users. Now that the primary table for this query dev_Profile has about 14K rows, the query is considerably slow (about 5 secs when running a query that returns the largest set possible). I'm sure there are many optimization tweaks that could be made here, but would creating an index be the most fundamental first step to do here? I've been first trying to learn about indexes on my own, and how to make an index for a query with multiple joins, but I'm just not quite grasping it. I'm hoping that seeing things in the context of my actual query could be more educational.

Here's the basic query:

SELECT 
  dev_Profile.ID AS pid,
  dev_Profile.Name AS username,
  IF(TIMESTAMPDIFF(SECOND, st1.lastActivityTime, UTC_TIMESTAMP()) > 300 OR ISNULL(TIMESTAMPDIFF(SECOND, st1.lastActivityTime, UTC_TIMESTAMP())), 0, 1) AS online, 
  FLOOR(DATEDIFF(CURRENT_DATE, dev_Profile.DOB) / 365) AS age,
  IF(dev_Profile.GenderID=1, 'M', 'F') AS sex, 
  IF(ISNULL(st2.Description), 0, st2.Description) AS relStatus, 
  st3.Name AS country, 
  IF(dev_Profile.RegionID > 0, st4.Name, 0) AS region, 
  IF(dev_Profile.CityID > 0, st5.Name, 0) AS city, 
  IF(ISNULL(st6.filename), 0, IF(st6.isApproved=1 AND st6.isDiscarded=0 AND st6.isModerated=1 AND st6.isRejected=0 AND isSizeAvatar=1, 1, 0)) AS hasPhoto, 
  IF(ISNULL(st6.filename), IF(dev_Profile.GenderID=1, 'http://www.mysite.com/lib/images/avatar-male-small.png', 'http://www.mysite.com/lib/images/avatar-female-small.png'), IF(st6.isApproved=1 AND st6.isDiscarded=0 AND st6.isModerated=1 AND st6.isRejected=0 AND isSizeAvatar=1, CONCAT('http://www.mysite.com/uploads/', st6.filename), IF(dev_Profile.GenderID=1, 'http://www.mysite.com/lib/images/avatar-male-small.png', 'http://www.mysite.com/lib/images/avatar-female-small.png'))) AS photo,
  IF(ISNULL(dev_Profile.StatusMessage), IF(ISNULL(dev_Profile.AboutMe), IF(ISNULL(st7.AboutMyMatch), 0, st7.AboutMyMatch), dev_Profile.AboutMe), dev_Profile.StatusMessage) AS text
FROM 
  dev_Profile 
  LEFT JOIN dev_User AS st1 ON st1.ID = dev_Profile.UserID 
  LEFT JOIN dev_ProfileRelationshipStatus AS st2 ON st2.ID = dev_Profile.ProfileRelationshipStatusID 
  LEFT JOIN Country AS st3 ON st3.ID = dev_Profile.CountryID 
  LEFT JOIN Region AS st4 ON st4.ID = dev_Profile.RegionID 
  LEFT JOIN City AS st5 ON st5.ID = dev_Profile.CityID 
  LEFT JOIN dev_Photos AS st6 ON st6.ID = dev_Profile.PhotoAvatarID 
  LEFT JOIN dev_DesiredMatch AS st7 ON st7.ProfileID = dev_Profile.ID
WHERE 
  dev_Profile.ID != 11222 /* $_SESSION['ProfileID'] */
  AND st1.EmailVerified = 'true'
  AND st1.accountIsActive=1
ORDER BY st1.lastActivityTime DESC LIMIT 900;

The speed of this query (too slow, as you can see):

900 rows in set (5.20 sec)

The EXPLAIN for this query:

+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| id | select_type | table       | type   | possible_keys | key     | key_len | ref                                         | rows  | Extra                                        |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | dev_Profile | range  | PRIMARY       | PRIMARY | 4       | NULL                                        | 13503 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | st2         | eq_ref | PRIMARY       | PRIMARY | 1       | syk.dev_Profile.ProfileRelationshipStatusID |     1 |                                              |
|  1 | SIMPLE      | st3         | eq_ref | PRIMARY       | PRIMARY | 4       | syk.dev_Profile.CountryID                   |     1 |                                              |
|  1 | SIMPLE      | st4         | eq_ref | PRIMARY       | PRIMARY | 4       | syk.dev_Profile.RegionID                    |     1 |                                              |
|  1 | SIMPLE      | st5         | eq_ref | PRIMARY       | PRIMARY | 4       | syk.dev_Profile.CityID                      |     1 |                                              |
|  1 | SIMPLE      | st1         | eq_ref | PRIMARY       | PRIMARY | 4       | syk.dev_Profile.UserID                      |     1 | Using where                                  |
|  1 | SIMPLE      | st6         | eq_ref | PRIMARY       | PRIMARY | 4       | syk.dev_Profile.PhotoAvatarID               |     1 |                                              |
|  1 | SIMPLE      | st7         | ALL    | NULL          | NULL    | NULL    | NULL                                        |   442 |                                              |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+

It's also possible that the query can have more WHERE and HAVING clauses, if a user's search contains additional criteria. The additional clauses are (set with example values):

AND dev_Profile.GenderID = 1 
AND dev_Profile.CountryID=127
AND dev_Profile.RegionID=36
AND dev_Profile.CityID=601
HAVING (age >= 18 AND age <= 50)
AND online=1 
AND hasPhoto=1 

This is the EXPLAIN for the query using all possible WHERE and HAVING clauses:

+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
| id | select_type | table       | type   | possible_keys | key     | key_len | ref                                         | rows  | Extra                                        |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | dev_Profile | range  | PRIMARY       | PRIMARY | 4       | NULL                                        | 13503 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | st2         | eq_ref | PRIMARY       | PRIMARY | 1       | syk.dev_Profile.ProfileRelationshipStatusID |     1 |                                              |
|  1 | SIMPLE      | st3         | const  | PRIMARY       | PRIMARY | 4       | const                                       |     1 |                                              |
|  1 | SIMPLE      | st4         | const  | PRIMARY       | PRIMARY | 4       | const                                       |     1 |                                              |
|  1 | SIMPLE      | st5         | const  | PRIMARY       | PRIMARY | 4       | const                                       |     1 |                                              |
|  1 | SIMPLE      | st1         | eq_ref | PRIMARY       | PRIMARY | 4       | syk.dev_Profile.UserID                      |     1 | Using where                                  |
|  1 | SIMPLE      | st6         | eq_ref | PRIMARY       | PRIMARY | 4       | syk.dev_Profile.PhotoAvatarID               |     1 |                                              |
|  1 | SIMPLE      | st7         | ALL    | NULL          | NULL    | NULL    | NULL                                        |   442 |                                              |
+----+-------------+-------------+--------+---------------+---------+---------+---------------------------------------------+-------+----------------------------------------------+

I'm not even sure if this is TMI or not enough.

Is an index the right step to take here? If so, could someone get me going in the right direction?

Upvotes: 1

Views: 82

Answers (2)

One should use indexes that are relevant for frequent queries. An index just slightly degrades write-performance while immensely speeds searches. As a rule of thumb, objects own IDs should be indexed as PRIMARY key and it's a good idea to have an index on column-groups that appear always together in a query. I figure you should index GenderID, CountryID, RegionID, CityID, age, online and hasPhoto. You should provide the schema of at least dev_Profile if you think that the right indexes are not used.

Notice that country/region/city IDs might represent redundant information. Your design may be suboptimal.

Notice2: you're doing awfully lot of application logic in SELECT. SQL is not designed for these lots of IF-in-IF-in-IF clauses, and because of the URLs the query is returning much larger a table than if would if you just requested the relevant field (i.e. filename, genderID, and so on). There might be times when those exact interpreted values have to be returned by the query, by in general you are better off (in the aspects of speed and readability) to code these processing steps into your application code.

Upvotes: 0

Neil
Neil

Reputation: 55402

The right step is whatever speeds up your query!

With your original query, I would say that you end up doing a table scan on the dev_Profile table as there are no indexable conditions on it. With your modified query, it depends on the number of different values allowed in the column - if there are may duplicates then the index may not get used as it has to fetch the table anyway in order to complete the rest of the query.

I have read your plan correctly then you are joining all of your other tables on an indexed non-nullable column already (except for st7, which doesn't seem to be using an index for some reason). It therefore looks as if you should not be using left joins. This would then allow the use of an index on (EmailVerified, accountIsActive, lastActivityTime) on table st1.

Upvotes: 1

Related Questions