Jakob Nielsen
Jakob Nielsen

Reputation: 5198

How can I optimize this slow sql query?

I have a user table with about 400k rows and need to retrieve some of those (max 100) while doing multiple JOINS with other tables. (Using mysql)

My problem is that the query currently has a Execution time of around 800ms (fetching 46ms). My goal is to optimize the query to reduce this execution time.

My initial query looks like this:

SELECT iduser, imageurl, birthdate, active, last_active, created, gender_idgender, orientation_idorientation, region_idregion, moji.moji_idmoji as moji_idmoji, status.value as status, requests.value as requestsRecv, requestssent.value as requestsSent, username.value as username, likecount.value as likeCount, reportcount.value as reportCount
FROM user u
  JOIN user_has_moji moji ON ( u.iduser = moji.user_iduser )
  JOIN user_has_data status ON ( u.iduser = status.user_iduser AND status.datatype_iddatatype = 1 )
  JOIN user_has_data requests ON ( u.iduser = requests.user_iduser AND requests.datatype_iddatatype = 3 )
  JOIN user_has_data requestssent ON ( u.iduser = requestssent.user_iduser AND requestssent.datatype_iddatatype = 4 )
  JOIN user_has_data username ON ( u.iduser = username.user_iduser AND username.datatype_iddatatype = 5 )
  JOIN user_has_data likecount ON ( u.iduser = likecount.user_iduser AND likecount.datatype_iddatatype = 6 )
  JOIN user_has_data reportcount ON ( u.iduser = reportcount.user_iduser AND reportcount.datatype_iddatatype = 7 )
WHERE banned = 0 AND active = 1 AND u.reviewstatus_idreviewstatus = 3 AND DATEDIFF(last_active,'2017-03-03 10:06:36') >=0 AND DATEDIFF(birthdate, '1999-03-03 15:06:36') >= 0 AND DATEDIFF(birthdate, '1967-03-03 15:06:36') <= 0 ORDER BY last_active DESC LIMIT 100

I thought that reason for the long execution time is that it has to JOIN the whole user table with the other tables before doing the query therefore I tried this approach:

SELECT iduser, imageurl, birthdate, active, last_active, created, gender_idgender, orientation_idorientation, region_idregion, moji.moji_idmoji as moji_idmoji, status.value as status, requests.value as requestsRecv, requestssent.value as requestsSent, username.value as username, likecount.value as likeCount, reportcount.value as reportCount
FROM ( SELECT * FROM user WHERE banned = 0 AND active = 1 AND reviewstatus_idreviewstatus = 3 AND DATEDIFF(last_active,'2017-03-03 10:01:30') >=0 AND DATEDIFF(birthdate, '1999-03-03 15:01:30') >= 0 AND DATEDIFF(birthdate, '1967-03-03 15:01:30') <= 0 ORDER BY last_active DESC LIMIT 100 ) as u
  JOIN user_has_moji moji ON ( u.iduser = moji.user_iduser )
  JOIN user_has_data status ON ( u.iduser = status.user_iduser AND status.datatype_iddatatype = 1 )
  JOIN user_has_data requests ON ( u.iduser = requests.user_iduser AND requests.datatype_iddatatype = 3 )
  JOIN user_has_data requestssent ON ( u.iduser = requestssent.user_iduser AND requestssent.datatype_iddatatype = 4 )
  JOIN user_has_data username ON ( u.iduser = username.user_iduser AND username.datatype_iddatatype = 5 )
  JOIN user_has_data likecount ON ( u.iduser = likecount.user_iduser AND likecount.datatype_iddatatype = 6 )
  JOIN user_has_data reportcount ON ( u.iduser = reportcount.user_iduser AND reportcount.datatype_iddatatype = 7 )

Thinking the number of rows needed to be joined would be greatly reduced and therefore reduce the execution time. This change seemed to have helped a bit but the query is still too low.

Can somebody see some other reason for the slow execution?

Upvotes: 0

Views: 46

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269443

You have a very reasonable response time, so your tables would appear to be properly indexed. But, let's look at the WHERE clause (I added the presumed table aliases):

WHERE u.banned = 0 AND u.active = 1 AND
      u.reviewstatus_idreviewstatus = 3 AND
      DATEDIFF(u.last_active,'2017-03-03 10:06:36') >=0 AND
      DATEDIFF(u.birthdate, '1999-03-03 15:06:36') >= 0 AND
      DATEDIFF(u.birthdate, '1967-03-03 15:06:36') <= 0

First, remove the datediff() and do simple comparison. I think you want this:

WHERE u.banned = 0 AND u.active = 1 AND
      u.reviewstatus_idreviewstatus = 3 AND
      u.last_active >= '2017-03-03 10:06:36' AND
      u.birthdate >= '1999-03-03 15:06:36' AND
      u.birthdate <= '1967-03-03 15:06:36'

I'm a little suspicious about the last three conditions, but using datediff() is confusing. That is one reason why simple comparisons are better.

You need to find these rows quickly, that suggests one of two indexes:

users(banned, active, reviewstatus_idreviewstatus, last_active, birthdate)

or

users(banned, active, reviewstatus_idreviewstatus, birthdate, last_active)

This depends on which whittles down the data more. My guess is that the last_active is the more important. Also note that the first three keys can be in any order, as long as they are the first three.

This might help, but I doubt that you will see a spectacular improvement.

Upvotes: 3

Related Questions