Reputation: 542
Hi i am in need of help to optimize a query for large database records above 1 Millions . Current query is taking 27-30 seconds to execute.
SELECT SQL_CALC_FOUND_ROWS
candidate.candidate_id AS candidateID,
candidate.candidate_id AS exportID,
candidate.is_hot AS isHot,
candidate.date_modified AS dateModifiedSort,
candidate.date_created AS dateCreatedSort,
candidate.first_name AS firstName,
candidate.last_name AS lastName,
candidate.city AS city,
candidate.state AS state,
candidate.key_skills AS keySkills,
owner_user.first_name AS ownerFirstName,
owner_user.last_name AS ownerLastName,
CONCAT(owner_user.last_name,
owner_user.first_name) AS ownerSort,
DATE_FORMAT(candidate.date_created, '%m-%d-%y') AS dateCreated,
DATE_FORMAT(candidate.date_modified, '%m-%d-%y') AS dateModified,
candidate.email2 AS email2 FROM
candidate
LEFT JOIN
user AS owner_user ON candidate.owner = owner_user.user_id
LEFT JOIN
saved_list_entry ON saved_list_entry.data_item_type = 100
AND saved_list_entry.data_item_id = candidate.candidate_id WHERE
is_active = 1 GROUP BY candidate.candidate_id ORDER BY dateModifiedSort
DESC LIMIT 0 , 15
is there any method to reduce the execution time of the query. I have also added the index in tables but it is not working fine.
Upvotes: 2
Views: 154
Reputation: 142298
Get rid of saved_list_entry
, it adds nothing.
Delay joining to user
. This will let you get rid of the GROUP BY
, which is adding a bunch of time, and possibly inflating the value of FOUND_ROWS()
.
Something like:
SELECT c2.*,
ou.first_name AS ownerFirstName,
ou.last_name AS ownerLastName,
CONCAT(ou.last_name, ou.first_name) AS ownerSort,
FROM
( SELECT SQL_CALC_FOUND_ROWS
c.candidate_id AS candidateID, c.candidate_id AS exportID,
c.is_hot AS isHot, c.date_modified AS dateModifiedSort,
c.date_created AS dateCreatedSort, c.first_name AS firstName,
c.last_name AS lastName, c.city AS city, c.state AS state,
c.key_skills AS keySkills,
DATE_FORMAT(c.date_created, '%m-%d-%y') AS dateCreated,
DATE_FORMAT(c.date_modified, '%m-%d-%y') AS dateModified,
c.email2 AS email2
FROM candidate AS c
WHERE is_active = 1
GROUP BY c.candidate_id
ORDER BY c.date_modified DESC -- note change here
LIMIT 0 , 15
) AS c2
LEFT JOIN user AS ou ON c2.owner = ou.user_id;
(I messed up the column order, but you can fix that.)
Index needed:
candidate: INDEX(is_active, candidate_id, date_modified)
Upvotes: 1
Reputation: 48139
First, a candidate, I would suspect is always a single entry by the ID, so why you are doing a GROUP BY is beyond me, that can PROBABLY be removed and improve a little.
Second, you are doing a left-join to the "saved_list_entry" table, but not actually pulling any columns from it, so this could probably be removed completely.
Third, from consideration of the GROUP BY being no longer applicable, I would suggest updating your indexes as:
table index
CANDIDATE ( is_active, date_modified, candidate_id, owner )
user ( user_id )
saved_list_entry ( data_item_id, data_item_type )
Since your order is by the date modified in descending order, having IT in the second position to the is_active (Where condition), it will plow through your first 15 quickly. However, your SQL_CALC_FOUND_ROWS will still have to plow through all other qualifying, but the result set would be pre-ordered by the index to match.
SELECT SQL_CALC_FOUND_ROWS
c.candidate_id AS candidateID,
c.candidate_id AS exportID,
c.is_hot AS isHot,
c.date_modified AS dateModifiedSort,
c.date_created AS dateCreatedSort,
c.first_name AS firstName,
c.last_name AS lastName,
c.city AS city,
c.state AS state,
c.key_skills AS keySkills,
u.first_name AS ownerFirstName,
u.last_name AS ownerLastName,
CONCAT(u.last_name, u.first_name) AS ownerSort,
DATE_FORMAT(c.date_created, '%m-%d-%y') AS dateCreated,
DATE_FORMAT(c.date_modified, '%m-%d-%y') AS dateModified,
c.email2 AS email2
FROM
candidate c
LEFT JOIN user u
ON c.owner = u.user_id
LEFT JOIN saved_list_entry s
ON c.candidate_id = s.data_item_id
AND s.data_item_type = 100
WHERE
c.is_active = 1
GROUP BY
c.candidate_id
ORDER BY
c.date_modified DESC
LIMIT
0, 15
Upvotes: 1
Reputation: 1381
I've changed changed the table alias in the below query, use this This Must Solve Your Problem
SELECT SQL_CALC_FOUND_ROWS
candidate.candidate_id AS candidateID,
candidate.candidate_id AS exportID,
candidate.is_hot AS isHot,
candidate.date_modified AS dateModifiedSort,
candidate.date_created AS dateCreatedSort,
candidate.first_name AS firstName,
candidate.last_name AS lastName,
candidate.city AS city,
candidate.state AS state,
candidate.key_skills AS keySkills,
user.first_name AS ownerFirstName,
user.last_name AS ownerLastName,
CONCAT(user.last_name,
user.first_name) AS ownerSort,
DATE_FORMAT(candidate.date_created, '%m-%d-%y') AS dateCreated,
DATE_FORMAT(candidate.date_modified, '%m-%d-%y') AS dateModified,
candidate.email2 AS email2 FROM
candidate
LEFT JOIN
user ON candidate.owner = user.user_id
LEFT JOIN
saved_list_entry ON saved_list_entry.data_item_type = 100
AND saved_list_entry.data_item_id = candidate.candidate_id WHERE
is_active = 1 GROUP BY candidate.candidate_id ORDER BY dateModifiedSort
DESC LIMIT 0 , 15
use the below queries to create indexes for join conditions
create index index_user user(user_id);
create index index_saved_list_entry saved_list_entry(data_item_type,data_item_id);
create index index_candidate candidate(is_active,candidate_id,dateModifiedSort);
Upvotes: 1
Reputation: 108651
You're using the query pattern
SELECT a vast bunch of stuff
FROM a complex assembly of JOIN operations
ORDER BY some variable DESC
LIMIT 0,small number
This is inherently inefficient: to satisfy your query the MySQL server must construct a vast result set, then it must sort the whole thing, then it takes the first fifteen rows and discards the rest.
To make this more efficient, you need to sort less stuff. Here's a way to do that. It looks like you want to find the most recently modified fifteen candidates. This query will, fairly cheaply, retrieve the IDs of those candidates. It exploits one of your indexes.
SELECT candidate_id
FROM candidate
ORDER BY date_modified DESC
LIMIT 0, 15
Then, you can use that as a subquery in your main query. Add a clause like this:
WHERE candidate.candidate_id IN (
SELECT candidate_id
FROM candidate
ORDER BY date_modified DESC
LIMIT 0, 15)
to your query in the appropriate place.
Notice also that you're using a nonstandard and potentially harmful MySQL specific extension to GROUP BY. Your query works, but if a candidate has more than one owner, it will return just one after selecting it randomly.
Finally, you seem to have placed single-column indexes on many of the columns in your large table. This is a notorious SQL antipattern: all those indexes slow down INSERT and UPDATE operations, and most of them probably serve no purpose in speeding up a query. Certainly, for this query, the only useful indexes are the one on date_modified
and the primary key.
Many complex queries are satisfied best using specific multi-column indexes. A bunch of single-column indexes are no help to such queries.
Upvotes: 1