Reputation: 4300
The following query is used to do a member search, in this example, only using the last name. The query returns in a few seconds if searching on a full matching name; but if :LastName = 'S'
, then the query takes upwards of 12 seconds to return.
How can I speed up this query? If I can do it in under a second with two queries, shouldn't I be able to make it just one query, just as fast? Because of plugins and other methods, it would be easiest for me to have this be one query, hence my question.
The Member
table holds every member we've ever had. The table has some members who we don't have any registration for, so they only exist in this table, not in Registration
or Registration_History
. Registration_History
has extra information on most members that I want to display. Registration
has most of the same information as RH (RH has some fields that Reg doesnt), but sometimes it has members that RH does not have, which is why it is joined here. EDIT: Members can have multiple rows in Registration. I want to fill the columns from Registration_History, however, some legacy members ONLY exist in Registration. Unlike other members,these legacy members only ever have 1 row in Registration, so I don't need to worry about how Registration is sorted, just that it only grabs 1 row from there.
SQL Fiddle with sample database design
MemberID
is indexed in all 3 tables. Before I put the SELECT RHSubSelect.rehiId
subquery in, this query was taking almost a full minute to return.
If I split the query into 2 queries, doing this:
SELECT
MemberID
FROM
Member
WHERE
Member.LastName LIKE CONCAT('%', :LastName, '%')
And then putting those MemberID
s into an array and passing that array to RHSubSelect.MemberID IN ($theArray)
(instead of the Member subquery), the results come back very quickly (about a second).
Full query: (Full SELECT statement is in the Fiddle, SELECT *
for brevity)
SELECT
*
FROM
Member
LEFT JOIN
Registration_History FORCE INDEX (PRIMARY)
ON
Registration_History.rehiId = (
SELECT
RHSubSelect.rehiId
FROM
Registration_History AS RHSubSelect
WHERE
RHSubSelect.MemberID IN (
SELECT
Member.MemberID
FROM
Member
WHERE
Member.LastName LIKE CONCAT('%', :LastName, '%')
)
ORDER BY
RHSubSelect.EffectiveDate DESC
LIMIT 0, 1
)
LEFT JOIN
Registration FORCE INDEX(MemberID)
ON
Registration.MemberID = Member.MemberID
WHERE
Member.LastName LIKE CONCAT('%', :LastName, '%')
GROUP BY
Member.MemberID
ORDER BY
Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
MySQL Explain, with the FORCE INDEX()
in the query:
(If the image with the explain doesn't show, it's also here: http://oi41.tinypic.com/2iw4t8l.jpg)
Upvotes: 1
Views: 436
Reputation: 21523
The main thing you seem to be checking is the last name with a leading % in the like. This renders the index on that column useless, and your SQL is searching for it twice.
I am not 100% sure what you are trying to do. Your SQL appears to get all the members who match on name to the one required, then get the last registration_history record for those. The one you get could be from any one of the matching members, which seems strange unless you only ever expect to get a single member.
If this is the case the following minor tidy (removing and IN and changing it to a JOIN) up will possibly slightly improve things.
SELECT
COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
Member.MemberID,
Member.LastName,
Member.FirstName,
CASE
WHEN Member.LastNameTrimmed = :LastName
THEN 1
WHEN Member.LastNameTrimmed LIKE CONCAT(:LastName, '%')
THEN 2
ELSE 3
END AS Relevance
FROM Member
LEFT JOIN Registration_History FORCE INDEX (PRIMARY)
ON Registration_History.rehiId =
(
SELECT RHSubSelect.rehiId
FROM Registration_History AS RHSubSelect
INNER JOIN Member
ON RHSubSelect.MemberID = Member.MemberID
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
ORDER BY RHSubSelect.EffectiveDate DESC
LIMIT 0, 1
)
LEFT JOIN Registration FORCE INDEX(MemberID)
ON Registration.MemberID = Member.MemberID
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
However if this is not quite what you want then further changes might be possible.
Bit more of a clean up, eliminating one of the LIKEs with a leading wildcard:-
SELECT
COALESCE(NULLIF(Sub2.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
COALESCE(NULLIF(Sub2.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
Member.MemberID,
Member.LastName,
Member.FirstName,
CASE
WHEN Member.LastNameTrimmed = :LastName
THEN 1
WHEN Member.LastNameTrimmed LIKE CONCAT(:LastName, '%')
THEN 2
ELSE 3
END AS Relevance
FROM Member
LEFT OUTER JOIN Registration
ON Registration.MemberID = Member.MemberID
LEFT OUTER JOIN
(
SELECT Registration_History.MemberID, Registration_History.rehiID, Registration_History.RegYear, Registration_History.RegNumber
FROM Registration_History
INNER JOIN
(
SELECT RHSubSelect.MemberID, MAX(RHSubSelect.EffectiveDate) AS EffectiveDate
FROM Registration_History AS RHSubSelect
GROUP BY RHSubSelect.MemberID
) Sub1
ON Registration_History.MemberID = Sub1.MemberID AND Registration_History.EffectiveDate = Sub1.EffectiveDate
) Sub2
ON Sub2.MemberID = Member.MemberID
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
This is getting all the members with a matching name, their matching registration record and their registration_history record with the latest EffectiveDate.
I do not think the last GROUP BY is necessary (assuming that there is a 1 to 1 relationship between Members and Registration, and if not you probably want to use something other than GROUP BY), but I have left it in for now.
Afraid without table declares and some same data I can't really test it.
EDIT - Bit of a play, trying to reduce the quantities it is dealing with earlier in the select:-
SELECT
COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Sub1.Year, '')) AS RegYear,
COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Sub1.RegNumber, ''), NULLIF(Sub1.MemberID, '')) AS RegNumber,
Sub1.MemberID,
Sub1.LastName,
Sub1.FirstName,
CASE
WHEN Sub1.LastName = :LastName
THEN 1
WHEN Sub1.LastName LIKE CONCAT(:LastName, '%')
THEN 2
ELSE 3
END AS Relevance
FROM
(
SELECT
Member.MemberID,
Member.LastName,
Member.FirstName,
Registration.Year,
Registration.RegNumber,
MAX(Registration_History.EffectiveDate) AS EffectiveDate
FROM Member
LEFT OUTER JOIN Registration
ON Registration.MemberID = Member.MemberID
LEFT OUTER JOIN Registration_History
ON Registration_History.MemberID = Member.MemberID
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
GROUP BY Member.MemberID,
Member.LastName,
Member.FirstName,
Registration.Year,
Registration.RegNumber
) Sub1
LEFT OUTER JOIN Registration_History
ON Registration_History.MemberID = Sub1.MemberID AND Registration_History.EffectiveDate = Sub1.EffectiveDate
ORDER BY Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
EDIT again.
Give this a try. The items you are sorting on are all from the members table so possibly makes sense to exclude the as early as possible in a subselect.
SELECT
COALESCE(NULLIF(Registration_History2.EffectiveDate, ''), NULLIF(Registration2.Year, '')) AS RegYear,
COALESCE(NULLIF(Registration_History2.RegNumber, ''), NULLIF(Registration2.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
Member.MemberID,
Member.LastName,
Member.FirstName,
Member.Relevance
FROM
(
SELECT Member.MemberID,
Member.LastName,
Member.FirstName,
CASE
WHEN Member.LastName = :LastName
THEN 1
WHEN Member.LastName LIKE CONCAT(:LastName, '%')
THEN 2
ELSE 3
END AS Relevance
FROM Member
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
ORDER BY Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
) Member
LEFT OUTER JOIN
(
SELECT MemberID, MAX(EffectiveDate) AS EffectiveDate
FROM Registration_History
GROUP BY MemberID
) Registration_History
ON Registration_History.MemberID = Member.MemberID
LEFT OUTER JOIN Registration_History Registration_History2
ON Registration_History2.MemberID = Registration_History.MemberID
AND Registration_History2.EffectiveDate = Registration_History.EffectiveDate
LEFT OUTER JOIN
(
SELECT MemberID, MAX(Year) AS Year
FROM Registration
GROUP BY MemberID
) Registration
ON Registration.MemberID = Member.MemberID
LEFT OUTER JOIN
(
SELECT MemberID, Year, MAX(RegNumber) AS RegNumber
FROM Registration
GROUP BY MemberID, Year
) Registration2
ON Registration2.MemberID = Member.MemberID
AND Registration2.Year = Registration.Year
EDIT again
Not tested the following so this is more for just an idea of another way to try to get around the issue, using a little trick with GROUP_CONCAT:-
SELECT
COALESCE(NULLIF(Registration_History.EffectiveDate, ''), NULLIF(Registration.Year, '')) AS RegYear,
COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
Member.MemberID,
Member.LastName,
Member.FirstName,
Member.Relevance
FROM
(
SELECT Member.MemberID,
Member.LastName,
Member.FirstName,
CASE
WHEN Member.LastName = :LastName
THEN 1
WHEN Member.LastName LIKE CONCAT(:LastName, '%')
THEN 2
ELSE 3
END AS Relevance
FROM Member
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
ORDER BY Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
) Member
LEFT OUTER JOIN
(
SELECT MemberID,
SUBSTRING_INDEX(GROUP_CONCAT(EffectiveDate ORDER BY EffectiveDate DESC), ",", 1) AS EffectiveDate,
SUBSTRING_INDEX(GROUP_CONCAT(RegNumber ORDER BY EffectiveDate DESC), ",", 1) AS RegNumber
FROM Registration_History
GROUP BY MemberID
) Registration_History
ON Registration_History.MemberID = Member.MemberID
LEFT OUTER JOIN
(
SELECT MemberID,
SUBSTRING_INDEX(GROUP_CONCAT(Year ORDER BY Year DESC), ",", 1) AS Year,
SUBSTRING_INDEX(GROUP_CONCAT(RegNumber ORDER BY Year DESC), ",", 1) AS RegNumber
FROM Registration
GROUP BY MemberID
) Registration
ON Registration.MemberID = Member.MemberID
Upvotes: 1
Reputation: 23001
My suggestion would be a query like this:
SELECT *
FROM Member
LEFT JOIN Registration USING (MemberID)
LEFT JOIN Registration_History ON rehiID = (
SELECT rehiID
FROM Registration_History AS RHSubSelect
WHERE RHSubSelect.MemberID = Member.MemberID
ORDER BY EffectiveDate DESC
LIMIT 1
)
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
The way it works, is you start by selecting from the Member table matching against the LastName. You then have simple LEFT JOIN
to the Registration table, since a particular member can have at most 1 entry in that table. Finally you LEFT JOIN
the Registration_History table with a subselect.
The subselect looks for the most recent EffectiveDate matching the current MemberID and returns the rehiID for that record. The LEFT JOIN
must then match that rehiID exacty. If there are no entries in the Registration_History for that member, then nothing is joined.
In theory this should be relatively fast, since you are only performing the LIKE
comparison in the main query. The Registration join should be fast since the table is indexed on MemberID. However, I suspect you'll need an additional index on the Registration_History to get the best performance.
You've already got the primary key, rehID, indexed which is what we need for the LEFT JOIN
on rehID. However, the subquery needs to match the MemberID in the WHERE
clause as well as sorting by the EffectiveDate. For the best performance there, I think you'll need an additional index combining the MemberID and EffectiveDate columns.
Note that my example query is just the bare minimum to keep things simple. You'll obviously need to replace the *
with all the fields you want returned (the same as your original query). Also you'll need to add your ORDER BY
and LIMIT
clauses. However, the GROUP BY
should not be required.
SQL Fiddle link: http://sqlfiddle.com/#!2/4a947a/1
The above fiddle show the full query except it has the last name hardcoded. I've modified your original sample data to include a couple more records and changed some of the values. I also added the extra index on the Registration_History table.
Optimising for the LIMIT
If you're going to doing timing runs again, I'd be curious to see how my query performs when using the modification suggested by Kickstart to do a subselect against the Member table first, before joining the Registration and Registration_History tables.
SELECT
COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
Member.MemberID,
Member.LastName,
Member.FirstName,
Member.Relevance
FROM (
SELECT MemberID, LastName, FirstName,
CASE
WHEN Member.LastNameTrimmed = :LastName THEN 1
WHEN Member.LastNameTrimmed LIKE CONCAT(:LastName, '%') THEN 2
ELSE 3
END AS Relevance
FROM Member
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000
) Member
LEFT JOIN Registration USING (MemberID)
LEFT JOIN Registration_History ON rehiID = (
SELECT rehiID
FROM Registration_History AS RHSubSelect
WHERE RHSubSelect.MemberID = Member.MemberID
ORDER BY EffectiveDate DESC
LIMIT 1
)
When using a LIMIT, this should perform significantly better than my original query, since it won't have to carry out a bunch of unnecessary joins for the records that are excluded by the LIMIT.
Upvotes: 1
Reputation: 59515
If I understood your problem correctly (you just need to select particular users and their latest history record - is that correct)? If yes, your problem is actually very easy variant of greatest record per group problem. No need for any subqueries:
SELECT Member.*, rh1.*
FROM Member
LEFT JOIN Registration_History AS rh1 USING (MemberID)
LEFT JOIN Registration_History AS rh2
ON rh1.MemberId = rh2.MemberId AND rh1.EffectiveDate < rh2.EffectiveDate
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
AND rh2.MemberId IS NULL
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000
(#2 was removed, taking #3 here to avoid confusion in comments)
SELECT Member.*, max(rh1.EffectiveDate), rh1.*
FROM Member
LEFT JOIN Registration_History AS rh1 USING (MemberID)
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000
This one was inspired by James query, but removing the limit
and order by
(note that you should have defined index on EffectiveDate for not only this, but all queries to be efficient!)
select *
from Member
left join Registration_History AS rh1 on rh1.MemberID = Member.MemberID
and rh1.EffectiveDate = (select max(rh2.EffectiveDate)
from Registration_History as rh2
where rh2.MemberID = Member.MemberID)
)
WHERE Member.LastName LIKE CONCAT('%', :LastName, '%')
ORDER BY Relevance ASC,LastName ASC,FirstName ASC
LIMIT 0, 1000
Please post the actual durations in your db!
Upvotes: 0
Reputation: 48139
Ok, here's my shot and I used a variety of pieces. One, I had to take the "relevance" field from one as you did not indicate how to make it work. Next, since you wanted the latest entry from the registration history for a given member (if they existed in R/H), it appears that the effective date correlated with the ReHiID so I used that as it appears that would be a great key to work on for subsequent left-join.
So, the inner query makes the preliminary pass on just the criteria of the name you are looking for, applies relevance and limits the 1000 entries there. This way it doesn't have to go through 20,000 entries at the outer level and join... just the 1000 that could qualify.
That result is then left-joined to the other tables as indicated... Registration of only a single entry (if exist) and left-joined to the R/H on the member AND the max ReHiID.
To apply the name you are looking for, just change the ( select @LookForMe := 'S' ) sqlvars line in the query...
select *
from
( select
M.*,
max( RH.EffectiveDate ) as MaxEffectiveDate,
max( R.RegNumber ) as MaxRegNumber,
CASE WHEN M.LastNameTrimmed = @LookForMe THEN 1
WHEN M.LastNameTrimmed LIKE CONCAT(@LookForMe, '%') THEN 2
ELSE 3 END AS Relevance
from
( select @LookForMe := 'S' ) sqlvars,
Member M
LEFT JOIN Registration_History RH
on M.MemberID = RH.MemberID
LEFT JOIN Registration R
on M.MemberID = R.MemberID
where
M.LastName LIKE CONCAT('%', 'S', '%')
group by
M.MemberID
order by
Relevance,
M.LastName,
M.FirstName
limit
0,1000 ) PreQuery
LEFT JOIN Registration R2
on PreQuery.MemberNumber = R2.MemberNumber
AND PreQuery.MaxRegNumber = R2.RegNumber
LEFT JOIN Registration_History RH2
ON PreQuery.MemberNumber = RH2.MemberNumber
AND PreQuery.MaxEffectiveDate = RH2.EffectiveDate
Lets see how quick this runs with your production data and how close we get.
Upvotes: 0
Reputation: 36107
Try this query:
set @lastname = 'Smith1';
-- explain extended
SELECT
COALESCE(NULLIF(Registration_History.RegYear, ''), NULLIF(Registration.Year, '')) AS RegYear,
COALESCE(NULLIF(Registration_History.RegNumber, ''), NULLIF(Registration.RegNumber, ''), NULLIF(Member.MemberID, '')) AS RegNumber,
Member.MemberID,
Member.LastName,
Member.FirstName,
CASE
WHEN Member.LastNameTrimmed = 'Smith' THEN 1
WHEN Member.LastNameTrimmed LIKE CONCAT(@lastname, '%') THEN 2
ELSE 3
END AS Relevance
FROM (
SELECT Member.*,
( SELECT RHSubSelect.rehiId
FROM Registration_History AS RHSubSelect
WHERE RHSubSelect.MemberID = Member.MemberID
ORDER BY RHSubSelect.EffectiveDate DESC
LIMIT 0,1
) rh_MemberId
FROM Member
WHERE Member.LastName LIKE CONCAT('%', @lastname, '%')
) Member
LEFT JOIN Registration_History
ON Registration_History.rehiId = Member.rh_MemberId
LEFT JOIN Registration -- FORCE INDEX(MemberID)
ON Registration.MemberID = Member.MemberID
GROUP BY Member.MemberID
ORDER BY Relevance ASC,LastName ASC,FirstName asc
LIMIT 0, 1000
;
Upvotes: 0