Reputation: 11
I am trying to optimize the following T-SQL query:
SELECT Person.*
FROM Person
WHERE ZipCode LIKE '123%'
AND City = 'Washington'
AND NumberOfHomes in (1, 2, 3)
AND
(
EXISTS
(
SELECT * FROM House
WHERE Person.ID = House.PersonID
AND House.Type = 'TOWNHOUSE'
AND House.Size = 'Medium'
)
OR
EXISTS
(
SELECT * FROM Color
WHERE Person.ID = Color.PersonID
AND Color.Foreground IN ('Green', 'Blue', 'Purple')
)
)
I'd greatly appreciate any response in optimizing the query.
In particular, is there a way to convert the query into a more efficient query using only a single SELECT statement without any of the inner SELECT statements?
Thanks!
Upvotes: 1
Views: 355
Reputation: 4934
Left join and checking for null will be quicker than doing existence checks. Also, if NumberofHomes is an integer, doing BETWEEN
will be the same as IN
.
SELECT p.*
FROM Person p
LEFT JOIN House h
ON p.ID = h.PersonID
AND h.Type = 'TOWNHOUSE'
AND h.Size = 'Medium'
LEFT JOIN Color c
ON p.ID = c.PersonID
AND c.Foreground IN ('Green', 'Blue', 'Purple')
WHERE p.ZipCode LIKE '123%'
AND p.City = 'Washington'
AND p.NumberOfHomes BETWEEN 1 AND 3
AND (h.PersonID is not null or c.PersonID is not null)
OR you can try something like this...
select t.*
from (
select personid from house
where type = 'townhouse' and size = 'medium'
union
select personid from color
where foreground in ('green','blue','purple')
) pid
cross apply (
select *
from person p
where p.id = pid.personid
and p.zipcode like '123%'
and p.city = 'washington'
and p.numberofhomes between 1 and 3
) t
where t.id is not null
It's really difficult to optimize these blind. Depending on the distribution of your data, the above query may give you better results.
Upvotes: 0
Reputation: 11
Often optimizing and having several different select statements are different topics as the query optimizer (SQL Server) often will take your sql statement and run it the way it sees to be the most efficient way it sees fit.
Saying that yes are several different ways you can take your statements and combine them into one sql statement here is an example. This will preserve your person table and get matches from House OR Color tables that match your criteria.
<!-- language:SQL-->
SELECT *
FROM Person Left Outer Join House ON Person.ID = House.PersonID Left Outer Join Color ON
Person.ID= Color.PersonID
WHERE (ZipCode LIKE '123%'
AND City = 'Washington'
AND Person.NumberofHomes in (1, 2, 3) )
AND (
House.Type = 'TOWNHOUSE'
AND House.Size = 'Medium'
)
OR(
Color.Foreground IN ('Green', 'Blue', 'Purple')
)
I would recommend that you reconsider your model. For example, having PersonID in color is very suspect as is having numberofhomes (that could be possibly calculated for example, from a count on the House table that has the person's id). There are some other questionable normalization attributes as well. Not part of your question but I thought you might want to consider it.
Upvotes: -1
Reputation: 656
Please try this:
SELECT p.*
FROM Person p
WHERE Substring(Ltrim(Rtrim(p.ZipCode)),1,3) = '123' AND p.City = 'Washington'AND
(p.NumberOfHomes=1 or p.NumberOfHomes=2 or p.NumberOfHomes=3))
AND
(
EXISTS
(
SELECT 1 FROM House h
WHERE p.ID = h.PersonID
AND h.Type = 'TOWNHOUSE'
AND h.Size = 'Medium'
)
OR
EXISTS
(
SELECT 1 FROM Color c
WHERE p.ID = c.PersonID
AND (c.Foreground ='Green' or c.Foreground='Blue' or c.Foreground='Purple')
)
);
Also this will work better:
SELECT
p.*
FROM Person p
Left join House h
On (p.Id=h.PersonID)
Left join Color c
On (p.id=c.PersonID)
WHERE Substring(Ltrim(Rtrim(p.ZipCode)),1,3) = '123' AND p.City = 'Washington'AND
(p.NumberOfHomes=1 or p.NumberOfHomes=2 or p.NumberOfHomes=3)) and Isnull(h.Type,'') = 'TOWNHOUSE' AND Isnull(h.Size,'') = 'Medium' AND
(Isnull(c.Foreground,'') ='Green' or Isnull(c.Foreground,'')='Blue' or Isnull(c.Foreground,'')='Purple') and
(h.PersonID is not null or c.PersonID is not null);
Upvotes: 0
Reputation: 1269633
This is the query:
SELECT p.*
FROM Person p
WHERE p.ZipCode LIKE '123%' AND p.City = 'Washington' AND p.NumberOfHomes in (1, 2, 3) AND
(EXISTS (SELECT *
FROM House h
WHERE p.ID = h.PersonID AND h.Type = 'TOWNHOUSE' AND h.Size = 'Medium'
) OR
EXISTS (SELECT *
FROM Color c
WHERE p.ID = c.PersonID AND c.Foreground IN ('Green', 'Blue', 'Purple')
)
);
Without rewriting the query, you can optimize this with indexes. I would recommend:
Person(City, ZipCode, NumberOfHomes, Id);
House(PersonId, Type, Size);
Color(PersonID, Foreground)
Question, though. Are you sure that the ids in the
Houseand
Colortables really match back to
Person.Id? Normally, they would have a column called something like
PersonId`.
Upvotes: 4