Reputation: 1142
Although the below query works perfectly fine for my uses, I am trying to understand if there is a way to optimize, since it uses the same nested subquery in both statements of the UNION. My instincts say there should be a way to look up the subquery once and use that result for both parts of the UNION, but I am running into syntactic issues when trying to use join across the UNIONs.
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE clientLocalID =
ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789')
UNION
SELECT ClientMasterName, '0', clientMasterID, 2
FROM dbo.ClientMaster
WHERE clientMasterID =
ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789')
ORDER BY HierarchyType ASC;
To help explain why the original query is necessary, imagine a concept of a "local" client and a "master" client -- the local client is tied to a master client, but the master client may or may not have any local clients. Likewise, the phone number associated with each should only be tied to either the local client or the master client (hence why in the second statement, a '0' is hardcoded for the local client ID -- because there is no association).
This would be running against Microsoft SQL Server 2008 R2 by the way.
Upvotes: 2
Views: 687
Reputation: 6002
Since nobody seems to mention it: having less code does not necessarily mean that it will be faster in execution too. Too often people seem to think that CTE's are executed first (or "in order" in case there are multiple of them) and then the final query is applied; they are NOT. In fact, the server will pretty much handle the cte as if it was a derived table somewhere in the FROM or JOIN clause and as such pick exactly the same query plan to execute them. In order avoid the system to have to scan the ClientPhone
table twice (LIKE '%0123...'
is a very 'annoying' operation) you'd have to do it once upfront, store the result in a temp-table and then use said temp-table in your actual query.
To give some context to this, I took the liberty to create some sample-data (see code below). The results of my test were:
So even though the server needs to create the temp-table and fill it, combined the latter still only takes 49ms instead of the original 71ms!
Off course your mileage may vary depending on the number of records involved and the complexity of the repeated query, like I said, LIKE '%blah'
is an annoying thing to do because it requires a full scan of the table (or with a bit of luck of a (covering) index). If it had been WHERE pk_field = @value
the effect might have been very different)
Happy querying...
IF DB_ID('test') IS NULL CREATE DATABASE test
GO
USE test
GO
-- setup
IF OBJECT_ID('ClientPhone') IS NOT NULL DROP TABLE ClientPhone
IF OBJECT_ID('ClientLocal') IS NOT NULL DROP TABLE ClientLocal
IF OBJECT_ID('ClientMaster') IS NOT NULL DROP TABLE ClientMaster
GO
SELECT TOP 50000
HierarchyItem = IDENTITY(int, 1, 1),
PhoneNumber = Convert(varchar(100), NewID())
INTO dbo.ClientPhone
FROM sys.objects x1, sys.columns x2, sys.objects x3, sys.columns x4
SELECT ClientLocalName = 'client dummy',
ClientLocalID = Convert(int, Rand(HierarchyItem * 37) * 50000),
ClientMasterID = Convert(int, Rand(HierarchyItem * 47) * 50000),
HierarchyType = 1
INTO dbo.ClientLocal
FROM dbo.ClientPhone
SELECT ClientMasterName = 'master dummy',
ClientLocalID = Convert(int, Rand(HierarchyItem * 51) * 50000),
ClientMasterID = Convert(int, Rand(HierarchyItem * 53) * 50000),
HierarchyType = 2
INTO dbo.ClientMaster
FROM dbo.ClientPhone
GO
-- original
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE ClientLocalID = ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%01')
UNION
SELECT ClientMasterName, '0', ClientMasterID, 2
FROM dbo.ClientMaster
WHERE ClientMasterID = ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%01')
ORDER BY HierarchyType ASC;
-- CTE
;WITH cte
AS (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%01')
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE ClientLocalID = ANY (SELECT HierarchyItem FROM cte)
UNION
SELECT ClientMasterName, '0', ClientMasterID, 2
FROM dbo.ClientMaster
WHERE ClientMasterID = ANY (SELECT HierarchyItem FROM cte)
ORDER BY HierarchyType ASC;
-- temp table
IF OBJECT_ID('tempdb..#ClientPhone') IS NOT NULL DROP TABLE #ClientPhone
SELECT HierarchyItem INTO #ClientPhone FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%01'
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE ClientLocalID = ANY (SELECT HierarchyItem FROM #ClientPhone)
UNION
SELECT ClientMasterName, '0', ClientMasterID, 2
FROM dbo.ClientMaster
WHERE ClientMasterID = ANY (SELECT HierarchyItem FROM #ClientPhone)
ORDER BY HierarchyType ASC;
Upvotes: 1
Reputation: 176024
If you don't like repeating code you could use common table expression:
WITH cte AS (
SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789'
)
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE clientLocalID IN (SELECT HierarchyItem FROM cte)
UNION
SELECT ClientMasterName, '0', clientMasterID, 2
FROM dbo.ClientMaster
WHERE clientMasterID IN (SELECT HierarchyItem FROM cte)
ORDER BY HierarchyType ASC;
Upvotes: 3
Reputation: 1270713
The only obvious optimization I can see is to use union all
instead of union
:
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE clientLocalID = ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789')
UNION ALL
SELECT ClientMasterName, '0', clientMasterID, 2
FROM dbo.ClientMaster
WHERE clientMasterID = ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789')
ORDER BY HierarchyType ASC;
You clearly have no duplicates between the sub-selects (because of the last column). This assumes you have no duplicates within each sub-select either.
I don't know about SQL Server's treatment of = ANY
, but I assume it is basically the same as IN
. The = ANY
subquery would be run once (for each subquery), but the second time the results would be cached. You could simplify this to:
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM ((SELECT ClientLocalName, ClientLocalID, ClientMasterID
FROM dbo.ClientLocal
) UNION ALL
(SELECT ClientLocalName, '0', ClientMasterID
FROM dbo.ClientMasterName
)
) c
WHERE c.clientLocalID = ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789')
ORDER BY HierarchyType ASC;
I don't know if this would be much of an improvement.
The LIKE
pattern has a wildcard at the beginning of the number, making that hard to optimize.
Upvotes: 0
Reputation: 2124
;with subQry as (
SELECT ClientLocalName, ClientLocalID, ClientMasterID
FROM dbo.ClientLocal
WHERE clientLocalID =
ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789')
)
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM subQry
UNION
SELECT ClientMasterName, '0', clientMasterID, 2
FROM subQry
ORDER BY HierarchyType ASC;
Upvotes: 0