Reputation: 1627
I have two linked SQL Servers (2008R2). The remote server contains a table with a huge number of rows. It has to be joined with a local table. In such a situation a good solution is using a REMOTE
hint for JOIN
.
But there is a restriction for using REMOTE JOIN
:
REMOTE can be used only for INNER JOIN operations.
I want to make a left join operation, but it will be done on the local side and whole HugeRemoteTable will be loaded on the local server:
SELECT HugeRemoteTable.*
FROM HugeRemoteTable
LEFT JOIN LocalTable
ON HugeRemoteTable.col = LocalTable.id
WHERE LocalTable.id IS NULL
Is it possible to rewrite query to join tables on the remote side and load only filtered data?
EDIT1:
I experimented wtih queries and it looks like SQL Server runs query remotely if condition is LocalTable.id IS NULL
. The query takes a few time.
But a query with condition LocalTable.id IS NOT NULL
takes much more time and obviously runs locally.
The same behavior demonstrate queries with JOIN
rewrited as NOT IN
or IN
(they even have execution plans similar to plans for queries with JOIN).
This query runs very quickly:
SELECT *
FROM HugeRemoteTable
WHERE col NOT IN (SELECT id FROM LocalTable)
And this one takes a lot of time and generates a lot of requests to remote server (I don't know exactly why, but it looks like local server makes a request per each row from LocalTable):
SELECT *
FROM HugeRemoteTable
WHERE col IN (SELECT id FROM LocalTable)
Now we are going to OPENQUERY
. I've connected my local server as linked server for remote SQL Server (in opposite direction) using alias 'localServer' and started with OPENQUERY
in such a way
SELECT * FROM OPENQUERY(remote,'
SELECT * FROM HugeRemoteTable
LEFT JOIN localServer.localTable ON HugeRemoteTable.col=localTable.id
WHERE localTable.id is not null')
it was as fast as query with condition LocalTable.id IS NULL
SELECT * FROM OPENQUERY(remote,'
SELECT * FROM HugeRemoteTable
LEFT JOIN localServer.localTable ON HugeRemoteTable.col=localTable.id
WHERE localTable.id is null')
Both queries with OPENQUERY were very fast, but they need two-way linking between servers.
Are there any other solutions?
Upvotes: 2
Views: 2982
Reputation: 13
If you are looking for speed and using linked servers, give this a try.
Step 1). Create a temp table with the data you need. Step 2). Using Openquery is the way you want to go if you want speed. Create a temp table with the data you want from the other server. Using openquery and dynamic sql grab the data you want filtered. Step3). Use the temp table to join this.
DECLARE @FlatTSQL varchar(8000)
DECLARE @AsOfDate date = '9/24/2015'
CREATE TABLE #MyData
(AsOfDate date)
SELECT @FlatTSQL = 'SELECT * FROM OPENQUERY(EUC,''
SELECT
cal.AsOfDate
FROM RiskReporting.Reference.lkpCalendar cal WITH(NOLOCK)
WHERE CONVERT(date,cal.AsOfDate) = ''''' + CONVERT(varchar(25),@AsOfDate) + ''''''')'
INSERT INTO #MyData
EXEC (@FlatTSQL)
SELECT
*
FROM #MyData md
DROP TABLE #MyData
Let me know if it works, Alex
Upvotes: 1
Reputation: 4094
OPENQUERY can be used to move the heavy execution to the appropriate server and the results can be brought back.
Your left join wouldn't filter rows by itself so you must specify a where clause or other join type in your remote query (EDIT: unless you specify your small table on the left side to match rows in large table when they exist).
Upvotes: 0