Reputation: 301
I have a table of ZipCodes, that contains US and Canadian postal codes and related information - including Latitude and Longitude.
I have records being inserted into a table from a separate application that I cannot modify. These records lack latitude and longitude data, so I'd like to populate the records accordingly. (Before anyone asks, doing a JOIN with the ZipCodes table at runtime didn't seem to be an option because of performance, which is why I've ended up where i am.)
I have the query shown below, which is to be set to run as a scheduled task, that should update any records that have not yet had their geo data looked up. This query (100 at a time) takes about 1m 35s to run.
I am seeking any and all options to what i'm doing here, that may improve performance.
I resorted to a scheduled task because I fear having the entire server being brought to its knees with INSTEAD OF INSERT or AFTER INSERT triggers to do the same thing... but that would have been my preference. Given the current performance issues, triggers seem out of the question.
The ZipCodes table has a non-clustered index on CityName and ProvinceAbbr, with additional included columns PostalCode, Latitude and Longitude. I have Allow Row Locks and Allow Page Locks set to false. The data in this table will change MAYBE once a quarter, so dirty reads are fine.
I can provide execution plan results if necessary... but not sure how exactly so give me a hint as to how to generate copy/paste material. =D
The ZipCodes table has 947,172 records, and the LoadsAvail table has around 38k records at any given time... records are being inserted, updated, and deleted in real time/constantly, sometimes in larger batches (I'd say maximum of 20 inserted at a time, usually more like one or two at a time).
Once most of the records have their geo data in place, there will probably be less than a 100 records being updated every minute, but I'd like to know if there's just a better way to do all this.
UPDATE TOP (100)
LoadsAvail
SET
coordinatesChecked = 1,
FromLatitude = ( SELECT TOP (1) Latitude
FROM ZipCodes AS ZipCodes_1 WITH(NOLOCK)
WHERE (CityName = loadsavail.FromCity) AND (ProvinceAbbr = loadsavail.FromState)
),
FromLongitude = ( SELECT TOP (1) Longitude
FROM ZipCodes AS ZipCodes_2 WITH(NOLOCK)
WHERE (CityName = loadsavail.FromCity) AND (ProvinceAbbr = loadsavail.FromState)
),
ToLatitude = ( SELECT TOP (1) Latitude
FROM ZipCodes AS ZipCodes_3 WITH(NOLOCK)
WHERE (CityName = loadsavail.ToCity) AND (ProvinceAbbr = loadsavail.toState)
),
ToLongitude = ( SELECT TOP (1) Longitude
FROM ZipCodes AS ZipCodes_4 WITH(NOLOCK)
WHERE (CityName = loadsavail.ToCity) AND (ProvinceAbbr = loadsavail.toState)
)
WHERE
coordinatesChecked = 0
Upvotes: 1
Views: 530
Reputation: 37313
You can use Joins instead of subqueries:
UPDATE T1
SET T1.coordinatesChecked = 1,
T1.FromLatitude = T2.Latitude,
T1.FromLongitude = T2.Longitude
T1.ToLatitude = T3.Latitude,
T1.ToLongitude = T3.Longitude
FROM LoadsAvail AS T1 LEFT JOIN ZipCodes AS T2 ON T1.FromCity = T2.CityName AND T1.FromState = T2.ProvinceAbbr
LEFT JOIN ZipCodes AS T3 ON T1.toCity = T3.CityName AND T1.toState = T3.ProvinceAbbr
WHERE T1.coordinatesChecked = 0
Upvotes: 3