Reputation: 25
I am using MaxMind free databases to do IP lookups. I convert the data to the following table:
CREATE TABLE [dbo].[GeoBlocks](
[StartIPNum] [varchar](50) NULL,
[EndIPNumb] [varchar](50) NULL,
[LocationNum] [varchar](50) NULL,
[PostalCode] [varchar](50) NULL,
[Latitude] [varchar](50) NULL,
[Longitude] [varchar](50) NULL)
There are about 3.5M records in this lookup table.
My goal is to determine the LocationNum for an IP(decimal form) by finding the record where the IP is between StartIPNum and EndIPNum
My stored procedure looks like this: Parameter: @DecimalIP bigint
select GeoBlocks.StartIPNum ,@DecimalIP as DecimalIp
,GeoBlocks.Postalcode ,GeoBlocks.Latitude as Latitude
,GeoBlocks.Longitude as Longitude
from GeoBlocks
where @DecimalIP between GeoBlocks.StartIPNum and GeoBlocks.EndIPNumb
I have created unique indexes on StartIPNum and EndIPNum.
However, when I run this, SQL server does a table scan for the Where portion of the query. This query takes 650-750ms. (Most queries on my server take 0-2ms)
How do I speed up this query?
Added Sample Data:
StartIPNum EndIPNumb LocationNum PostalCode Latitude Longitude
1350218632 1350218639 2782113 48.2000 16.3667
1350218640 1350218655 2782113 48.2000 16.3667
1350218656 1350218687 2782113 48.2000 16.3667
1350218688 1350218751 2782113 48.2000 16.3667
1350218752 1350218783 2782113 48.2000 16.3667
Upvotes: 0
Views: 75
Reputation: 15816
Update:
To summarize information scattered among various comments:
The IP address columns are VarChar(50)
strings containing decimal values without left padding. An index on those columns will sort them alphabetically, not numerically, i.e. "10" < "2". (With left padding the sort will be correct numerically as well: "10" > "02".)
The WHERE
clause (where @DecimalIP between GeoBlocks.StartIPNum and GeoBlocks.EndIPNumb
) uses mixed datatypes. @DecimalIP
is a BIGINT
while the two columns are VarChar(50)
. SQL handles operations among mixed datatypes by implementing a data type precedence scheme. (Ref.) This causes the IP addresses in each row to be converted from strings to BIGINT
values, hence the comparison is done numerically and the "expected" results are returned at a considerable cost. The indexes are (all but) useless in this case.
Changing the columns to BIGINT
will allow the use of an index to improve performance and ensure that comparisons are done numerically rather than alphabetically. An single index containing both the StartIPNum
and EndIPNumb
columns will greatly improve performance. Note that if overlapping address ranges are not allowed then the index will effectively be unique on StartIPNum
and could be replaced with an index on StartIPNum
with EndIPNumb
as an included column for performance.
Original Answer:
If you are using IPV4 addresses in dotted notation, e.g. "192.168.0.42", you can convert the strings into BIGINT
values with this UDF:
create function [dbo].[IntegerIPV4Address]( @IPV4Address VarChar(16) )
returns BigInt
with SchemaBinding
begin
declare @Dot1 as Int = CharIndex( '.', @IPV4Address );
declare @Dot2 as Int = CharIndex( '.', @IPV4Address, @Dot1 + 1 );
declare @Dot3 as Int = CharIndex( '.', @IPV4Address, @Dot2 + 1 );
return Cast( Substring( @IPV4Address, 0, @Dot1 ) as BigInt ) * 0x1000000 +
Cast( Substring( @IPV4Address, @Dot1 + 1, @Dot2 - @Dot1 - 1 ) as BigInt ) * 0x10000 +
Cast( Substring( @IPV4Address, @Dot2 + 1, @Dot3 - @Dot2 - 1 ) as BigInt ) * 0x100 +
Cast( Substring( @IPV4Address, @Dot3 + 1, Len( @IPV4Address ) * 1 ) as BigInt );
end
You can either store the integer values or create an index on a computed column based on the functions result. Note that you need to change your query to reference the integer column in the WHERE
clause.
If you store the values as integers the following function will convert them back to normalized strings where each part of the address is three digits. These values can be used in comparisons since they will sort the same way both alphabetically and numerically.
create function [dbo].[NormalizedIPV4Address]( @IntegerIPV4Address as BigInt )
returns VarChar(16)
with SchemaBinding -- Deterministic function.
begin
declare @BinaryAddress as VarBinary(4) = Cast( @IntegerIPV4Address as VarBinary(4) );
return Right( '00' + Cast( Cast( Substring( @BinaryAddress, 1, 1 ) as Int ) as VarChar(3) ), 3 ) +
'.' + Right( '00' + Cast( Cast( Substring( @BinaryAddress, 2, 1 ) as Int ) as VarChar(3) ), 3 ) +
'.' + Right( '00' + Cast( Cast( Substring( @BinaryAddress, 3, 1 ) as Int ) as VarChar(3) ), 3 ) +
'.' + Right( '00' + Cast( Cast( Substring( @BinaryAddress, 4, 1 ) as Int ) as VarChar(3) ), 3 )
end
You could round-trip the string values in your table to get them all into "normalized" form so that they sort correctly by using both functions. Not an ideal solution since it requires that all future inserts and updates be normalized, but it may help for the moment.
Upvotes: 1
Reputation: 4196
I guess that the index is not properly set. You can improve it like this:
Now the query will execute, and the execution plan will be displayed. And if the index is bad, it will show you a hint which index is missing and what you can do.
It will display a precise SQL statement which you need to create the index. Just copy + paste that statement and execute it, and then your index should work.
Upvotes: 0