Randy Minder
Randy Minder

Reputation: 48432

How to search efficiently for IP addresses ranges?

I have a table that contains ranges of IP addresses (as integer values) and corresponding country, region and city assigned to that range. It looks as follows:

CREATE TABLE [dbo].[IpToRegion]
(
    [BeginRange] [bigint] NOT NULL,
    [EndRange] [bigint] NOT NULL,
    [CountryCode] [varchar](10) NOT NULL,
    [Country] [varchar](50) NOT NULL,
    [Region] [varchar](100) NOT NULL,
    [City] [varchar](100) NOT NULL
) ON [PRIMARY]

CREATE UNIQUE CLUSTERED INDEX [ClusteredIndex-20151031-193911] ON [dbo].[IpToRegion]
(
    [BeginRange] ASC,
    [EndRange] ASC
)
GO

There are 9.1M rows in this table. In order to find the location of a single IP address, I first convert it to a big int and then execute the following query:

DECLARE @IPNumber BIGINT
DECLARE @IPAddress varchar(20)

Set @IPNumber = (CONVERT(bigint, PARSENAME(@IPAddress,1)) + CONVERT(bigint, PARSENAME(@IPAddress,2)) * 256 + CONVERT(bigint, PARSENAME(@IPAddress,3)) * 65536 + CONVERT(bigint, PARSENAME(@IPAddress,4)) * 16777216)

Select City + ', ' + Region + ', ' + Country
From IpToRegion 
Where @IPNumber Between BeginRange And EndRange

The problem is this query can take anywhere from 5 to 20 seconds to execute. Here is the query plan:

enter image description here

Of course, my question is how can this query take so long to execute? It's doing a seek on the clustered index and returns a single row. I could try a few different indexing strategies. But, at this point, I'm more curious as to why this query can perform so poorly.

Upvotes: 3

Views: 628

Answers (2)

Randy Minder
Randy Minder

Reputation: 48432

It turns out that the clustered index I had on BeginRange + EndRange was not efficient as Vladimir Baranov stated in his answer. What I did was to create a PK / clustered index on BeginRange and a separate index on EndRange. Now the query performs instantly.

Upvotes: 0

Vladimir Baranov
Vladimir Baranov

Reputation: 32695

This kind of search can't be done efficiently with the index that you have.

If you look at details of the Index Seek operator in the plan you'd see two predicates.

@IPNumber >= BeginRange 
@IPNumber <= EndRange

Index helps to quickly O(log(n)) find the start (or end) of the range, but then it has to check the second predicate for the rest of the rows in the table.

Check out the actual number of rows read in the plan. It will be large.

If I'm not mistaken, there was somewhat similar (more complicated) question before. Even though, it was asked for Postgres, the approach would work in SQL Server as well. In that question this kind of search was done not once, but 600K times.


The answer to the question "How to make this search efficient" depends on few things. First: can you guarantee that IP ranges in your table do not overlap? In other words, can you guarantee that any search will return 0 or 1 rows?

If yes, adding a simple TOP(1) to the query could be enough.

Upvotes: 5

Related Questions