MrValdez
MrValdez

Reputation: 8613

How would you compare IP address?

For my server app, I need to check if an ip address is in our blacklist.

What is the most efficient way of comparing ip addresses? Would converting the IP address to integer and comparing them efficient?

Upvotes: 13

Views: 21753

Answers (12)

gsaandy
gsaandy

Reputation: 619

The following one is I've used in JavaScript

function isValidIPv4Range(iPv4Range = '') {
  if (IP_V4_RANGE_REGEX.test(iPv4Range)) {
    const [fromIp, toIp] = iPv4Range.split('-');

    if (!isValidOctets(fromIp) || !isValidOctets(toIp)) {
      return false;
    }

    const convertToNumericWeight = ip => {
      const [octet1, octet2, octet3, octet4] = ip.split('.').map(parseInt);

      return octet4 + (octet3 * 256) + (octet2 * 256 * 256) + (octet1 * 256 * 256 * 256);
    };

    return convertToNumericWeight(fromIp) < convertToNumericWeight(toIp);
  }
  return false;
}

Upvotes: 1

shoosh
shoosh

Reputation: 78914

You mean if you should compare it as a text string or convert int to int and compare as an int?

That's not usually the bottleneck in this sort of lookups. you can just try to implement both methods and see which one runs faster.

The real issue with IP address lookup is usually making efficient queries, taking advantage of the fact that you are dealing with IP addresses and not just random numbers. to accomplish this you can lookup LC trie.

Obviously this should interest you only if your blacklist holds tens of thousands or millions of entries. If it has only 10-20 entries a linear search should be preferred and indeed the more interesting question is textual comparison vs integer comparison.

Upvotes: 7

mparaz
mparaz

Reputation: 2069

The Radix or PATRICIA Trie is the optimal structure for this.

Check out the C source for flow-tools: http://www.splintered.net/sw/flow-tools/

I worked on this years ago.

Upvotes: 3

Robert Paulson
Robert Paulson

Reputation: 18061

I once inherited code where somebody thought that storing IP addresses as 4 int's was a really good thing, except they spent all their time converting to/from int's.

Keeping them as strings in the database was far easier, and it only required a single index. You'd be surprised how well sql server can index strings as opposed to 4 columns of integers. But this IP list wasn't for blacklisting. A database round-trip is pretty costly.

If a database is overkill, store them in a dictionary in memory, but that's just a guess since we've no idea how many you need to compare. Since most hashcodes are 32-bit int's, and IPv4 addresses are 32 bits, the IP address itself might just be a good hashcode.

But as others point out, the best option might be to reduce the load on your server and buy specialized hardware. Maybe you keep recently blacklisted IP's in memory and periodically publish new one's to the router.

If you're the one trying to make some software inside a router, then you'll need to fish out your data-structures book and create something like a b-tree.

Upvotes: 3

Guy
Guy

Reputation: 67260

I've done this and I've tested it, using an unsigned int (32 bit) is the fastest - I'm assuming that you're comparing this to the string representation.

Another thing that might help you is when creating the table, in the past I've had 2 colums: LowIP and HighIP; that way I have been able to blacklist entire ranges of IP's with 1 record entry and still get good performance by checking for the IP in the range.

Upvotes: 3

Adam Pierce
Adam Pierce

Reputation: 34345

Integer comparisons are much faster than string comparisons.

If you store the integers in a sorted list, you can find them faster than in an unsorted list.

Upvotes: 2

sk.
sk.

Reputation: 6396

Depends what language you're using, but an IP address is usually stored as a 32-bit unsigned integer, at least at the network layer, making comparisons quite fast. Even if it's not, unless you're designing a high performance packet switching application it's not likely to be a performance bottleneck. Avoid premature optimization - design your program for testability and scalability and if you have performance problems then you can use a profiler to see where the bottlenecks are.

Edit: to clarify, IPv4 addresses are stored as 32-bit integers, plus a netmask (which is not necessary for IP address comparisons). If you're using the newer and currently more rare IPv6, then the addresses will be 128 bits long.

Upvotes: 30

Mark Glorie
Mark Glorie

Reputation: 3473

Do you have an existing problem with efficiency?

If so then by all means post the code (or pseudo-code) and we can pick at the corpse.

If not then I would suggest trying something simple like storing the entries in a sorted list and using your environment's existing Sort() and Find().

Upvotes: 2

Steven A. Lowe
Steven A. Lowe

Reputation: 61233

if you receive the IP address as a string, comparing it to a string may be more efficient than converting it to integer representation

but i'd profile both solutions to be certain, if a few milliseconds (nanoseconds!) are going to matter on this operation ;-)

Upvotes: 1

yfeldblum
yfeldblum

Reputation: 65435

Use a tool like PeerGuardian which disallows incoming TCP/IP connections at the driver level to IPs on a blacklist. Highly secure, no code required (arguably: highly secure, because no code required).

Upvotes: 3

Graeme Perrow
Graeme Perrow

Reputation: 57248

32-bit integers are the way to go -- until you start dealing with 128-bit IPv6 addresses.

Upvotes: 7

eulerfx
eulerfx

Reputation: 37719

Yes I have found that to be efficient, it will be a long though, and of course you have to index blacklisted IPs in integer form.

Upvotes: 4

Related Questions