Oliver
Oliver

Reputation: 77

IP range in SQL

I have hundreds of thousands of IP's and want to identify which ones are within a certain range. Ranges:

64.233.160.0 /  8192
66.102.0.0 / 4096
66.249.64.0 / 8192
72.14.192.0 / 16384
74.125.0.0  / 65536
209.85.128.0 / 32768
216.239.32.0 / 8192

So I converted these ranges to the following:

64.233.160.0    -   64.233.192.0
66.102.0.0      -   66.102.16.0
66.249.64.0     -   66.249.96.0
72.14.192.0     -   72.15.0.0
74.125.0.0      -   74.126.0.0
209.85.128.0    -   209.86.0.0
216.239.32.0    -   216.239.64.0

So now I want to query if an IP address is within any of these ranges. SQL isn't going to understand the octets so I don't know what to do.

Could use some Hex2Dec/Dec2Hex conversions?

I figure this should be something that has been done before, I'm sure I'm not the first person to try and identify particular ip's in a list using an ip range.

I will be doing look ups on multiple IP addresses so some might be 20.0.1.123 and another might be 124.123.123.1 ie the format of the octets won't be the same

Upvotes: 5

Views: 2438

Answers (5)

Jonathan Benjamin
Jonathan Benjamin

Reputation: 11

As part of our GDPR processing, we transformed the IP range dataset into 3 key-value dataset per octet (without the 4th), and it worked well on over 10MM records through SQL in multiple warehouses.

Basically, instead of doing:

FROM DATA
JOIN IP_RANGE ON DATA.IP_NUM BETWEEN IP_RANGE.IP_NUM_FROM AND IP_RANGE.IP_NUM_TO

we tested:

FROM DATA
LEFT JOIN IP_OCTET1 ON DATA.IP_OCTET1 = IP_OCTET1.IP_OCTET1
LEFT JOIN IP_OCTET2 ON DATA.IP_OCTET1 = IP_OCTET2.IP_OCTET1 AND DATA.IP_OCTET2 = IP_OCTET2.IP_OCTET2
LEFT JOIN IP_OCTET3 ON DATA.IP_OCTET1 = IP_OCTET3.IP_OCTET1 AND DATA.IP_OCTET2 = IP_OCTET3.IP_OCTET2 AND DATA.IP_OCTET3 = IP_OCTET3.IP_OCTET3
  • We found out that our clients mostly ask for countries or US states and we designed our datasets to those specific needs ONLY.

you can read more about it in our latest blog: https://cloudinary.com/blog/identifying_countries_by_ip_address_in_columnar_databases_through_sql

Disclaimer: I'm the author of the article.

Upvotes: 1

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172994

For BigQuery Standard SQL the way to Convert IP into integer is as below

#standardSQL  
SELECT NET.IPV4_TO_INT64(NET.IP_FROM_STRING('64.233.160.0'))   

Along with SQL UDF the usage can be simplified down to something like below

#standardSQL
CREATE TEMP FUNCTION ip2int(ip STRING) AS (
  NET.IPV4_TO_INT64(NET.IP_FROM_STRING(ip))
);
WITH Ranges AS (
  SELECT '64.233.160.0' AS IP1, '64.233.192.0' AS IP2 UNION ALL
  SELECT '66.102.0.0', '66.102.16.0' UNION ALL
  SELECT '66.249.64.0', '66.249.96.0' UNION ALL
  SELECT '72.14.192.0', '72.15.0.0' UNION ALL
  SELECT '74.125.0.0', '74.126.0.0' UNION ALL
  SELECT '209.85.128.0', '209.86.0.0' UNION ALL
  SELECT '216.239.32.0', '216.239.64.0' 
),
IPs AS (
  SELECT '64.233.160.2' AS IP UNION ALL
  SELECT '72.14.192.101'
)
SELECT *
FROM IPs AS i
JOIN Ranges AS r
ON ip2int(IP) BETWEEN ip2int(IP1) AND ip2int(IP2)  

with Output as

IP              IP1             IP2  
72.14.192.101   72.14.192.0     72.15.0.0    
64.233.160.2    64.233.160.0    64.233.192.0     

More info here about NET functions and SQL UDF

Upvotes: 1

Joey Harwood
Joey Harwood

Reputation: 981

Pentium10's answer is correct for the Legacy Bigquery syntax. For anyone who happens to be using StandardSQL Bigquery Syntax you are looking for this:

NET.IP_FROM_STRING('64.233.160.0') returns 1089052672

Upvotes: 0

Pentium10
Pentium10

Reputation: 207863

You can us the IP functions for this:

PARSE_IP('64.233.160.0') returns 1089052672

and then you can wrap them in a BETWEEN state.

Upvotes: 3

Emil Vikström
Emil Vikström

Reputation: 91912

IP numbers are really just integers. What you have done here is that you have saved them as human-readable strings. You will need to convert them back to their original integer representation so that you can state a query with normal BETWEEN.

Upvotes: 3

Related Questions