Reputation: 165
I have a table that shows IP addresses assigned to equipment belonging to a customer. This table is used to measure user's activity and performance and generated on a regular basis. I would like to map the IP addresses of these customers to customer ID or name. Due to size and multiple IP addresses that can be assigned to the same customer we have to use subnet instead of individual IP addresses.
The report table contains an IPaddress field and the subnet mapping table is something like below.
+-------------+------+-------------------+
| Customer_ID | VLAN | Subnet |
+-------------+------+-------------------+
| HARRIS | 1012 | 10.111.253.46/32 |
| HARRIS | 1012 | 10.164.20.49/30 |
| HARRIS | 1012 | 10.223.165.193/28 |
| HARRIS | 1013 | 10.155.75.128/26 |
| HARRIS | 1014 | 10.155.75.128/26 |
| HARRIS | 1330 | 10.121.30.192/27 |
| HARRIS | 1331 | 10.120.30.192/27 |
| HARRIS | 1332 | 10.122.30.192/27 |
| HARRIS | 3910 | 100.104.12.144/32 |
| HARRIS | 3927 | 10.70.24.233/32 |
| HARRIS | 3959 | 10.102.11.182/32 |
| HARRIS | 3966 | 10.98.11.170/32 |
| STEPHANIE | 1010 | 100.72.0.33/32 |
| STEPHANIE | 3896 | 10.96.11.169/29 |
| JOE | 1010 | 100.69.72.10/32 |
| NED | 1010 | 100.72.255.2/32 |
| ESTHER | 1010 | 100.72.255.66/32 |
| BEN | 1010 | 100.72.255.30/32 |
| SHAWN | 1010 | 100.72.254.230/32 |
| JACK | 1010 | 1.1.1.7/32 |
| TONY | 1010 | 100.72.255.242/32 |
| 335553339 | 1010 | 100.72.254.250/32 |
| 335553342 | 1010 | 100.72.254.186/32 |
| 335553343 | 1010 | 100.72.254.238/32 |
| 335553346 | 1010 | 100.72.255.182/32 |
| 335553347 | 1010 | 1.1.1.2/32 |
| 335553348 | 1010 | 100.72.255.82/32 |
| 335553349 | 1010 | 100.72.254.30/32 |
| 335553351 | 1010 | 1.1.1.1/32 |
| 335553352 | 1010 | 100.80.255.174/32 |
| 335553411 | 1010 | 100.72.255.18/32 |
| 335553412 | 1010 | 100.72.255.22/32 |
| 335553413 | 1010 | 100.72.255.253/32 |
| 335553414 | 1010 | 100.72.255.222/32 |
| 335553415 | 1010 | 100.72.255.202/32 |
| 335553416 | 1010 | 100.72.255.210/32 |
| 335553417 | 1010 | 100.72.255.10/32 |
| 335553418 | 1010 | 100.72.255.218/32 |
| 335553419 | 1010 | 100.72.255.206/32 |
| 335553420 | 1010 | 100.72.255.14/32 |
| 335553421 | 1010 | 100.72.255.114/32 |
| 335553431 | 1010 | 100.72.255.166/32 |
| 335553632 | 1010 | 100.72.4.176/32 |
| 335553632 | 3898 | 10.98.0.49/28 |
| RANDY | 100 | 100.72.5.6/32 |
| RANDY | 1012 | 10.111.230.41/32 |
| RANDY | 1012 | 10.164.31.185/30 |
| RANDY | 1012 | 10.228.5.1/28 |
| RANDY | 1013 | 10.154.10.0/26 |
| RANDY | 1014 | 10.154.10.0/26 |
| HASAN | 1015 | 100.72.8.94/32 |
| HASAN | 1015 | 100.73.13.56/32 |
| HASAN | 3910 | 100.104.3.66/32 |
| HASAN | 3927 | 10.70.7.157/32 |
| HASAN | 3959 | 10.102.1.228/32 |
| HASAN | 3966 | 10.98.2.68/32 |
| DICKY | 100 | 100.72.10.203/32 |
| DICKY | 3897 | 1.1.1.1/29 |
| RINA | 100 | 100.73.2.12/32 |
| RINA | 1430 | 10.64.9.0/27 |
| RINA | 3910 | 100.104.2.12/32 |
| RINA | 3927 | 10.70.2.155/32 |
| RINA | 3959 | 10.102.1.15/32 |
| RINA | 3959 | 10.104.8.24/29 |
| RINA | 3966 | 10.98.1.13/32 |
+-------------+------+-------------------+
How do I map the addresses in my dynamic table (check if it belongs to which subnet) in an SQL query and then add a column (i.e. left join) showing the associated Customer name?
Thank you for the help
Upvotes: 4
Views: 1669
Reputation: 59175
A query that looks for IP addresses and subnets, and published in:
# replace with your source of IP addresses
WITH source_of_ip_addresses AS (
SELECT REGEXP_REPLACE(contributor_ip, 'xxx', '0') ip, COUNT(*) c
FROM `publicdata.samples.wikipedia`
WHERE contributor_ip IS NOT null
GROUP BY 1
)
SELECT country_name, SUM(c) c
FROM (
SELECT ip, country_name, c
FROM (
SELECT *, NET.SAFE_IP_FROM_STRING(ip) & NET.IP_NET_MASK(4, mask) network_bin
FROM source_of_ip_addresses, UNNEST(GENERATE_ARRAY(9,32)) mask
WHERE BYTE_LENGTH(NET.SAFE_IP_FROM_STRING(ip)) = 4
)
JOIN `fh-bigquery.geocode.201806_geolite2_city_ipv4_locs`
USING (network_bin, mask)
)
GROUP BY 1
ORDER BY 2 DESC
The math here is pretty fast, and doesn't require JS UDFs. Just replace country_name
with customer_name
.
Upvotes: 0
Reputation: 172994
Below example for BigQuery Standard SQL
#standardSQL
CREATE TEMP FUNCTION SubnetToRange(CIDR STRING)
RETURNS STRUCT<start_IP STRING, end_IP STRING>
LANGUAGE js AS """
var beg = CIDR.substr(CIDR,CIDR.indexOf('/'));
var end = beg;
var off = (1<<(32-parseInt(CIDR.substr(CIDR.indexOf('/')+1))))-1;
var sub = beg.split('.').map(function(a){return parseInt(a)});
var buf = new ArrayBuffer(4);
var i32 = new Uint32Array(buf);
i32[0] = (sub[0]<<24) + (sub[1]<<16) + (sub[2]<<8) + (sub[3]) + off;
var end = Array.apply([],new Uint8Array(buf)).reverse().join('.');
return {start_IP: beg, end_IP: end};
""";
SELECT e.*, t.* EXCEPT(start_IP, end_IP)
FROM `project.dataset.equipments` e,
(
SELECT s.*,
NET.IPV4_TO_INT64(NET.IP_FROM_STRING(start_IP)) AS start_IP,
NET.IPV4_TO_INT64(NET.IP_FROM_STRING(end_IP)) AS end_IP
FROM `project.dataset.subnets` s,
UNNEST([SubnetToRange(Subnet)])
) t
WHERE NET.IPV4_TO_INT64(NET.IP_FROM_STRING(IP)) BETWEEN start_IP AND end_IP
You can test, play with above using sample/dummy data as in below example
#standardSQL
CREATE TEMP FUNCTION SubnetToRange(CIDR STRING)
RETURNS STRUCT<start_IP STRING, end_IP STRING>
LANGUAGE js AS """
var beg = CIDR.substr(CIDR,CIDR.indexOf('/'));
var end = beg;
var off = (1<<(32-parseInt(CIDR.substr(CIDR.indexOf('/')+1))))-1;
var sub = beg.split('.').map(function(a){return parseInt(a)});
var buf = new ArrayBuffer(4);
var i32 = new Uint32Array(buf);
i32[0] = (sub[0]<<24) + (sub[1]<<16) + (sub[2]<<8) + (sub[3]) + off;
var end = Array.apply([],new Uint8Array(buf)).reverse().join('.');
return {start_IP: beg, end_IP: end};
""";
WITH `project.dataset.subnets` AS (
SELECT 'HARRIS' Customer_ID, 1012 VLAN, '10.111.253.46/32' Subnet UNION ALL
SELECT 'HARRIS', 1012, '10.164.20.49/30' UNION ALL
SELECT 'HARRIS', 1012, '10.223.165.193/28' UNION ALL
SELECT '335553632', 3898, '10.98.0.49/28' UNION ALL
SELECT 'RINA', 1430, '10.64.9.0/27' UNION ALL
SELECT 'RINA', 3966, '10.98.1.13/32'
), `project.dataset.equipments` AS (
SELECT 'equipment A' equipment, '10.164.20.50' IP UNION ALL
SELECT 'equipment B', '10.64.9.0'
)
SELECT e.*, t.* EXCEPT(start_IP, end_IP)
FROM `project.dataset.equipments` e,
(
SELECT s.*,
NET.IPV4_TO_INT64(NET.IP_FROM_STRING(start_IP)) AS start_IP,
NET.IPV4_TO_INT64(NET.IP_FROM_STRING(end_IP)) AS end_IP
FROM `project.dataset.subnets` s,
UNNEST([SubnetToRange(Subnet)])
) t
WHERE NET.IPV4_TO_INT64(NET.IP_FROM_STRING(IP)) BETWEEN start_IP AND end_IP
with output
Row equipment IP Customer_ID VLAN Subnet
1 equipment A 10.164.20.50 HARRIS 1012 10.164.20.49/30
2 equipment B 10.64.9.0 RINA 1430 10.64.9.0/27
Upvotes: 2