Reputation: 3
I have a list of approximately 200 CIDR IP blocks for my company. I am trying to verify whether the visitors to a webpage (several thousand hits a day) are coming from those blocks or not. Ideally, the output I'd like is a percent not in range and a list of those IPs so I could check further on them.
I've found the ip_in_range() function from the iptools package, but it is a one-to-one comparison. I think some sort of lapply or other loop should be able to accomplish what I want, but I am a novice and have not been able to come up with the right notation so far. I believe I would want to take IP #1 and test it against each CIDR block. Once it gets a TRUE, it could stop, but this isn't going to be repeated so often that I can't just cycle through all the blocks. Then the loop would move on to IP #2 and go again. Truthfully, my failure rate is expected to be low enough that even just an output of TRUE or FALSE beside each IP would be enough for me to pull out the failures manually.
I know there must already be a generic method for looping over a function that takes two inputs; I just couldn't think of the right way to phrase the search to find anything.
Example data:
visitor_ip_addresses <- c("10.10.1.2", "10.34.21.4", "192.168.23.34", "172.16.34.78", "1.2.3.4", "192.168.4.6")
ip_ranges <- c("10.0.0.0/8", "192.168.0.0/16", "172.16.0.0/12")
Upvotes: 0
Views: 248
Reputation: 78792
devtools::install_github("hrbrmstr/iptools")
library(iptools)
visitor_ip_addresses <- c("10.10.1.2", "10.34.21.4", "192.168.23.34",
"172.16.34.78", "1.2.3.4", "192.168.4.6")
ip_ranges <- c("10.0.0.0/8", "192.168.0.0/16", "172.16.0.0/12")
ips_in_cidrs(visitor_ip_addresses, ip_ranges)
## # A tibble: 6 × 2
## ips in_cidr
## <chr> <lgl>
## 1 10.10.1.2 TRUE
## 2 10.34.21.4 TRUE
## 3 192.168.23.34 TRUE
## 4 172.16.34.78 TRUE
## 5 1.2.3.4 FALSE
## 6 192.168.4.6 TRUE
ip_in_any(visitor_ip_addresses, ip_ranges)
## [1] TRUE TRUE TRUE TRUE FALSE TRUE
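To get the two things the question asks for (the percentage of visitors outside the ranges and the list of those IPs), the TRUE/FALSE result above can be summarised with plain vector indexing. This is only a sketch: the `in_range` vector is hard-coded from the output shown above so the snippet runs without iptools installed, and the names `in_range`, `pct_outside` and `outside_ips` are made up for illustration.

```r
visitor_ip_addresses <- c("10.10.1.2", "10.34.21.4", "192.168.23.34",
                          "172.16.34.78", "1.2.3.4", "192.168.4.6")

# Assumption: a hard-coded copy of the TRUE/FALSE result shown above.
# In practice this would be the logical vector (or the logical column
# of the tibble) returned by the iptools call.
in_range <- c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE)

# mean() of a logical vector is the proportion of TRUE values
pct_outside <- 100 * mean(!in_range)

# Logical indexing keeps only the addresses that matched no block
outside_ips <- visitor_ip_addresses[!in_range]

cat(sprintf("%.1f%% of visitors are outside the listed ranges\n", pct_outside))
print(outside_ips)
```

With several thousand hits a day it may also be worth running `unique()` on the addresses first, so repeat visitors are only checked once.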
There are two different functions for this, for a reason we haven't documented yet: one uses some clever math and the other uses tries. I'd test each to see which performs better for production use.
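For anyone curious what a purely arithmetic check can look like, here is a minimal base R sketch. It is not how iptools implements either function; it just treats each IPv4 address as a number and each CIDR block as a closed numeric interval. The helper names `ip_to_num` and `in_any_cidr` are hypothetical, and the code assumes well-formed IPv4 input with no validation.

```r
# Convert dotted-quad IPv4 strings to numbers.
# Doubles are used because R's 32-bit integers overflow above 2^31 - 1.
ip_to_num <- function(ip) {
  parts <- strsplit(ip, ".", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * 256^(3:0)), numeric(1))
}

# TRUE for each IP that falls inside at least one CIDR block
in_any_cidr <- function(ips, cidrs) {
  pieces <- strsplit(cidrs, "/", fixed = TRUE)
  base   <- ip_to_num(vapply(pieces, `[`, character(1), 1))
  prefix <- as.integer(vapply(pieces, `[`, character(1), 2))

  size <- 2^(32 - prefix)          # number of addresses in each block
  lo   <- floor(base / size) * size  # first address (host bits cleared)
  hi   <- lo + size - 1              # last address in the block

  ip_num <- ip_to_num(ips)
  # For each IP, compare against every block at once
  vapply(ip_num, function(x) any(x >= lo & x <= hi), logical(1))
}

visitor_ip_addresses <- c("10.10.1.2", "10.34.21.4", "192.168.23.34",
                          "172.16.34.78", "1.2.3.4", "192.168.4.6")
ip_ranges <- c("10.0.0.0/8", "192.168.0.0/16", "172.16.0.0/12")

result <- in_any_cidr(visitor_ip_addresses, ip_ranges)
print(result)
# TRUE TRUE TRUE TRUE FALSE TRUE
```

Wrapping each candidate call in base R's `system.time()` on a realistic day of traffic is one simple way to compare the options before settling on one.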
Upvotes: 2