Reputation: 797
I'm writing an optimization for my application's search: if the query string looks like an IP address, don't bother searching the MAC address column, and if it looks like a MAC address, don't bother searching the IP address column.
I have seen expressions that match IP and MAC addresses exactly, but it's hard to come by one that matches partial strings. It's quite a fun brain teaser, so I thought I'd get other people's opinions. Right now I have a solution without regex:
use List::Util qw(first);
sub query_is_a_possible_mac_address {
    my ($class, $possible_mac) = @_;
    return 1 unless $possible_mac;              # an empty query could still become a MAC
    my @octets = split /:/, $possible_mac, -1;  # -1 keeps trailing empty fields
    return 0 if @octets > 6;                    # fail long MACs
    return 0 if defined(first { /[^[:xdigit:]]/ } @octets);  # fail any non-hex characters
    return 0 if defined(first { hex($_) > 0xFF } @octets);   # fail if the number is too big
    return 1;  # note: empty octets in the middle (e.g. '::88:') are not rejected here
}
# valid tests
'12:34:56:78:90:12'
'88:11:'
'88:88:F0:0A:2B:BF'
'88'
':81'
':'
'12:34'
'12:34:'
'a'
''
# invalid tests
'88:88:F0:0A:2B:BF:00'
'88z'
'8888F00A2BBF00'
':81a'
'881'
' 88:1B'
'Z'
'z'
'a12:34'
' '
'::88:'
Upvotes: 2
Views: 2187
Reputation: 797
The best way I found to do this was to pad the possible match until it looks like the full thing you are trying to match. For example, if you have the string 1.2, make it look like a complete IP address, 1.2.1.1, and then apply the exact-match regex:
sub contains_ip {
my ($possible_ip) = @_;
my @splits = split /\./, $possible_ip;
return 0 if @splits > 4;
while (@splits < 4) {
push @splits, '1';
}
$possible_ip = join '.', @splits;
my ($match) = $possible_ip =~ m/^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$/;
return defined $match ? 1 : 0;
}
warn contains_ip('1.2'); # 1
warn contains_ip('127.0.0.1'); # 1
warn contains_ip('1.2asd'); # 0
warn contains_ip('1.2.3.4.5'); # 0
The same thing applies to MAC addresses: if you had 11:22, make it look like a complete MAC address, 11:22:00:00:00:00, then apply the MAC address regex to it.
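A minimal sketch of that MAC variant, in the same style as contains_ip above. The name contains_mac and the padding rules are assumptions made for illustration: only the first and last fields may be shorter than two hex digits (they are padded with zeros), and missing fields are filled with 00 before the exact-match regex is applied.

```perl
use strict;
use warnings;

# Hypothetical helper: contains_mac is an illustrative name, not part of the answer above.
sub contains_mac {
    my ($possible_mac) = @_;
    return 0 unless defined $possible_mac;

    my @fields = split /:/, $possible_mac, -1;   # -1 keeps trailing empty fields
    return 0 if @fields > 6;
    @fields = ('') unless @fields;               # splitting '' yields an empty list

    # Assumption: only the first and last fields may be incomplete.
    # Left-pad the first field and right-pad the last one to two characters.
    $fields[0]  = substr('00' . $fields[0], -2)    if length($fields[0]) < 2;
    $fields[-1] = substr($fields[-1] . '00', 0, 2) if length($fields[-1]) < 2;

    # Fill in the fields that are missing entirely.
    push @fields, '00' while @fields < 6;

    my $candidate = join ':', @fields;
    return $candidate =~ /\A[[:xdigit:]]{2}(?::[[:xdigit:]]{2}){5}\z/ ? 1 : 0;
}

warn contains_mac('11:22');   # 1
warn contains_mac('88:11:');  # 1
warn contains_mac('::88:');   # 0
warn contains_mac('88z');     # 0
```

An empty field in the middle is left untouched, so a string such as '::88:' still fails the final regex.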
Upvotes: 0
Reputation: 6271
Given the (new) tests, this works:
/^[0-9A-Fa-f]{0,2}(:[0-9A-Fa-f]{2}){0,5}:?$/
Here are the lines that match given the above tests (note that single hex characters like 'a' and 'A' are correctly matched):
12:34:56:78:90:12
88:11:
88:88:F0:0A:2B:BF
88
:81
:
12:34
12:34:
a
'' (<-- empty string)
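As a quick check, the pattern can be run against both test lists from the question. This is only a sketch: the variable names are made up for illustration, and the pattern is copied unchanged from above.

```perl
use strict;
use warnings;

# Pattern from this answer, unchanged.
my $partial_mac = qr/^[0-9A-Fa-f]{0,2}(:[0-9A-Fa-f]{2}){0,5}:?$/;

# Test strings copied from the question.
my @valid = ('12:34:56:78:90:12', '88:11:', '88:88:F0:0A:2B:BF', '88',
             ':81', ':', '12:34', '12:34:', 'a', '');
my @invalid = ('88:88:F0:0A:2B:BF:00', '88z', '8888F00A2BBF00', ':81a',
               '881', ' 88:1B', 'Z', 'z', 'a12:34', ' ', '::88:');

my @missed = grep { $_ !~ $partial_mac } @valid;    # valid strings the pattern rejects
my @leaked = grep { $_ =~ $partial_mac } @invalid;  # invalid strings the pattern accepts

printf "valid rejected: %d, invalid accepted: %d\n", scalar @missed, scalar @leaked;
# prints: valid rejected: 0, invalid accepted: 0
```

Note that the pattern requires every field after the first to be either complete (two hex digits) or missing, so a query that stops mid-octet, such as 12:3, does not match.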
Upvotes: 1