Reputation: 729
So I need to match an IPv6 address, which may or may not have a mask. Unfortunately, I can't just use a library to parse the string.
The mask bit is easy enough, in this case:
(?:\/\d{1,3})?$/
The hard part is the different formats of an IPv6 address. It needs to match ::beef, beef::, beef::beef, etc.
An update: I'm almost there.
/^(\:\:([a-f0-9]{1,4}\:){0,6}?[a-f0-9]{0,4}|[a-f0-9]{1,4}(\:[a-f0-9]{1,4}){0,6}?\:\:|[a-f0-9]{1,4}(\:[a-f0-9]{1,4}){1,6}?\:\:([a-f0-9]{1,4}\:){1,6}?[a-f0-9]{1,4})(\/\d{1,3})?$/i
In this case, I am restricted to using Perl's regex.
Upvotes: 6
Views: 11867
Reputation: 6452
This is a comprehensive IPv6 regular expression that tests all the valid IPv6 text notations (expanded, compressed, expanded-mixed, compressed-mixed) with an optional prefix length. It will also capture the various parts into capture groups. You can skip the capture groups by putting a ?: right after the opening paren for a capture group.
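To illustrate what ?: changes, here is a minimal Perl sketch. The sample address and the variable names are made up for the example; only the prefix-length part of a pattern is shown.

```perl
use strict;
use warnings;

my $addr = 'beef::1/64';   # sample input, chosen only for illustration

# Capturing wrapper: the "/64" wrapper is group 1 and the digits are group 2.
my @with = $addr =~ m{(/(\d{1,3}))$};

# Non-capturing wrapper: only the digits are captured, as group 1.
my @without = $addr =~ m{(?:/(\d{1,3}))$};

print scalar(@with),    " captures: @with\n";
print scalar(@without), " capture: @without\n";
```

The first match returns two values (the wrapper and the digits), the second only the digits, so adding ?: shifts the numbering of every later group down by one.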
This is the regular expression I created and use in my IPvX IP calculator for both IPv4 and IPv6.
^# Anchor
(?:# BEGIN Address alternatives, non-capturing so group numbers are unchanged
(# BEGIN Compressed-mixed *** Group 1 ***
(# BEGIN Hexadecimal Notation *** Group 2 ***
(?:
(?:[0-9A-F]{1,4}:){5}[0-9A-F]{1,4} # No ::
| (?:[0-9A-F]{1,4}:){4}:[0-9A-F]{1,4} # 4::1
| (?:[0-9A-F]{1,4}:){3}(?::[0-9A-F]{1,4}){1,2} # 3::2
| (?:[0-9A-F]{1,4}:){2}(?::[0-9A-F]{1,4}){1,3} # 2::3
| [0-9A-F]{1,4}:(?::[0-9A-F]{1,4}){1,4} # 1::4
| (?:[0-9A-F]{1,4}:){1,5} # :: End
| :(?::[0-9A-F]{1,4}){1,5} # :: Start
| : # :: Only
):
)# END Hexadecimal Notation
(# BEGIN Dotted-decimal Notation *** Group 3 ***
(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])\. # 0 to 255. *** Group 4 ***
(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])\. # 0 to 255. *** Group 5 ***
(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])\. # 0 to 255. *** Group 6 ***
(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9]) # 0 to 255 *** Group 7 ***
)# END Dotted-decimal Notation
)# END Compressed-mixed
|
(# BEGIN Compressed *** Group 8 ***
(?:# BEGIN Hexadecimal Notation
(?:[0-9A-F]{1,4}:){7}[0-9A-F]{1,4} # No ::
| (?:[0-9A-F]{1,4}:){6}:[0-9A-F]{1,4} # 6::1
| (?:[0-9A-F]{1,4}:){5}(?::[0-9A-F]{1,4}){1,2} # 5::2
| (?:[0-9A-F]{1,4}:){4}(?::[0-9A-F]{1,4}){1,3} # 4::3
| (?:[0-9A-F]{1,4}:){3}(?::[0-9A-F]{1,4}){1,4} # 3::4
| (?:[0-9A-F]{1,4}:){2}(?::[0-9A-F]{1,4}){1,5} # 2::5
| [0-9A-F]{1,4}:(?::[0-9A-F]{1,4}){1,6} # 1::6
| (?:[0-9A-F]{1,4}:){1,7}: # :: End
| :(?::[0-9A-F]{1,4}){1,7} # :: Start
| :: # :: Only
) # END Hexadecimal Notation
)# END Compressed
)# END Address alternatives
(?:# BEGIN Optional Length
/(12[0-8]|1[0-1][0-9]|[1-9]?[0-9]) # /0 to /128 *** Group 9 ***
)? # END Optional Length
$# Anchor
Bonus IPv4 regular expression:
^# Anchor
(?:# BEGIN Dotted-decimal Notation
(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])\. # 0 to 255. *** Group 1 ***
(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])\. # 0 to 255. *** Group 2 ***
(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])\. # 0 to 255. *** Group 3 ***
(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9]) # 0 to 255 *** Group 4 ***
) # END Dotted-decimal Notation
(?:# BEGIN Optional Length
/(3[0-2]|[1-2]?[0-9]) # /0 to /32 *** Group 5 ***
)? # END Optional Length
$# Anchor
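Both patterns are written with embedded whitespace and # comments, so in Perl they need the /x modifier; the IPv6 one also needs /i because its character classes list only A-F. As a rough sketch of how the IPv4 pattern could be used from Perl: the names $octet and $ipv4_re are hypothetical, and the repeated octet alternation is factored into a variable only to keep the example short.

```perl
use strict;
use warnings;

# Octet alternation copied from the pattern above (0 to 255).
my $octet = qr/25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9]/;

# Braces are used as the delimiter so the literal "/" needs no escaping.
my $ipv4_re = qr{
    ^
    (?: ($octet)\. ($octet)\. ($octet)\. ($octet) )   # Groups 1-4: the octets
    (?: / (3[0-2]|[1-2]?[0-9]) )?                      # Group 5: optional /0 to /32
    $
}x;

for my $text (qw(192.0.2.10/24 10.0.0.1 256.1.1.1 10.0.0.1/33)) {
    if (my @parts = $text =~ $ipv4_re) {
        # Group 5 is undef when no prefix length was given.
        my $len = defined $parts[4] ? $parts[4] : 'none';
        print "$text: octets @parts[0..3], prefix length $len\n";
    }
    else {
        print "$text: no match\n";
    }
}
```

One thing to watch for with /x: a comment inside the pattern must not contain the closing delimiter, because Perl finds the end of the pattern before it looks at the comments.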
Upvotes: 0
Reputation: 2672
Here is one that worked for all the examples of IPv6 addresses I've managed to find:
/^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*$/
Make sure it is all on one line before using it. It was found here:
It was verified on all the examples from the question page, the community page and the Wikipedia page here:
https://en.wikipedia.org/wiki/IPv6
The tool used for the verification is the one from here:
Upvotes: 1
Reputation: 21
Try:
/^(((?=(?>.*?(::))(?!.+\3)))\3?|([\dA-F]{1,4}(\3|:(?!$)|$)|\2))(?4){5}((?4){2}|((2[0-4]|1\d|[1-9])?\d|25[0-5])(\.(?7)){3})\z/ai
From: http://home.deds.nl/~aeron/regex/
Upvotes: 2
Reputation: 21
This mostly works...
^([0-9a-fA-F]{0,4}|0)(\:([0-9a-fA-F]{0,4}|0)){7}$
Cons: cases like :: are not handled correctly.
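One possible way around that limitation, which is a different technique from the regex above, is to expand the :: shorthand first and only then check for eight plain groups. This is just a sketch under stated assumptions: expand_ipv6 is a hypothetical helper, and it handles neither a prefix length nor an embedded dotted-decimal IPv4 tail.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a single "::" into the zero groups it stands
# for, then require exactly eight groups of 1-4 hex digits. Returns the
# expanded address, or undef if the text is not a plain hexadecimal address.
sub expand_ipv6 {
    my ($text) = @_;
    my @halves = split /::/, $text, -1;   # -1 keeps trailing empty fields
    return if @halves > 2;                # more than one "::" is invalid

    my @groups;
    if (@halves == 2) {
        my @head = split /:/, $halves[0], -1;
        my @tail = split /:/, $halves[1], -1;
        my $missing = 8 - @head - @tail;
        return if $missing < 1;           # "::" must stand for at least one group
        @groups = (@head, (0) x $missing, @tail);
    }
    else {
        @groups = split /:/, $text, -1;
    }

    return unless @groups == 8;
    return if grep { !/\A[0-9A-Fa-f]{1,4}\z/ } @groups;
    return join ':', @groups;
}

for my $text (qw(::beef beef:: beef::beef 1::2::3)) {
    my $full = expand_ipv6($text);
    print "$text => ", (defined $full ? $full : 'invalid'), "\n";
}
```

After expansion every valid input has the same shape, so the final check can stay as simple as the eight-group pattern above.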
Upvotes: 2
Reputation: 11
If you need to check in Perl whether a string is an IPv6 address, you can try this:
if (/(([\da-f]{0,4}:{0,2}){1,8})/i) { print("$1") };
Upvotes: 1
Reputation: 164639
This contains a patch to Regexp::Common demonstrating a complete, accurate, tested IPv6 regex. It's a straight translation of the IPv6 grammar. Regexp::IPv6 is also accurate.
More importantly, it contains a test suite. Running it with your regex shows you're still a ways off. 10 out of 19 missed. 1 out of 12 false positives. IPv6 contains a lot of special shorthands making it very easy to get subtly wrong.
The best place to read up on what goes into an IPv6 address is RFC 3986 section 3.2.2.
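To give an idea of what a straight translation of that grammar can look like, here is a minimal sketch built from the IPv6address ABNF in RFC 3986 section 3.2.2. It is not the Regexp::Common patch itself. The variable names follow the ABNF rule names, while is_ipv6 and the optional /0 to /128 suffix are assumptions added for this question.

```perl
use strict;
use warnings;

# Building blocks, named after the ABNF rules (h16, dec-octet, IPv4address, ls32).
my $h16       = qr/[0-9A-Fa-f]{1,4}/;
my $dec_octet = qr/(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])/;
my $ipv4      = qr/$dec_octet(?:\.$dec_octet){3}/;
my $ls32      = qr/(?:$h16:$h16|$ipv4)/;

# One alternative per line of the IPv6address rule.
my $ipv6 = qr/(?:
                                  (?:$h16:){6} $ls32
  |                            :: (?:$h16:){5} $ls32
  | (?:                $h16 )? :: (?:$h16:){4} $ls32
  | (?: (?:$h16:){0,1} $h16 )? :: (?:$h16:){3} $ls32
  | (?: (?:$h16:){0,2} $h16 )? :: (?:$h16:){2} $ls32
  | (?: (?:$h16:){0,3} $h16 )? ::    $h16:     $ls32
  | (?: (?:$h16:){0,4} $h16 )? ::              $ls32
  | (?: (?:$h16:){0,5} $h16 )? ::              $h16
  | (?: (?:$h16:){0,6} $h16 )? ::
)/x;

# Hypothetical helper: address plus an optional /0 to /128 prefix length.
sub is_ipv6 {
    my ($text) = @_;
    return $text =~ m{\A $ipv6 (?: / (?: 12[0-8] | 1[01][0-9] | [1-9]?[0-9] ) )? \z}x ? 1 : 0;
}

for my $text (qw(::beef beef:: beef::beef ::1/128 ::ffff:192.0.2.1 1::2::3 beef)) {
    printf "%-20s %s\n", $text, is_ipv6($text) ? 'match' : 'no match';
}
```

Because each alternative fixes how many groups may appear on either side of the ::, the pattern needs no lookaheads or recursion, and it uses only core Perl regex features.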
Upvotes: 13
Reputation: 47829
What do you mean you can't just use a library? How about a module? Regexp::IPv6 will give you what you need.
Upvotes: 10
Reputation: 5072
I'm not an IPv6 expert, but please trust me when I tell you that matching (let alone validating) IPv6 addresses is not easy with a very simple regex such as the one you suggest. There are many shorthands and various conventions for combining the address with a port, just to name one example. One such shorthand is that you can write 0:0:0:0:0:0:0:1 as ::1, but there are more. If you read German, I would suggest looking at the slides of Steffen Ullrich's talk at the 11th German Perl Workshop.
You say you can't use a library, but if you're going to reinvent the whole complexity of the library, then you might as well just import it verbatim into your project.
Upvotes: 5
Reputation: 57936
Try this:
^([0-9a-fA-F]{4}|0)(\:([0-9a-fA-F]{4}|0)){7}$
From Regular Expression Library: IPv6 address
You should also read this: A Regular Expression for IPv6 Addresses
Upvotes: 1