Reputation: 418
I would like to validate a hostname using only a regular expression.
Host Names (or 'labels' in DNS jargon) were traditionally defined by RFC 952 and RFC 1123 and may be composed of the following valid characters.
The rules say:
How would you write a regular expression to validate a hostname?
Upvotes: 26
Views: 44825
Reputation: 151
It is worth noting that DNS labels and hostname components have slightly different rules. Most notably: '_' is not legal in any component of a hostname, but is a standard part of labels used for things like SRV records.
A more readable and portable approach is to require a string to match both of these POSIX EREs:
^([[:alnum:]][[:alnum:]-]{0,61}[[:alnum:]]|[[:alpha:]])$
^.*[^[:digit:]].*$
Those should be easy to use in any standards-compliant ERE implementation. Perl-style lookaround, as in the Python example, is widely available, but it does not behave exactly the same everywhere that it seems to work. Ouch.
It is possible in principle to make a single ERE of those two lines, but it would be long and unwieldy. The first line handles all of the rules other than the ban on all-digits, the second kills those.
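For readers without a POSIX ERE engine to hand, the two-pattern idea can be sketched in Python. This is only an illustration: Python's `re` module does not support POSIX classes such as `[[:alnum:]]`, so the classes are transcribed as plain ASCII ranges (an assumption about the intended locale), and the names `SHAPE`, `HAS_NON_DIGIT`, and `is_hostname_label` are made up for this example.

```python
import re

# Transcription of the first ERE: length and hyphen placement.
# ASCII ranges stand in for [[:alnum:]] and [[:alpha:]] (assumption).
SHAPE = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{0,61}[A-Za-z0-9]|[A-Za-z]")

# Transcription of the second ERE: at least one non-digit character.
HAS_NON_DIGIT = re.compile(r"[^0-9]")


def is_hostname_label(label: str) -> bool:
    """Return True only when the label passes both checks."""
    shape_ok = SHAPE.fullmatch(label) is not None
    not_all_digits = HAS_NON_DIGIT.search(label) is not None
    return shape_ok and not_all_digits
```

Keeping the two checks separate mirrors the answer's point: each pattern stays short enough to read, and the all-digits rule does not need lookahead.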
Upvotes: 5
Reputation: 1481
The k8s API responds with the regex that it uses to validate e.g. an RFC 1123-compliant string:
(⎈ minikube:default)➜ cloud-app git:(mc/72-org-ns-names) ✗ k create ns not-valid1234234$%
The Namespace "not-valid1234234$%" is invalid: metadata.name:
Invalid value: "not-valid1234234$%": a lowercase RFC 1123 label must consist of lower case
alphanumeric characters or '-', and must start and end with an alphanumeric character
(e.g. 'my-name', or '123-abc', regex used for validation is
'[a-z0-9]([-a-z0-9]*[a-z0-9])?')
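A minimal Python sketch of applying the regex quoted in that error message. The message only shows the pattern, so the separate 63-character limit below is an assumption taken from the RFC 1123 label length, and the function name `is_rfc1123_label` is invented for the example.

```python
import re

# Pattern copied from the error message above.
K8S_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")

# Assumed limit: the regex itself does not bound the length.
MAX_LABEL_LENGTH = 63


def is_rfc1123_label(value: str) -> bool:
    """Check a lowercase RFC 1123 label: pattern plus assumed length cap."""
    if len(value) > MAX_LABEL_LENGTH:
        return False
    # fullmatch anchors the pattern at both ends of the string.
    return K8S_LABEL.fullmatch(value) is not None
```

Note that this pattern is lowercase-only and allows all-digit labels, so it is stricter in one respect and looser in another than the other regexes in this thread.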
Upvotes: 8
Reputation: 43457
While the accepted answer is correct, RFC 2181 also states in Section 11, "Name syntax":
The DNS itself places only one restriction on the particular labels that can be used to identify resource records. That one restriction relates to the length of the label and the full name. [...] Implementations of the DNS protocols must not place any restrictions on the labels that can be used. In particular, DNS servers must not refuse to serve a zone because it contains labels that might not be acceptable to some DNS client programs.
This in turn means other characters such as underscores should be allowed.
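If you follow that reading, the only checks left are the length limits from the same section of RFC 2181: 1 to 63 octets per label and 255 octets for the full name. A rough Python sketch, assuming the name arrives as dot-separated text and that counting UTF-8 bytes is an acceptable way to count octets (both are assumptions, as are the function names):

```python
def label_length_ok(label: bytes) -> bool:
    """A label must be between 1 and 63 octets long."""
    return 1 <= len(label) <= 63


def name_length_ok(name: str) -> bool:
    """Check only the DNS length limits, with no character restrictions."""
    # Accept a single trailing dot (fully qualified form).
    if name.endswith("."):
        name = name[:-1]
    # Assumption: octets are counted on the UTF-8 encoding of the text.
    labels = name.encode("utf-8").split(b".")
    if not all(label_length_ok(label) for label in labels):
        return False
    # Wire format: one length octet per label, plus the terminating root octet.
    wire_length = sum(len(label) + 1 for label in labels) + 1
    return wire_length <= 255
```

With this check, a name such as `_sip._tcp.example.com` passes, while it would fail the hostname regexes elsewhere in the thread.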
Upvotes: 3
Reputation: 1793
A revised regex based on comments here and my own reading of RFCs 1035 & 1123:
Ruby: \A(?!-)[a-zA-Z0-9-]{1,63}(?<!-)\z
(tests below)
Python: ^(?!-)[a-zA-Z0-9-]{1,63}(?<!-)$
(not tested by me)
JavaScript: pattern = /^(?!-)(?!.*-$)[a-zA-Z0-9-]{1,63}$/;
(based on Tom Lime's answer, not tested by me)
tests = [
['01010', true],
['abc', true],
['A0c', true],
['A0c-', false],
['-A0c', false],
['A-0c', true],
['o123456701234567012345670123456701234567012345670123456701234567', false],
['o12345670123456701234567012345670123456701234567012345670123456', true],
['', false],
['a', true],
['0--0', true],
["A0c\nA0c", false]
]
regex = /\A(?!-)[a-zA-Z0-9-]{1,63}(?<!-)\z/
tests.each do |label, expected|
  is_match = !!(regex =~ label)
  puts is_match == expected
end
<label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
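That grammar line can also be transcribed directly into a regex without lookaround. The Python sketch below is my own reading of it: a letter, optionally followed by any run of letters, digits, and hyphens that ends in a letter or digit. Unlike the revised regexes above, the literal grammar requires the first character to be a letter; the 63-character cap is applied separately, and `is_strict_label` is a hypothetical name.

```python
import re

# <letter> [ [ <ldh-str> ] <let-dig> ]
GRAMMAR_LABEL = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_strict_label(label: str) -> bool:
    """Match the grammar literally, then apply the 63-character limit."""
    return len(label) <= 63 and GRAMMAR_LABEL.fullmatch(label) is not None
```

Replacing the leading `[A-Za-z]` with `[A-Za-z0-9]` gives the relaxed form that allows a leading digit.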
Upvotes: 4
Reputation: 44880
In Ruby, ^ and $ match at line boundaries by default, so something like Rails warns against using them. This is Mark's answer with safe start- and end-of-string anchors:
\A(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)\z
Upvotes: 3
Reputation: 1204
JavaScript regex based on Mark's answer:
pattern = /^(?![0-9]+$)(?!.*-$)(?!-)[a-zA-Z0-9-]{1,63}$/g;
Upvotes: 15
Reputation: 838216
^(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)$
I used the following testbed written in Python to verify that it works correctly:
tests = [
('01010', False),
('abc', True),
('A0c', True),
('A0c-', False),
('-A0c', False),
('A-0c', True),
('o123456701234567012345670123456701234567012345670123456701234567', False),
('o12345670123456701234567012345670123456701234567012345670123456', True),
('', True),
('a', True),
('0--0', True),
]
import re

# Raw string so the pattern reaches the regex engine unchanged.
regex = re.compile(r'^(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)$')
for (s, expected) in tests:
    is_match = regex.match(s) is not None
    print(is_match == expected)
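One caveat worth checking in Python: `$` also matches just before a trailing newline, so the pattern above accepts `'abc\n'`. The sketch below is a possible stricter variant, not part of the original answer. It swaps `$` for `\Z` and, as a further assumption on my part, uses `{1,63}` so that the empty string is rejected; `LOOSE`, `STRICT`, and `is_label` are names invented here.

```python
import re

# The pattern from the answer, unchanged.
LOOSE = re.compile(r"^(?![0-9]+$)(?!-)[a-zA-Z0-9-]{,63}(?<!-)$")

# Variant: \Z matches only at the very end of the string,
# and {1,63} rejects the empty string (an assumed tightening).
STRICT = re.compile(r"(?![0-9]+\Z)(?!-)[a-zA-Z0-9-]{1,63}(?<!-)\Z")


def is_label(s: str) -> bool:
    """Return True when s is a valid label under the stricter pattern."""
    return STRICT.match(s) is not None
```

Whether the empty string should count as valid depends on your use case; the original testbed treats it as a match.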
Upvotes: 22