Reputation: 37095
This question explains what makes a tag name valid in Git. However, is there a well-tested and widely used regular expression that follows these rules?
What is a regex for valid Git tags?
Upvotes: 5
Views: 1693
Reputation: 85827
Here's how I'd translate those rules to Perl regexes:
my $base = qr{ [!"#\$%&'()+,\-0-9;<=>\@A-Z\]_`a-z{|}] }x;
This is a regex for a single character in the allowed base set. If you want to translate this to another language / regex dialect, note that `$` and `@` are only escaped here because they trigger variable interpolation in Perl.
It's a whitelist because I find it easier to think about things this way. As a side effect, this also disallows any non-ASCII characters.
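As a quick sanity check of the whitelist, here's a small sketch that runs every character from space through DEL against `$base` and lists the ones it rejects. The `@rejected` array is just a name I made up for illustration.

```perl
use strict;
use warnings;

# Whitelist from above: one allowed ASCII character.
my $base = qr{ [!"#\$%&'()+,\-0-9;<=>\@A-Z\]_`a-z{|}] }x;

# Try space (0x20) through DEL (0x7f) and collect the code points
# that do NOT match the class.
my @rejected = grep { chr($_) !~ /\A$base\z/ } 0x20 .. 0x7f;

# Prints the rejected code points in hex:
# 20 2a 2e 2f 3a 3f 5b 5c 5e 7e 7f
print join(' ', map { sprintf '%02x', $_ } @rejected), "\n";
```

That is space, `*`, `.`, `/`, `:`, `?`, `[`, `\`, `^`, `~` and DEL. Note that `.` and `/` are rejected at this level only because they get special handling further down.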
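If you want to allow the full Unicode set, a blacklist becomes easier to work with:
my $base = qr{ [^\x00-\x20\x7f~^:?*\[\\./] }x;
Here `\x7f` is DEL. `.` and `/` are excluded as well, just as in the whitelist, because they are handled separately as separators below.
(Or `qr{ [^\x00-\x20\x7f~^:?*\[\\./\@] | \@ (?! \{ ) }x` for a version that includes the `@{` restriction; see below.)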
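To see the blacklist variant in action, here's a rough sketch that tries a few code points. I'm assuming a class that excludes DEL (`\x7f`) plus `.` and `/` (which are treated as separators elsewhere); `is_base_char` is a hypothetical helper name, not something from Git or Perl.

```perl
use strict;
use warnings;

# Blacklist variant (assumption: DEL is \x7f, and '.' and '/' are
# excluded because they are handled as separators elsewhere).
my $base = qr{ [^\x00-\x20\x7f~^:?*\[\\./] }x;

# Hypothetical helper: true if the string is exactly one allowed character.
sub is_base_char {
    my ($c) = @_;
    return $c =~ /\A$base\z/ ? 1 : 0;
}

# 'A', e-acute, a CJK character, DEL, '.', '^'
for my $cp (0x41, 0xE9, 0x4E2D, 0x7F, 0x2E, 0x5E) {
    printf "U+%04X %s\n", $cp, is_base_char(chr $cp) ? 'ok' : 'rejected';
}
```

The first three code points should come out as ok and the last three as rejected, so non-ASCII characters pass while the forbidden ASCII ones don't.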
my $part = qr{ $base+ (?: \. $base+ )* \.? (?<! \.lock ) }x;
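This matches a single slash-separated part. It implements the restrictions that a part cannot start with `.`, contain `..`, or end with `.lock`. A single trailing `.` is still allowed at this level; it is only rejected at the end of the whole ref.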
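Here's a small sketch that exercises `$part` on its own, using the whitelist version of `$base`. The helper `is_valid_part` and the sample names are made up for illustration.

```perl
use strict;
use warnings;

my $base = qr{ [!"#\$%&'()+,\-0-9;<=>\@A-Z\]_`a-z{|}] }x;
my $part = qr{ $base+ (?: \. $base+ )* \.? (?<! \.lock ) }x;

# Hypothetical helper: true if the whole string is one valid part.
sub is_valid_part {
    my ($s) = @_;
    return $s =~ /\A$part\z/ ? 1 : 0;
}

for my $s ('v1.0.0', 'release-2', 'v1.', '.hidden', 'a..b', 'v1.lock') {
    printf "%-10s %s\n", $s, is_valid_part($s) ? 'ok' : 'rejected';
}
```

The first three samples should be accepted and the last three rejected. `v1.` passes here because the trailing `\.?` permits it; a trailing dot is only a problem at the very end of the full ref.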
my $full_ref = qr{\A (?! \@ \z | .* \@\{ ) $part (?: / $part )+ (?<! \. ) \z}sx;
This matches a full ref. It adds a few additional restrictions:
The whole thing cannot be `@`. (This rule is technically redundant because we always require a `/`, but I included it anyway.)
`@{` cannot occur anywhere. Instead of a separate look-ahead check we could also have modified `$base` thus:
my $base = qr{ [!"#\$%&'()+,\-0-9;<=>A-Z\]_`a-z{|}] | \@ (?! \{ ) }x;
There must be at least two parts, separated by `/`.
The whole thing cannot end with `.`.
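Putting the pieces together, here's a minimal sketch that checks a handful of names against `$full_ref`. It uses the whitelist `$base`; `is_valid_ref` and the sample refs are my own illustration. Since the pattern requires at least one `/`, a tag has to be checked by its full name (e.g. `refs/tags/v1.0.0`), not by the bare tag name.

```perl
use strict;
use warnings;

my $base     = qr{ [!"#\$%&'()+,\-0-9;<=>\@A-Z\]_`a-z{|}] }x;
my $part     = qr{ $base+ (?: \. $base+ )* \.? (?<! \.lock ) }x;
my $full_ref = qr{\A (?! \@ \z | .* \@\{ ) $part (?: / $part )+ (?<! \. ) \z}sx;

# Hypothetical helper: true if the string is accepted as a full ref.
sub is_valid_ref {
    my ($ref) = @_;
    return $ref =~ $full_ref ? 1 : 0;
}

my @samples = (
    'refs/tags/v1.0.0',     # plain tag
    'refs/tags/feature/x',  # nested name
    'v1.0.0',               # no slash: only one part
    'refs/tags/v1.',        # ends with a dot
    'refs/tags/a..b',       # contains ".."
    'refs/tags/v1.lock',    # part ends with ".lock"
    'refs/tags/a@{b',       # contains "@{"
    'refs/tags/a b',        # contains a space
    'refs//tags',           # empty part
);

printf "%-22s %s\n", $_, is_valid_ref($_) ? 'ok' : 'rejected' for @samples;
```

Only the first two samples should be accepted. If you have Git at hand, comparing the results with `git check-ref-format` on the same names is a reasonable way to cross-check the regex.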
Translation to e.g. C# is left as an exercise for the reader. :-)
Upvotes: 2