sdgfsdh

Reputation: 37095

What is a regex for valid Git tags?

This question explains what makes a tag name valid in Git. Is there a well-tested, widely used regular expression that implements these rules?

Upvotes: 5

Views: 1693

Answers (1)

melpomene

Reputation: 85827

Here's how I'd translate those rules to Perl regexes:

my $base = qr{ [!"#\$%&'()+,\-0-9;<=>\@A-Z\]_`a-z{|}] }x;

This is a regex for a single character in the allowed base set. If you want to translate this to another language / regex dialect, note that $ and @ are only escaped here because they trigger variable interpolation in Perl.

It's a whitelist because I find it easier to think about things this way. As a side effect, this also disallows any non-ASCII characters.
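To see which characters the whitelist accepts, it can be anchored and tried against single characters. This is only a sketch: the helper name is_base_char is made up for illustration and is not part of the answer.

```perl
use strict;
use warnings;

# Whitelist from the answer: one allowed ASCII character.
my $base = qr{ [!"#\$%&'()+,\-0-9;<=>\@A-Z\]_`a-z{|}] }x;

# Hypothetical helper (not part of the answer): true if $char is exactly
# one character from the allowed set.
sub is_base_char {
    my ($char) = @_;
    return $char =~ /\A $base \z/x ? 1 : 0;
}

for my $char ('a', '-', '@', '~', ':', ' ', '.', '/') {
    printf "'%s' %s\n", $char, is_base_char($char) ? 'allowed' : 'rejected';
}
```

Note that . and / are rejected here on purpose: they are not part of the base set because the later patterns place them explicitly.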

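If you want to allow the full Unicode set, a blacklist becomes easier to work with:

my $base = qr{ [^\x00-\x20\x7f~^:?*\[\\./] }x;

This excludes the ASCII control characters and space (\x00-\x20), DEL (\x7f), and the characters ~ ^ : ? * [ \. It also has to exclude . and /, because the patterns below place those explicitly.

(Or qr{ [^\x00-\x20\x7f~^:?*\[\\./\@] | \@ (?! \{ ) }x for a version that includes the @{ restriction; see below.)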
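A quick check of the blacklist variant with non-ASCII input might look like the following sketch. It makes two assumptions that should be checked against the rules: DEL is written as \x7f, and . and / are excluded from the base set so that the later patterns can place them. The helper name starts_with_base is hypothetical.

```perl
use strict;
use warnings;

# Blacklist variant with the @{ restriction folded in. Assumptions in this
# sketch: \x7f stands for DEL, and . and / are excluded as well.
my $base = qr{ [^\x00-\x20\x7f~^:?*\[\\./\@] | \@ (?! \{ ) }x;

# Hypothetical helper: true if $str starts with one allowed character.
sub starts_with_base {
    my ($str) = @_;
    return $str =~ /\A $base/x ? 1 : 0;
}

# Non-ASCII samples are written as escapes so the script stays pure ASCII.
my @samples = (
    [ 'U+00E9', "\x{e9}" ],
    [ 'U+4E2D', "\x{4e2d}" ],
    [ 'v',      'v' ],
    [ '@',      '@' ],
    [ '@{',     '@{' ],
    [ 'DEL',    "\x7f" ],
    [ '~',      '~' ],
    [ '.',      '.' ],
);

for my $sample (@samples) {
    my ($label, $str) = @$sample;
    printf "%-7s %s\n", $label, starts_with_base($str) ? 'allowed' : 'rejected';
}
```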

my $part = qr{ $base+ (?: \. $base+ )* \.? (?<! \.lock ) }x;

This matches a single slash-separated part. It implements the restrictions that a part cannot start with ., contain .., or end with .lock.
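The part pattern can be exercised on its own by anchoring it with \A and \z. A minimal sketch, using the whitelist $base and a hypothetical helper named is_valid_part:

```perl
use strict;
use warnings;

my $base = qr{ [!"#\$%&'()+,\-0-9;<=>\@A-Z\]_`a-z{|}] }x;
my $part = qr{ $base+ (?: \. $base+ )* \.? (?<! \.lock ) }x;

# Hypothetical helper: true if $str is one valid slash-free component.
sub is_valid_part {
    my ($str) = @_;
    return $str =~ /\A $part \z/x ? 1 : 0;
}

for my $str ('v1.0', 'foo.', '.hidden', 'a..b', 'foo.lock', 'a/b', '') {
    printf "'%s' %s\n", $str, is_valid_part($str) ? 'valid' : 'invalid';
}
```

A single part is allowed to end with a dot (foo. is accepted here); the rule about a trailing dot applies only to the ref as a whole.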

my $full_ref = qr{\A (?! \@ \z | .* \@\{ ) $part (?: / $part )+ (?<! \. ) \z}sx;

This matches a full ref. It adds a few additional restrictions:

  • The whole thing cannot be @. (This rule is technically redundant because we always require a /, but I included it anyway.)

  • @{ cannot occur anywhere. Instead of a separate look-ahead check we could also have modified $base thus:

    my $base = qr{ [!"#\$%&'()+,\-0-9;<=>A-Z\]_`a-z{|}] | \@ (?! \{ ) }x;
    
  • There must be at least two parts, separated by /.

  • The whole thing cannot end with a dot (.).
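Putting the three patterns together, a small script can check candidate refs. This is a sketch rather than a vetted validator. The helper names is_valid_ref and is_valid_tag are hypothetical, and checking a bare tag name by prefixing refs/tags/ is an assumption about how the pattern would be applied to tags.

```perl
use strict;
use warnings;

my $base     = qr{ [!"#\$%&'()+,\-0-9;<=>\@A-Z\]_`a-z{|}] }x;
my $part     = qr{ $base+ (?: \. $base+ )* \.? (?<! \.lock ) }x;
my $full_ref = qr{\A (?! \@ \z | .* \@\{ ) $part (?: / $part )+ (?<! \. ) \z}sx;

# Hypothetical helper: true if $ref is a well-formed full ref name.
sub is_valid_ref {
    my ($ref) = @_;
    return $ref =~ $full_ref ? 1 : 0;
}

# Hypothetical helper: check a bare tag name by prefixing refs/tags/,
# since the full pattern requires at least one slash.
sub is_valid_tag {
    my ($tag) = @_;
    return is_valid_ref("refs/tags/$tag");
}

my @refs = (
    'refs/tags/v1.0',
    'refs/tags/release/2024',
    'v1.0',                       # only one part
    'refs/tags/v1.0.',            # ends with a dot
    'refs/tags/a..b',             # contains ..
    'refs/tags/x.lock',           # part ends with .lock
    'refs/heads/foo.lock/bar',    # inner part ends with .lock
    'refs/tags/a@{b',             # contains @{
    'refs//tags',                 # empty part
    '@',
);

for my $ref (@refs) {
    printf "%-26s %s\n", $ref, is_valid_ref($ref) ? 'valid' : 'invalid';
}
```

Because the pattern requires at least two parts, a bare name such as v1.0 is rejected by is_valid_ref but accepted by is_valid_tag. The results can be cross-checked against git check-ref-format for the same inputs.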

Translation to e.g. C# is left as an exercise for the reader. :-)

Upvotes: 2
