vlad_tepesch

Reputation: 6891

subdivide version number with regular expressions

I have a regular expression that almost fits my needs.

I want to split some version numbers into different parts. The expected outcome is:

                      baseBranch           curBranch           curRevision
1.36.1.5                1.36                 1.36.1.             5
1.31                    <empty>              1.                  31
1.14.2.21.1.16.1.13     1.14.2.21.1.16       1.14.2.21.1.16.1.   13
1.31.34                 <no match (illegal number - always have to be pairs)>
1.31.34.2.4             <no match (illegal number - always have to be pairs)>
1.31.34.2.4.4.5         <no match (illegal number - always have to be pairs)>

The expression:

/^(?<curBranch>(?<baseBranch>(?:\d+\.\d+\.)*?)?\d+\.)(?<curRevision>\d+)$/

The expression almost does what it should, but I could not get rid of the trailing dot in the baseBranch field (the dot at the end of curBranch is intended).

Currently the output looks as follows:

                      baseBranch           curBranch           curRevision
1.36.1.5                1.36.                1.36.1.             5
1.31                    <empty>              1.                  31
1.14.2.21.1.16.1.13     1.14.2.21.1.16.      1.14.2.21.1.16.1.   13
1.31.34                 <no match (illegal number - always have to be pairs)>
1.31.34.2.4             <no match (illegal number - always have to be pairs)>
1.31.34.2.4.4.5         <no match (illegal number - always have to be pairs)>

A link for online testing: https://regex101.com/r/0aU07q/3

Note: the negative cases are nice to have; they should not appear in the data.

Upvotes: 0

Views: 77

Answers (2)

Pushpesh Kumar Rajwanshi

Reputation: 18357

You need a small change in how you group your captures. Move the dot out of the baseBranch group and change that group to this regex:

(?<baseBranch>\d+(?:\.\d+)+)

which captures the first number and then requires one or more repetitions of \.\d+, so baseBranch always contains at least two numbers.

Your overall regex after the modification becomes:

^(?<curBranch>(?:(?<baseBranch>\d+(?:\.\d+)+)\.)?\d+\.)(?<curRevision>\d+)$
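To try the pattern outside regex101, here is a minimal Python sketch. It assumes Python's `re` module, where named groups are written `(?P<name>...)` rather than `(?<name>...)`; the helper name `split_version` is made up for illustration and is not part of the question.

```python
import re

# The pattern above, with named groups rewritten in Python's (?P<name>...) form.
PATTERN = re.compile(
    r"^(?P<curBranch>(?:(?P<baseBranch>\d+(?:\.\d+)+)\.)?\d+\.)(?P<curRevision>\d+)$"
)


def split_version(version):
    """Return (baseBranch, curBranch, curRevision), or None if there is no match."""
    m = PATTERN.match(version)
    if m is None:
        return None
    # An optional group that did not take part in the match is None;
    # map it to "" to represent the <empty> case from the question.
    return (m.group("baseBranch") or "", m.group("curBranch"), m.group("curRevision"))


for v in ("1.36.1.5", "1.31", "1.14.2.21.1.16.1.13", "1.31.34", "1.31.34.2.4"):
    print(v, split_version(v))
```

The trailing dot is gone from baseBranch, and 1.31.34 is rejected. The last input, 1.31.34.2.4, is still accepted by this version (with baseBranch 1.31.34), although the question lists it as illegal.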

Demo

Let me know if this is what you needed. And if yes, then let me know if you want me to add explanation to this regex further.

Edit: for the sharpened negative test cases

You can use this regex for your new cases. It matches only if the baseBranch group contains an even number (0, 2, 4, 6 and so on) of dot-separated numbers:

^(?<curBranch>(?:(?<baseBranch>\d+\.\d+(?:(?:\.\d+){2})*)\.)?\d+\.)(?<curRevision>\d+)$

This regex matches only if the whole input has an even number of dot-separated numbers. Hence these match:

1.36.1.5
1.31
1.14.2.21.1.16.1.13
1.31.34.2.4.4.5.6

But these do not match, because they contain an odd number of numbers:

1.31.34
1.31.34.2.4
1.31.34.2.4.4.5

Demo for updated and better regex
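The same check can be run locally. The sketch below is a Python version under two assumptions: the named groups are rewritten to Python's `(?P<name>...)` syntax, and the helper `split_paired` is a hypothetical name introduced only for this example.

```python
import re

# Pair-enforcing pattern: baseBranch is two numbers plus any number of
# further pairs, so the whole version has an even count of numbers.
PAIRED = re.compile(
    r"^(?P<curBranch>(?:(?P<baseBranch>\d+\.\d+(?:(?:\.\d+){2})*)\.)?\d+\.)"
    r"(?P<curRevision>\d+)$"
)


def split_paired(version):
    """Return (baseBranch, curBranch, curRevision), or None for illegal input."""
    m = PAIRED.match(version)
    if m is None:
        return None
    return (m.group("baseBranch") or "", m.group("curBranch"), m.group("curRevision"))


should_match = ["1.36.1.5", "1.31", "1.14.2.21.1.16.1.13", "1.31.34.2.4.4.5.6"]
should_fail = ["1.31.34", "1.31.34.2.4", "1.31.34.2.4.4.5"]

for v in should_match + should_fail:
    print(v, split_paired(v))
```

Every entry in `should_match` yields a tuple and every entry in `should_fail` yields `None`, which agrees with the two lists above.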

Upvotes: 1

Orange

Reputation: 7

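You can split the string on the dot and join the parts again.

# split the version string into its numeric components
my @v = split /\./, $versionstr;

# fewer than four components: there is no baseBranch to extract
return if (scalar @v) < 4;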

more...
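For comparison, the split-and-join idea can be sketched without any regex. This is only one possible reading of the approach, written in Python: the function name `split_version_by_dots` is hypothetical, and it applies the "always pairs" rule from the question instead of the four-component check above, so that 1.31 is still accepted with an empty baseBranch.

```python
def split_version_by_dots(version):
    """Return (baseBranch, curBranch, curRevision), or None for illegal input."""
    parts = version.split(".")
    # Every component must be a non-empty run of ASCII digits.
    if not all(p.isascii() and p.isdigit() for p in parts):
        return None
    # Components always come in pairs, so an odd count is illegal.
    if len(parts) % 2 != 0:
        return None
    base_branch = ".".join(parts[:-2])       # empty string for "1.31"
    cur_branch = ".".join(parts[:-1]) + "."  # trailing dot is intended
    return base_branch, cur_branch, parts[-1]


for v in ("1.36.1.5", "1.31", "1.14.2.21.1.16.1.13", "1.31.34"):
    print(v, split_version_by_dots(v))
```

Slicing keeps the three fields explicit, at the cost of validating the digits by hand.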

Upvotes: 1
