jomsk1e

Reputation: 3625

Regex to match Table of contents pattern

Considering

NN = number/digit
x = any single letter

I want to match these patterns:

1. NN
2. NNx
3. NN.NN
4. NN.NNx
5. NN.NN.NN
6. NN.NN.NNx

Examples that need to match:

1. 20
2. 20a
3. 20.20
4. 20.20a
5. 20.20.20
6. 20.20.20a

Right now I am trying to use this regex:

\b\d+\.?\d+\.?\d+?[a-z]?\b

But it fails.

Any help would be greatly appreciated, thanks! XD

EDIT:

I am matching this:

<fn:footnote fr="10.23.20a">    (Just a sample)

I already have a regex that extracts the '10.23.20a'.

Next I want to check whether this value is valid. Only the six patterns above should be accepted.

These examples are invalid:

1. 20.a
2. 20a.20.20
3. etc.

Many thanks for your help men! :D

Upvotes: 3

Views: 1481

Answers (2)

prageeth

Reputation: 7395

Try this

^\d+(?:(?:\.\d+)*[a-z]?)$

Upvotes: 0

Martin Ender

Reputation: 44259

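Your pattern contains three digit quantifiers (\d+, \d+ and \d+?), and each of them needs at least one digit, so the pattern requires at least three digits overall. That is why 20 and 20a fail. Try grouping the digits with their periods: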

^\d+(?:[.]\d+){0,2}[a-z]?$

The ?: is just an optimization (and a good practice) that suppresses capturing. [.] and \. are completely interchangeable, but I prefer the readability of the former. Choose whatever you like best.
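As a quick check of the pattern above against the examples from the question, here is a minimal sketch in Python rather than .NET (this pattern uses no engine-specific syntax, so it should behave the same way). The names TOC_NUMBER, ORIGINAL and is_valid are made up for illustration.

```python
import re

# The pattern from this answer, unchanged. fullmatch() already anchors
# both ends, so the ^ and $ are redundant here but harmless.
TOC_NUMBER = re.compile(r"^\d+(?:[.]\d+){0,2}[a-z]?$")

# The asker's original attempt, kept only for comparison.
ORIGINAL = re.compile(r"\b\d+\.?\d+\.?\d+?[a-z]?\b")

def is_valid(value: str) -> bool:
    """True if value is one to three dot-separated numbers plus an optional letter."""
    return TOC_NUMBER.fullmatch(value) is not None

valid = ["20", "20a", "20.20", "20.20a", "20.20.20", "20.20.20a"]
invalid = ["20.a", "20a.20.20"]

for v in valid + invalid:
    print(v, is_valid(v))

# The original pattern needs three digits in total, so "20" cannot match.
print(ORIGINAL.fullmatch("20"))
```

The {0,2} is what limits the value to at most three numbers. With * in its place, a value such as 20.20.20.20 would be accepted as well.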

If you actually want to capture the numbers and the letter, there are two options:

^(?<first>\d+)(?:[.](?<second>\d+))?(?:[.](?<third>\d+))?(?<letter>[a-z])?$

Note that the important point is to group a period and the digits together and make them optional together. You could as well use unnamed groups, it doesn't really matter. However, if you use my version, you can now access the parts through (for instance)

match.Groups["first"].Value

where match is a Match object returned by Regex.Match or Regex.Matches for example.
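For readers outside .NET, the same idea can be sketched in Python. Note the assumption: Python's re module spells named groups (?P<name>...) instead of (?<name>...), so the pattern below is a transliteration, not the exact string above. PARTS and split_parts are hypothetical names.

```python
import re

# Same structure as the named-group pattern above, using Python's
# (?P<name>...) syntax in place of .NET's (?<name>...).
PARTS = re.compile(
    r"^(?P<first>\d+)"
    r"(?:[.](?P<second>\d+))?"
    r"(?:[.](?P<third>\d+))?"
    r"(?P<letter>[a-z])?$"
)

def split_parts(value: str):
    """Return a dict of the named parts, or None if value is not valid."""
    m = PARTS.fullmatch(value)
    return m.groupdict() if m else None

print(split_parts("10.23.20a"))
print(split_parts("20a"))
print(split_parts("20.a"))
```

Groups that did not take part in the match come back as None in Python, whereas in .NET the Value of an unmatched group is an empty string.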

Alternatively, you can use .NET's feature of capturing multiple values with one group:

^(?<d>\d+)(?:[.](?<d>\d+)){0,2}(?<letter>[a-z])?$

Now match.Groups["d"].Captures will contain a list of all captured numbers (1 to 3). And match.Groups["letter"].Value will still contain the letter if it was there.
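The Captures collection is specific to .NET. Python's re module keeps only the last match of a repeated group, so a rough equivalent (a different technique, not a port) is to validate first and then pull the numbers out separately. VALID and numbers_and_letter are hypothetical names.

```python
import re

# Validation only; the numbers are collected in a second step because
# Python does not expose every capture of a repeated group.
VALID = re.compile(r"\d+(?:[.]\d+){0,2}(?P<letter>[a-z])?")

def numbers_and_letter(value: str):
    """Return (list of number strings, letter or None), or None if invalid."""
    m = VALID.fullmatch(value)
    if m is None:
        return None
    # Safe after validation: the only digit runs are the 1 to 3 numbers.
    numbers = re.findall(r"\d+", value)
    return numbers, m.group("letter")

print(numbers_and_letter("10.23.20a"))
print(numbers_and_letter("20"))
print(numbers_and_letter("20a.20.20"))
```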

Upvotes: 1
