StackExchangeGuy

Reputation: 789

Isolating groups in regex match

I am writing a regular expression to return the version number of OpenSSH installed on a Windows machine, for our monitoring system. The input is one of two strings:

version=OpenSSH_for_Windows_7.7p1, LibreSSL 2.6.4
version=OpenSSH_7.1p1 Microsoft_Win32_port_with_VS Dec 22 2015, OpenSSL 1.0.2d 9 Jul 2015

When the regex is:

\S+Windows_(\d.\d)

Then 7.7 is in group 1 and the monitoring system sees it. But when I try to cover the 7.1 string, the grouping gets messed up.

(\S+Windows_(\d.\d)|\S+OpenSSH_(\d.\d))

How can I modify that string to isolate group 3 and group 1 (for 7.1 and 7.7 respectively)?

Thanks.

Upvotes: 3

Views: 282

Answers (4)

andi475

Reputation: 69

(?:(\S+Windows_)|(\S+OpenSSH_))(\d+\.\d+)

You can split up the groups like this, so the version is always in the same group (group 3). The (?:...) is a non-capturing group. https://regex101.com/r/ZgtiYo/3
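To check the group numbering, here is a minimal sketch in Python using the standard `re` module. Python and the variable names are assumptions for illustration; the monitoring system may use a different regex engine. It runs the pattern above against both strings from the question:

```python
import re

# The two sample strings from the question.
samples = [
    "version=OpenSSH_for_Windows_7.7p1, LibreSSL 2.6.4",
    "version=OpenSSH_7.1p1 Microsoft_Win32_port_with_VS Dec 22 2015, OpenSSL 1.0.2d 9 Jul 2015",
]

# Groups 1 and 2 capture the prefix (only one of them participates per match);
# group 3 captures the version number.
pattern = re.compile(r"(?:(\S+Windows_)|(\S+OpenSSH_))(\d+\.\d+)")

versions = [pattern.search(s).group(3) for s in samples]
print(versions)
```

Whichever prefix matches, the version is read from group 3.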

Upvotes: 0

Aankhen

Reputation: 2272

Use a non-capturing group:

(?:\S+Windows_(\d\.\d)|\S+OpenSSH_(\d\.\d))
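As a rough check of what this pattern captures, here is a small Python sketch with the standard `re` module (Python and the variable names are assumptions, not something from the question). Note that the version ends up in group 1 for the Windows string and in group 2 for the other one, so the caller still has to pick whichever group took part in the match:

```python
import re

# The two sample strings from the question.
samples = [
    "version=OpenSSH_for_Windows_7.7p1, LibreSSL 2.6.4",
    "version=OpenSSH_7.1p1 Microsoft_Win32_port_with_VS Dec 22 2015, OpenSSL 1.0.2d 9 Jul 2015",
]

# The outer group is non-capturing, so only the two inner groups are numbered.
pattern = re.compile(r"(?:\S+Windows_(\d\.\d)|\S+OpenSSH_(\d\.\d))")

captured = []
versions = []
for s in samples:
    m = pattern.search(s)
    captured.append(m.groups())
    # The group from the alternative that did not match is None.
    versions.append(m.group(1) or m.group(2))

print(captured)
print(versions)
```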

Try it out.

Upvotes: 0

dawg

Reputation: 103844

You might consider changing the regex entirely so you only have one capture group.

Both version numbers you are trying to capture follow version=OpenSSH_, with some optional non-digit characters in between.

Therefore you can do:

version=OpenSSH_\D*(\d\.\d)

This captures the correct version in either case. The advantage is that you do not need to know which match group to use: the result is always in group 1.
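A minimal Python sketch of this single-group approach, assuming the standard `re` module (the language and the variable names are illustrative only):

```python
import re

# The two sample strings from the question.
samples = [
    "version=OpenSSH_for_Windows_7.7p1, LibreSSL 2.6.4",
    "version=OpenSSH_7.1p1 Microsoft_Win32_port_with_VS Dec 22 2015, OpenSSL 1.0.2d 9 Jul 2015",
]

# \D* skips any non-digit text (such as "for_Windows_") between the fixed
# prefix and the version number; it matches the empty string when the
# version follows the prefix directly.
pattern = re.compile(r"version=OpenSSH_\D*(\d\.\d)")

versions = [pattern.search(s).group(1) for s in samples]
print(versions)
```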

Demo

If you want to keep the alternation form that you have, it can be refactored a bit as well to have a single capture group:

(?:Windows_|\S+OpenSSH_)(\d\.\d)

Demo

Just know that this form involves much more backtracking and may be around 10x less efficient than the first form.

Upvotes: 2

Marco Luzzara

Reputation: 6036

As you can see, this problem has several solutions. The interesting thing about the regex you tried is that it creates more capturing groups than needed inside the alternation. There is a specific construct you can use to tackle this (if your regex engine supports it): branch reset groups.

Essentially, capturing groups inside a branch reset group are shared between all the alternatives: numbering restarts from the same group number in each alternative, so the groups are reused.

This is the new regex (the version is in group 2 in both cases, because the outer parentheses form group 1):

((?|\S+OpenSSH_(\d\.\d)|\S+Windows_(\d\.\d)))
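Since support varies between engines, it can help to probe for it before relying on the syntax. The sketch below is a hypothetical helper (the name `supports_branch_reset` is made up for illustration) that checks whether Python's standard `re` module accepts `(?|...)`. It does not, so in Python one of the single-group patterns from the other answers is the safer choice; PCRE-style engines do support branch reset.

```python
import re


def supports_branch_reset():
    """Return True if the standard `re` module accepts a branch reset group."""
    try:
        re.compile(r"(?|a(\d)|b(\d))")
    except re.error:
        # The engine rejected the (?| ... ) syntax as an unknown extension.
        return False
    return True


print(supports_branch_reset())
```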

Upvotes: 0
