Chris

Reputation: 845

Regex: Find a string within different string variations

I need a regex that extracts the 81.03 part (the value varies, but always has the structure XX.XX) from the following string variations:

Projects/75100/75120/75124/AR1/75124_AR1_HM2_81.03-testing-b405.tgz

Projects/75100/75130/75138/LM1/75138_LM1_HM2_81.03.tgz

I've come up with:

var regex = new Regex("(.*_)(.*?)-");

but this only matches the first example string, whereas

var regex = new Regex(@"(.*_)(.*?)(.*\.)");

only matches the second string.

The path to the file changes constantly, as does the "-testing..." suffix.

Any ideas to point me in the right direction?

Upvotes: 2

Views: 72

Answers (2)

The fourth bird

Reputation: 163217

To match the value 81.03, another option is to match digits with an optional decimal part that directly follow an underscore in the last path segment, i.e. with no forward slash between the match and the end of the string.

_(\d+(?:\.\d+)?)[^/\r\n]*$

Explanation

  • _ Match a literal underscore
  • (\d+(?:\.\d+)?) Capture group 1, match 1+ digits with an optional decimal part
  • [^/\r\n]* Match 0+ chars except / or a newline
  • $ End of string
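
As a minimal sketch of how this pattern might be used from C#, the snippet below runs it against the two sample paths from the question. The class name `LastSegmentNumberDemo` and the helper `Extract` are hypothetical names chosen for illustration, not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LastSegmentNumberDemo
{
    // Pattern from the answer above: an underscore, then digits with an optional
    // decimal part (group 1), then any characters except '/' or line breaks up to the end.
    private static readonly Regex Pattern = new Regex(@"_(\d+(?:\.\d+)?)[^/\r\n]*$");

    // Hypothetical helper: returns the captured number, or null when nothing matches.
    public static string Extract(string path)
    {
        Match m = Pattern.Match(path);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        string[] samples =
        {
            "Projects/75100/75120/75124/AR1/75124_AR1_HM2_81.03-testing-b405.tgz",
            "Projects/75100/75130/75138/LM1/75138_LM1_HM2_81.03.tgz",
        };

        foreach (string s in samples)
        {
            Console.WriteLine(Extract(s)); // prints 81.03 for both samples
        }
    }
}
```

Note that the underscore must be immediately followed by a digit, so `_AR1` and `_HM2` are skipped. A file name with an earlier underscore-digit pair (for example `_1AR`) would match there instead.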

Upvotes: 0

Wiktor Stribiżew

Reputation: 626748

You can use

var result = Regex.Match(text, @".*_(\d+\.\d+)")?.Groups[1].Value;

Or, if the string can have more dot+number parts:

var result = Regex.Match(text, @".*_(\d+(?:\.\d+)+)")?.Groups[1].Value;

In general, the regex will extract dot-separated digit chunks after the last _.

Details

  • .* - any 0 or more chars other than a newline, as many as possible
  • _ - a _ char
  • (\d+(?:\.\d+)+) - Group 1 (second pattern): one or more digits followed by one or more occurrences of a dot followed by one or more digits
  • (\d+\.\d+) - Group 1 (first pattern): one or more digits, a dot, and one or more digits.
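
To make the difference between the two patterns concrete, here is a small sketch that applies both to the sample paths from the question and to one made-up path with an extra dot+number part. The class name `AfterLastUnderscoreDemo`, the helper `Extract`, and the `81.03.7` input are assumptions for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AfterLastUnderscoreDemo
{
    // First pattern: exactly one "digits.digits" chunk after the last matching underscore.
    private static readonly Regex Single = new Regex(@".*_(\d+\.\d+)");

    // Second pattern: digits followed by one or more ".digits" parts.
    private static readonly Regex Multi = new Regex(@".*_(\d+(?:\.\d+)+)");

    // Hypothetical helper: returns group 1, or an empty string when nothing matches.
    public static string Extract(Regex pattern, string text)
    {
        Match m = pattern.Match(text);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        string first = "Projects/75100/75120/75124/AR1/75124_AR1_HM2_81.03-testing-b405.tgz";
        string second = "Projects/75100/75130/75138/LM1/75138_LM1_HM2_81.03.tgz";
        // Made-up input with an additional dot+number part.
        string extra = "Projects/demo/75138_LM1_HM2_81.03.7.tgz";

        Console.WriteLine(Extract(Single, first));  // 81.03
        Console.WriteLine(Extract(Single, second)); // 81.03
        Console.WriteLine(Extract(Single, extra));  // 81.03
        Console.WriteLine(Extract(Multi, extra));   // 81.03.7
    }
}
```

`Regex.Match` returns a non-null `Match` even on failure, so the `?.` in the one-liners above is harmless but not required. On a failed match, `Groups[1].Value` is an empty string.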

Upvotes: 2
