GURKE

Reputation: 153

Regex more lazy than

I have the following Regex:

(?<day>\d+). Tag, (?<way>.+)?( \((?<length>\d+?.?\d?)km\))?

And I want to match these three possibilities:

1. Tag, Berlin -> London (500.3km)
2. Tag, London -> Stockholm (183km)
3. Tag, Stockholm (day of rest)

The problem: it doesn't match the length anymore. If I remove the question marks, like this:

(?<day>\d+). Tag, (?<way>.+)( \((?<length>\d+?.?\d?)km\))

It matches the first and second lines but not the third. I thought I could solve the problem by adding the question mark at the end, but then the last expression becomes lazy. So I added another question mark to the way expression, but it doesn't become lazier than the last one, so way ends up matching the whole length as well!

So, is it possible to define different levels of laziness? And if that doesn't exist, how should I change the pattern to match correctly?

Julian

Upvotes: 1

Views: 34

Answers (1)

Wiktor Stribiżew

Reputation: 626936

Here is a way to match all the expected elements in your input:

(?<day>\d+)\.\s+Tag,\s+(?<way>(?:[^()]|\((?!\d+(?:\.\d+)?km)[^()]*\))*?)(?:$|\s*(?<length>\(\d+(?:\.\d+)?km\)))

See demo

The way group matches text that either contains no parentheses, or contains parenthesized parts that do not start with an integer or decimal number followed by km. The length group is matched only if it is present. Also note that a literal dot must be escaped (\.).
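
To check the pattern against the three sample lines, here is a minimal sketch in Python. The question does not say which regex engine is used, so Python is an assumption; the only change to the pattern is that named groups are written as (?P<name>...), which is what Python's re module requires. The helper name parse_line is made up for this illustration.

```python
import re

# The pattern from the answer, split over three lines for readability.
# Assumption: Python's re module, so (?<name>...) becomes (?P<name>...).
PATTERN = re.compile(
    r"(?P<day>\d+)\.\s+Tag,\s+"
    # way: lazily consume non-parenthesis characters, or a parenthesized
    # part that does not start with a number followed by "km"
    r"(?P<way>(?:[^()]|\((?!\d+(?:\.\d+)?km)[^()]*\))*?)"
    # then either the end of the string or an optional "(<number>km)" part
    r"(?:$|\s*(?P<length>\(\d+(?:\.\d+)?km\)))"
)


def parse_line(line):
    """Return (day, way, length) for one line, or None if it does not match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    return m.group("day"), m.group("way"), m.group("length")


if __name__ == "__main__":
    samples = [
        "1. Tag, Berlin -> London (500.3km)",
        "2. Tag, London -> Stockholm (183km)",
        "3. Tag, Stockholm (day of rest)",
    ]
    for sample in samples:
        print(parse_line(sample))
```

Note that the length group wraps the parentheses, so it captures "(500.3km)" rather than "500.3"; for the third line it is unset (None in Python) and "(day of rest)" stays part of way.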

Upvotes: 1
