172d042d

Reputation: 733

Regular expression that matches values after meeting some keyword

I have a path whose elements are separated by single spaces, but not by \s/\s (i.e. space, slash, space), which joins the two halves of one element:

val1 / val2 val4 / val7 keyword / somevalue aaa / bbb ccc / ddd eee / fff

When I find the keyword with somevalue:

(keyword / [^/\s]*)

The only acceptable values after the above match are aaa / bbb and ccc / ddd, in any order and with any number of repetitions.

For example I should get a match for

  1. val1 / val2 val4 / val7 keyword / somevalue aaa / bbb ccc / ddd
  2. val1 / val2 val4 / val7 keyword / somevalue aaa / bbb
  3. val1 / val2 val4 / val7 keyword / somevalue ccc / ddd
  4. val1 / val2 val4 / val7 keyword / somevalue ccc / ddd aaa / bbb
  5. val1 / val2 val4 / val7 keyword / somevalue ccc / ddd aaa / bbb ccc / ddd

Any other combination should not match, for example when there is some extra 'element' after keyword / somevalue:

  1. val1 / val2 val4 / val7 keyword / somevalue aaa / bbb ccc / ddd eee / fff
  2. val1 / val2 val4 / val7 keyword / somevalue eee / fff ccc / ddd
  3. val1 / val2 val4 / val7 keyword / somevalue aaa / bbb zzz / yyy ccc / ddd

(...)

I should get no match.

Is it possible to achieve this with a regular expression? I am trying to solve it with a regex, but I am stuck.

Upvotes: 1

Views: 76

Answers (1)

Nikolas

Reputation: 44398

Let's try the following Regex:

keyword \/ \w+ ((?:aaa \/ bbb|ccc \/ ddd)(?: |$))+$

Where:

  • keyword is your fixed keyword
  • aaa, bbb, ccc and ddd are the ones to be matched literally
  • aaa \/ bbb is the first allowed couple and ccc \/ ddd is the second one
  • (?: |$) checks whether a couple is followed by a space or by the end of the line $.
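
To check the pattern against the examples from the question, here is a minimal sketch in Python. Python is my assumption, since the question does not name a language, and the helper name is_valid is made up for illustration. The pattern is copied unchanged; the \/ escapes are harmless in Python, where / needs no escaping.

```python
import re

# The pattern from the answer, unchanged. re.search is used because the
# pattern is anchored only at the end ($), not at the start of the string.
PATTERN = re.compile(r"keyword \/ \w+ ((?:aaa \/ bbb|ccc \/ ddd)(?: |$))+$")


def is_valid(path: str) -> bool:
    """Return True if only allowed couples follow 'keyword / <value>'."""
    return PATTERN.search(path) is not None


if __name__ == "__main__":
    samples = [
        "val1 / val2 val4 / val7 keyword / somevalue aaa / bbb ccc / ddd",
        "val1 / val2 val4 / val7 keyword / somevalue eee / fff ccc / ddd",
    ]
    for sample in samples:
        print(is_valid(sample), sample)
```

Note that the pattern requires at least one allowed couple after keyword / somevalue, so a path that ends right after somevalue does not match.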

I have to stress that it's highly recommended to use a programming language for the extraction: split the string and examine the parts.
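
A rough sketch of that split-and-examine approach, again assuming Python. The names ALLOWED, split_pairs and is_valid_by_split are hypothetical. I also assume that elements are separated by exactly one space and that at least one allowed couple must follow the keyword, as in the question's examples.

```python
import re

# Couples that may follow "keyword / <value>" (taken from the question).
ALLOWED = {"aaa / bbb", "ccc / ddd"}


def split_pairs(path: str) -> list[str]:
    """Split on spaces that are not next to a slash.

    'a / b c / d' becomes ['a / b', 'c / d'].
    """
    return re.split(r"(?<!/) (?!/)", path.strip())


def is_valid_by_split(path: str) -> bool:
    """Return True if every couple after the keyword couple is allowed."""
    pairs = split_pairs(path)
    for index, pair in enumerate(pairs):
        if pair.startswith("keyword / "):
            tail = pairs[index + 1:]
            # Assumption: at least one allowed couple must follow.
            return bool(tail) and all(item in ALLOWED for item in tail)
    return False  # the keyword couple was not found at all


if __name__ == "__main__":
    line = "val1 / val2 val4 / val7 keyword / somevalue ccc / ddd aaa / bbb"
    print(split_pairs(line))
    print(is_valid_by_split(line))
```

Adding another allowed couple then means adding one string to the set instead of editing the regex.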

Upvotes: 2
