Anand Rockzz

Reputation: 6658

Match after Prefix, when the Ending Delimiter Varies

These are some of my test cases:

{APIDETAILS=FOO, BAR, SING, RUN, OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING}
{APIDETAILS=FOO, OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING}
{APIDETAILS=FOO, O.P.OP3/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING}
{APIDETAILS=FOO, OP.PO.OP4/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING}
{OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING, APIDETAILS=FOO}
{OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING, APIDETAILS=FOO, SING, BAR}
{OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING, APIDETAILS=FOO, BAR, SING

Note: '}' is intentionally missing in the last line.

What I want to match: everything that follows APIDETAILS=, but only up to the end of the APIDETAILS value. The end is not clearly defined (see the test cases above for the different scenarios).

The Regex I came up with:

(?:APIDETAILS=)(.*?)(?:}|\/|$)

What I'm able to match:

  1. FOO, BAR, SING, RUN, OP1
  2. FOO, OP1
  3. FOO, O.P.OP3
  4. FOO, OP.PO.OP4
  5. FOO
  6. FOO, SING, BAR
  7. FOO, BAR, SING

Question: How do I get rid of the noise in matches 1, 2, 3 and 4 above and end up with only the following?

What I need to match:

  1. FOO, BAR, SING, RUN
  2. FOO
  3. FOO
  4. FOO
  5. FOO
  6. FOO, SING, BAR
  7. FOO, BAR, SING

Upvotes: 1

Views: 50

Answers (2)

hwnd

Reputation: 70732

Use a positive lookahead that stops at }, at the start of the next KEY= pair, or at the end of the string:

APIDETAILS=(.*?)(?=}|,\s*\S+=|$)

Or simply add the same alternative to your non-capturing group (the value is still in capture group 1):

APIDETAILS=(.*?)(?:}|,\s*\S+=|$)
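As a rough check of the lookahead version, here is a minimal Python sketch that runs it against the seven test cases from the question. The choice of Python, the name extract_api_details and the line-by-line processing are assumptions for illustration; the question does not say which regex engine is in use.

```python
import re

# Test cases copied from the question; the last one intentionally lacks '}'.
TEST_CASES = [
    "{APIDETAILS=FOO, BAR, SING, RUN, OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING}",
    "{APIDETAILS=FOO, OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING}",
    "{APIDETAILS=FOO, O.P.OP3/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING}",
    "{APIDETAILS=FOO, OP.PO.OP4/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING}",
    "{OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING, APIDETAILS=FOO}",
    "{OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING, APIDETAILS=FOO, SING, BAR}",
    "{OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING, APIDETAILS=FOO, BAR, SING",
]

# Capture lazily after the prefix. The lookahead stops the capture at '}',
# at the next ", KEY=" pair, or at the end of the string, consuming none of them.
PATTERN = re.compile(r"APIDETAILS=(.*?)(?=}|,\s*\S+=|$)")


def extract_api_details(line):
    """Return the APIDETAILS value, or None if the prefix is absent."""
    match = PATTERN.search(line)
    return match.group(1) if match else None


results = [extract_api_details(line) for line in TEST_CASES]
for number, value in enumerate(results, start=1):
    print(f"{number}. {value}")
```

Each line is searched separately here, so $ only ever means the end of that line and no multi-line flag is needed.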

Upvotes: 2

zx81

Reputation: 41838

Use this:

(?m)(?<=APIDETAILS=).*?(?=,\s*\S+=|}|$)

  • (?m) turns on multi-line mode, allowing ^ and $ to match on each line
  • The lookbehind (?<=APIDETAILS=) asserts that what precedes is APIDETAILS=
  • .*? lazily matches characters up to...
  • The first place where the lookahead (?=,\s*\S+=|}|$) can assert that what follows is a comma followed by optional whitespace, non-space chars and = (the start of the next KEY= pair), OR the } character, OR the end of the line $
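To see the lookbehind version work on the whole input at once, here is a small Python sketch. It assumes the test cases arrive as one newline-separated string, and Python itself is an assumption, since the thread names no language. Python's re module only accepts fixed-width lookbehinds, which (?<=APIDETAILS=) is.

```python
import re

# The seven test cases from the question joined into one multi-line string;
# the last line intentionally lacks the closing '}'.
TEXT = "\n".join([
    "{APIDETAILS=FOO, BAR, SING, RUN, OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING}",
    "{APIDETAILS=FOO, OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING}",
    "{APIDETAILS=FOO, O.P.OP3/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING}",
    "{APIDETAILS=FOO, OP.PO.OP4/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING}",
    "{OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING, APIDETAILS=FOO}",
    "{OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING, APIDETAILS=FOO, SING, BAR}",
    "{OP1/OPSUB1/RESULT=SOMETHING, OP1/OPSUB2/RESULT=SOMETHING, OP2/OPSUB1/RESULT=SOMETHING, APIDETAILS=FOO, BAR, SING",
])

# (?m) lets $ match at the end of every line, not only at the end of the text.
# The lookbehind and lookahead consume nothing, so the whole match is the value.
PATTERN = re.compile(r"(?m)(?<=APIDETAILS=).*?(?=,\s*\S+=|}|$)")

# With no capture groups in the pattern, findall returns the full matches.
matches = PATTERN.findall(TEXT)
for number, value in enumerate(matches, start=1):
    print(f"{number}. {value}")
```

Because the prefix and the delimiters sit inside lookarounds, no capture group is needed: each match is exactly the APIDETAILS value.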

Upvotes: 2