byron

Reputation: 11

Regex: extract the strings between multiple possible patterns

I need to extract multiple matches, each delimited by one of several possible patterns (keys).

This is an example of the raw data:

DC00-01-14 blabla blabla MB00-07-10 blublu CN03 bli BLI2454 bli bli CN02 bloblo bloblo bloblo SYSA bloublou bloublou bloublou CN06 blaiblai blaiblai blaiblai METR blybly blybly blybly ppag blubliblouBFD 454

and the regex should match like this:

DC00-01-14 blabla blabla
MB00-07-10 blublu
CN03 bli BLI2454 bli bli
CN02 bloblo bloblo bloblo
SYSA bloublou bloublou bloublou
CN06 blaiblai blaiblai blaiblai
METR blybly blybly blybly
ppag blubliblouBFD 454

With this expression, I am able to detect the keys:

((DC\d{2}[-]\d{2}[-]\d{2})|(MB\d{2}[-]\d{2}[-]\d{2})|(CN0\d{1})|(SYSA)|(ppag)|(METR))

but I need each match to contain a key plus the text that follows it, up to (but not including) the next key, as in my expected result above.

What should I do?

https://regex101.com/r/vyi864/1

Upvotes: 0

Views: 88

Answers (1)

anvita surapaneni

Reputation: 369

I placed your regex at the beginning so that it matches the key once, then followed it with a pattern of the form (?:(?!REGEXP).)*, which consumes one character at a time until REGEXP would match at the current position, without including that match. Put your key regex in place of REGEXP.

((DC\d{2}[-]\d{2}[-]\d{2})|(MB\d{2}[-]\d{2}[-]\d{2})|(CN0\d{1})|(SYSA)|(ppag)|(METR))(?:(?!(((DC\d{2}[-]\d{2}[-]\d{2})|(MB\d{2}[-]\d{2}[-]\d{2})|(CN0\d{1})|(SYSA)|(ppag)|(METR)))).)*
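As a rough check, the same idea can be tried in Python. This is only a sketch: the names `KEY`, `SEGMENT` and `split_on_keys` are made up for illustration, the key alternation is lightly simplified (non-capturing groups, `-` instead of `[-]`), and the trailing space of each match is stripped.

```python
import re

# Key formats from the question, as a non-capturing alternation
# (simplified spelling; assumed equivalent to the original expression).
KEY = r"(?:(?:DC|MB)\d{2}-\d{2}-\d{2}|CN0\d|SYSA|ppag|METR)"

# A key, followed by every character at which no new key starts.
SEGMENT = re.compile(KEY + r"(?:(?!" + KEY + r").)*")

raw = (
    "DC00-01-14 blabla blabla MB00-07-10 blublu CN03 bli BLI2454 bli bli "
    "CN02 bloblo bloblo bloblo SYSA bloublou bloublou bloublou "
    "CN06 blaiblai blaiblai blaiblai METR blybly blybly blybly "
    "ppag blubliblouBFD 454"
)


def split_on_keys(text):
    """Return each key together with the text up to the next key."""
    return [m.group(0).strip() for m in SEGMENT.finditer(text)]


for segment in split_on_keys(raw):
    print(segment)
```

On the sample data this should print the eight lines listed in the question, one segment per key.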

To let a match continue across line breaks, try (?:\s*(?!REGEXP).)* instead of (?:(?!REGEXP).)*. By default . does not match a newline, but the \s* consumes one if present.

(((DC|MB)\d{2}[-]\d{2}[-]\d{2})|(CN0\d{1})|(SYSA)|(ppag)|(METR))(?:\s*(?!((((DC|MB)\d{2}[-]\d{2}[-]\d{2})|(CN0\d{1})|(SYSA)|(ppag)|(METR)))).)*
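A small Python sketch of the newline-tolerant variant, assuming input where a segment is wrapped over two lines. The sample text and the names `KEY`, `SEGMENT_MULTILINE` and `split_on_keys_multiline` are hypothetical, and the whitespace inside each match is collapsed to single spaces afterwards.

```python
import re

# Same key alternation as in the expression above, written non-capturing.
KEY = r"(?:(?:DC|MB)\d{2}-\d{2}-\d{2}|CN0\d|SYSA|ppag|METR)"

# \s* lets the repeated group step over a line break that '.' cannot match.
SEGMENT_MULTILINE = re.compile(KEY + r"(?:\s*(?!" + KEY + r").)*")

# Hypothetical input: the first segment is wrapped over two lines.
wrapped = "DC00-01-14 blabla\nblabla MB00-07-10 blublu\nCN03 bli"


def split_on_keys_multiline(text):
    """Return key-led segments, with runs of whitespace collapsed."""
    return [" ".join(m.group(0).split())
            for m in SEGMENT_MULTILINE.finditer(text)]


print(split_on_keys_multiline(wrapped))
```

In Python specifically, compiling the first expression with `re.DOTALL` is another way to make `.` match newlines; the `\s*` form is shown here because it is the one described above.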

Hope this will help.

In the right-hand panel of the regex tester you can see that each full match comes out in the desired form.

Upvotes: 2
