Alan B

Reputation: 2289

Regular Expression - start and end pattern

I have a string like the one below in a file and would like to find each occurrence of a substring based on a start and end pattern.

STX=ANAA:1+5013546100993+5033075994542LI0927030002+5033075994542'MTR=3'END=4'STX=ANAA:1+5013546100993+5033075994542:1:D:068::288:10941/101'OTR=8'MTR=53'END=7'UNA:+.? 'DNB=1'MTR=3'END=5''STX=ANAA:1+5013546100893+5033075994542:1:D:068::288:10941/101''OTR=8''MTR=53''END=9

I would like to find each substring that starts with either STX or UNA and ends just before the start of the next STX or UNA segment.

For the string above, I would like to extract the following:

1) STX=ANAA:1+5013546100993+5033075994542LI0927030002+5033075994542'MTR=3'END=4'

2) UNA:+.? 'DNB=1'MTR=3'END=5''

3) STX=ANAA:1+5013546100893+5033075994542:1:D:068::288:10941/101''OTR=8''MTR=53''END=9

I have written my regular expression as follows:

string pattern  = "(STX|UNA.*)STX|UNA"

But it always returns only the first match.

regards, Alan

Upvotes: 1

Views: 197

Answers (3)

Kutty Rajesh Valangai

Reputation: 635

The regex in C#:

string strRegex = @"((STX)[^""""""""]*END=[0-9])('STX)|((UNA:+.?)[^""""""""]*END=[0-9]'')|((STX)[^""""""""]*END=[0-9])";

It has seven matches; 1, 4 and 6 are your expected matches. Try it.

Upvotes: 0

Sebastian Schumann

Reputation: 3446

Your regex captures the start of the next match. You should exclude it:

(STX|UNA).*?(?=(STX|UNA|$))
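
A minimal C# sketch of how this pattern could be applied with Regex.Matches. The class name SegmentSplitter and the method Split are hypothetical names chosen for illustration; only the pattern comes from the line above. The lazy .*? stops at the first position where the lookahead sees STX, UNA or the end of the input, and because a lookahead consumes nothing, the next match can begin at that same STX or UNA.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of any library.
public static class SegmentSplitter
{
    // (STX|UNA)          start token of a segment
    // .*?                as few characters as possible
    // (?=(STX|UNA|$))    lookahead: next start token or end of input, not consumed
    private static readonly Regex SegmentRegex =
        new Regex(@"(STX|UNA).*?(?=(STX|UNA|$))");

    // Returns every segment found in the input, in order of appearance.
    public static List<string> Split(string input)
    {
        var segments = new List<string>();
        foreach (Match m in SegmentRegex.Matches(input))
        {
            segments.Add(m.Value);
        }
        return segments;
    }
}
```

Calling SegmentSplitter.Split("STX=1'END=4'UNA:+'END=5''STX=2'END=9") should give three segments: STX=1'END=4', UNA:+'END=5'' and STX=2'END=9. Note that the sample string in the question begins with two consecutive STX segments, so this pattern would return four segments for it. Without RegexOptions.Singleline the dot does not match a newline, so the sketch assumes each record sits on a single line.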

Upvotes: 1

Avinash Raj

Reputation: 174706

You need to make your regex do a non-greedy match. Your regex should be:

(STX|UNA).*?(STX|UNA)

(STX|UNA.*) in your regex matches either STX alone, or UNA followed by zero or more characters, because the .* belongs only to the UNA alternative.
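
The grouping and greediness points can be checked with a small C# sketch. AlternationDemo and FirstMatch are hypothetical names used only for this illustration, and the short input strings are made up rather than taken from the question.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of any library.
public static class AlternationDemo
{
    // Returns the text of the first match, or an empty string if nothing matches
    // (Match.Value is empty for a failed match).
    public static string FirstMatch(string pattern, string input)
    {
        return Regex.Match(input, pattern).Value;
    }
}
```

With this helper, FirstMatch("(STX|UNA.*)", "STX=1'END=4'") should return just STX, while FirstMatch("(STX|UNA).*", "STX=1'END=4'") returns the whole input, since the .* now applies to both alternatives. For greediness, FirstMatch("(STX|UNA).*(STX|UNA)", "STX=1'UNA:+'STX=2") runs to the last start token and returns STX=1'UNA:+'STX, whereas the lazy form "(STX|UNA).*?(STX|UNA)" stops at the first one and returns STX=1'UNA.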

Upvotes: 0
