user7397354

Reputation: 35

Regex with positive lookahead across multiple lines

I've been trying to isolate blocks containing a certain string in TextWrangler.

Here is the sample I'm working with.

<ROW num="381">
  <TO>8549672167</TO>
  <FROM>8936742582</FROM>
  <TIME>5/10/2009 19:49:3</TIME>
  <TEXT>Blah Blah Blah</TEXT>
</ROW>
<ROW num="382">
  <TO>8549672167</TO>
  <FROM>8591903412</FROM>
  <TIME>5/10/2009 19:49:37</TIME>
  <TEXT>Hme</TEXT>
</ROW>

What I'm trying to do is isolate all multi-line blocks beginning with <ROW and ending with </ROW> that contain the digits 412 in the line beginning with <FROM>.

So in the above example, the second block would be highlighted, but not the first.

I have no idea where to begin with this. Can anybody help? Thanks, MS.

Upvotes: 3

Views: 2086

Answers (2)

Mustofa Rizwan

Reputation: 10466

Try this:

<ROW[^<]*?>[^<]*<TO>(?=[^<]*412)[^<]*<\/TO>.*?<\/ROW>

Updated answer, as per the OP's updated question and comment:

<ROW(?=((?!ROW).)*<FROM>\d*412\d*<\/FROM>).*?<\/ROW>
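
To check the updated pattern outside the editor, here is a minimal Python sketch. It assumes dot-matches-newline mode (`re.DOTALL` here; in an editor this is typically an inline `(?s)` prefix or a settings toggle, which is an assumption about your setup). The sample data is copied from the question, and the variable names are only illustrative.

```python
import re

# Sample data copied from the question.
sample = """<ROW num="381">
  <TO>8549672167</TO>
  <FROM>8936742582</FROM>
  <TIME>5/10/2009 19:49:3</TIME>
  <TEXT>Blah Blah Blah</TEXT>
</ROW>
<ROW num="382">
  <TO>8549672167</TO>
  <FROM>8591903412</FROM>
  <TIME>5/10/2009 19:49:37</TIME>
  <TEXT>Hme</TEXT>
</ROW>
"""

# Pattern from the answer, unchanged.
# The lookahead scans forward from "<ROW" without crossing another "ROW",
# so it can only see the <FROM> line of the current block.
# re.DOTALL lets "." match line breaks.
pattern = re.compile(
    r'<ROW(?=((?!ROW).)*<FROM>\d*412\d*<\/FROM>).*?<\/ROW>',
    re.DOTALL,
)

# Use finditer + group(0) to get the whole block; findall would return
# only the capture group that sits inside the lookahead.
matches = [m.group(0) for m in pattern.finditer(sample)]

for block in matches:
    print(block)
```

On the sample above, only the block with num="382" should be printed, since its <FROM> value is the only one containing 412.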


Upvotes: 3

Po Stevanus Andrianta

Reputation: 712

<ROW.*>[\s\n]*<TO>.*412.*<\/TO>[\w\d\s\n<>\/:]*<\/ROW>

Demo: http://regexr.com/3f1e7

I updated the solution to match 412 inside the TO tag.

Hope this helps.

Upvotes: -1
