LocustHorde

Reputation: 6400

Regex match everything between two strings, spanning multiple lines

How do I write a regex that matches everything between two strings? The content between the two strings spans several lines and can contain any HTML characters.

For example:

<p>something</p>

<!-- OPTIONAL -->

<p class="sdf"> some text</p>
<p> some other text</p>

<!-- OPTIONAL END -->

<p>The end</p>

I want to strip out the whole optional part, but the greedy any-character match isn't doing what I want. The patterns I'm using are:

All of them match the first optional tag if only the first part is given, but they don't cope with the complete multi-line block.

Here's an example: http://regexr.com?352bk

Thanks

Upvotes: 11

Views: 36248

Answers (4)

Playing with your example, I think I found the answer. Try this in your code:

<!-- OPTIONAL -->[\w\W]*<!-- OPTIONAL END -->

I hope this helps.
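For anyone who wants to try this outside RegExr, here is a minimal sketch in Python (my choice of language, the question doesn't name one). The `html` string is just the sample from the question, and `pattern` and `stripped` are names I made up for illustration. The point it shows is that `[\w\W]` matches any character, newlines included, so no extra flag is needed.

```python
import re

# Sample markup copied from the question.
html = """<p>something</p>

<!-- OPTIONAL -->

<p class="sdf"> some text</p>
<p> some other text</p>

<!-- OPTIONAL END -->

<p>The end</p>
"""

# [\w\W] is "a word character or a non-word character", which covers every
# character including newlines, so the match can cross line boundaries
# without any dotall flag.
pattern = re.compile(r"<!-- OPTIONAL -->[\w\W]*<!-- OPTIONAL END -->")

# Replace the whole optional block (markers included) with nothing.
stripped = pattern.sub("", html)
print(stripped)
```

Note that the `*` here is still greedy, so if the input had several optional blocks this would swallow everything from the first opening marker to the last closing one.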

Upvotes: 5

gitaarik

Reputation: 46270

To make a quantifier non-greedy (lazy), add a ? after the *:

<!-- OPTIONAL -->(.*?)<!-- OPTIONAL END -->

Does this help you?

Also, depending on the programming language you use, there are modifiers that make the regex dot (.) match newlines too. In PHP, for example, that is the s (dotall) modifier:

http://php.net/manual/en/reference.pcre.pattern.modifiers.php
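The link above covers PHP. As a rough equivalent, here is a sketch in Python, where the same modifier is spelled `re.DOTALL`. The sample string and the variable names (`lazy`, `no_flag`, `with_flag`, `cleaned`) are assumptions of mine, not something from the question.

```python
import re

# Sample markup based on the question, written line by line.
html = (
    "<p>something</p>\n"
    "<!-- OPTIONAL -->\n"
    '<p class="sdf"> some text</p>\n'
    "<p> some other text</p>\n"
    "<!-- OPTIONAL END -->\n"
    "<p>The end</p>\n"
)

lazy = r"<!-- OPTIONAL -->(.*?)<!-- OPTIONAL END -->"

# Without the dotall flag the dot cannot cross the newline right after
# the opening marker, so there is no match at all.
no_flag = re.search(lazy, html)

# With re.DOTALL the dot also matches "\n" and the block is found.
with_flag = re.search(lazy, html, flags=re.DOTALL)

# Strip the optional block, markers included.
cleaned = re.sub(lazy, "", html, flags=re.DOTALL)
print(cleaned)
```

So the lazy quantifier alone is not enough for multi-line content; it has to be combined with the dotall behaviour.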

Upvotes: 9

sp00m

Reputation: 48807

Check the dotall checkbox in RegExr :)

Without the dotall flag (the s in /regex/s), a dot (.) won't match newline characters.

You should use .*? instead of .* to lazily match the optional content (see the PLEASE DO NOT MATCH! sentence in the examples).
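To make the greedy versus lazy difference concrete, here is a small Python sketch. The `text` sample with two optional blocks is something I made up for illustration (it is not the content of the RegExr link), and both patterns use `re.DOTALL` so the dot can cross line breaks.

```python
import re

# Hypothetical input: two optional blocks with a line in between
# that must survive the clean-up.
text = (
    "<!-- OPTIONAL -->\nfirst block\n<!-- OPTIONAL END -->\n"
    "PLEASE DO NOT MATCH!\n"
    "<!-- OPTIONAL -->\nsecond block\n<!-- OPTIONAL END -->\n"
)

# Greedy: .* runs to the LAST closing marker it can find.
greedy = re.compile(r"<!-- OPTIONAL -->.*<!-- OPTIONAL END -->", re.DOTALL)

# Lazy: .*? stops at the FIRST closing marker after each opening one.
lazy = re.compile(r"<!-- OPTIONAL -->.*?<!-- OPTIONAL END -->", re.DOTALL)

greedy_result = greedy.sub("", text)  # the middle line is removed too
lazy_result = lazy.sub("", text)      # only the two blocks are removed

print(repr(greedy_result))
print(repr(lazy_result))
```

With a single optional block both versions give the same result; the lazy one only matters once there is more than one closing marker in the input.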

Upvotes: 8

Bad Wolf

Reputation: 8349

Enable the "dotall" option so that the . in your regex matches newline characters and the pattern works across multiple lines. How you do this depends on your regex implementation, so check its manual.
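As one example of how this differs per implementation, Python's `re` module (picked here only for illustration) accepts the option either as a flag argument or as an inline `(?s)` at the start of the pattern. The `sample` string and variable names below are assumptions of mine.

```python
import re

# Hypothetical multi-line input with the markers from the question.
sample = "<!-- OPTIONAL -->\nline one\nline two\n<!-- OPTIONAL END -->"

# Option 1: pass the dotall flag as an argument.
by_argument = re.search(
    r"<!-- OPTIONAL -->(.*?)<!-- OPTIONAL END -->", sample, re.DOTALL
)

# Option 2: switch dotall on inside the pattern with the inline (?s) flag,
# which has to sit at the very start of the pattern.
by_inline = re.search(
    r"(?s)<!-- OPTIONAL -->(.*?)<!-- OPTIONAL END -->", sample
)

# Both capture the same content between the two markers.
print(repr(by_argument.group(1)))
print(repr(by_inline.group(1)))
```

The inline form is handy when you can only supply the pattern string and not the flags, for example in a config file or a tool's search box, provided the engine behind it supports inline flags.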

Upvotes: 2
