Reputation: 6400
How do I write a regex that matches everything between two strings? The content between the two strings spans several lines and can contain any HTML characters.
For example:
<p>something</p>
<!-- OPTIONAL -->
<p class="sdf"> some text</p>
<p> some other text</p>
<!-- OPTIONAL END -->
<p>The end</p>
I want to strip out the whole optional part, but the greedy any-character match isn't doing what I want. The patterns I've tried are:
<!-- OPTIONAL -->.*<!-- OPTIONAL END -->
<!-- OPTIONAL -->(.*)<!-- OPTIONAL END -->
<!-- OPTIONAL -->(.*)\s+<!-- OPTIONAL END -->
(?=<!-- OPTIONAL -->)(.*)\s+<!-- OPTIONAL END -->
Each of them matches the opening optional tag if only the first part of the pattern is given, but none of them works on the complete multi-line block.
Here's an example: http://regexr.com?352bk
Thanks
Upvotes: 11
Views: 36248
Reputation: 51
Playing with your example, I think I found the answer. Try this in your code:
<!-- OPTIONAL -->[\w\W]*<!-- OPTIONAL END -->
I hope this helps.
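A quick way to check this pattern outside RegExr is a short Python sketch. Python and the variable names here are assumptions for illustration, since the question does not name a language. It shows that [\w\W] matches any character, newlines included, so no dotall flag is needed. Note that the * is still greedy, which matters once the input holds more than one optional block.

```python
import re

# The sample markup from the question.
html = """<p>something</p>
<!-- OPTIONAL -->
<p class="sdf"> some text</p>
<p> some other text</p>
<!-- OPTIONAL END -->
<p>The end</p>"""

# [\w\W] means "word character or non-word character", i.e. any character
# including "\n", so the pattern spans lines without any flags.
pattern = re.compile(r"<!-- OPTIONAL -->[\w\W]*<!-- OPTIONAL END -->")

# Replace the whole optional block with an empty string.
stripped = pattern.sub("", html)
print(stripped)
```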
Upvotes: 5
Reputation: 46270
To make a regex ungreedy, add a ? after the *:
<!-- OPTIONAL -->(.*?)<!-- OPTIONAL END -->
Does this help you?
Also, depending on the programming language you use, there are modifiers that make the regex dot (.) match newlines too. PHP, for example, has the s (dotall) modifier:
http://php.net/manual/en/reference.pcre.pattern.modifiers.php
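As a rough illustration of the same idea in another language, here is a minimal Python sketch (Python is an assumption here, as is the shortened sample input). Python's counterpart to the s modifier is the re.DOTALL flag. The sketch runs the lazy pattern with and without it.

```python
import re

# Shortened version of the question's sample markup.
html = (
    "<p>something</p>\n"
    "<!-- OPTIONAL -->\n"
    '<p class="sdf"> some text</p>\n'
    "<!-- OPTIONAL END -->\n"
    "<p>The end</p>"
)

pattern = r"<!-- OPTIONAL -->(.*?)<!-- OPTIONAL END -->"

# Without DOTALL the dot stops at "\n", so the pattern finds nothing.
no_flag = re.search(pattern, html)

# With DOTALL the dot also matches "\n"; group 1 captures the block body.
with_flag = re.search(pattern, html, flags=re.DOTALL)

print(no_flag)
print(repr(with_flag.group(1)))
```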
Upvotes: 9
Reputation: 48807
Check the dotall checkbox in RegExr :)
Without the dotall flag (the s in /regex/s), a dot (.) won't match line breaks.
You should use .*? instead of .* to lazily match the optional content (see the PLEASE DO NOT MATCH! sentence in the examples).
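The difference between the greedy and lazy forms only shows up when the input has more than one optional block. The Python sketch below is an assumption-laden illustration: the language, the two-block input, and the variable names are made up for the demo and only loosely mirror the RegExr example.

```python
import re

# Two optional blocks with content between them that must survive.
html = (
    "<!-- OPTIONAL -->a<!-- OPTIONAL END -->\n"
    "<p>PLEASE DO NOT MATCH!</p>\n"
    "<!-- OPTIONAL -->b<!-- OPTIONAL END -->"
)

# Greedy: .* runs to the LAST end marker, swallowing the paragraph in between.
greedy = re.sub(r"<!-- OPTIONAL -->.*<!-- OPTIONAL END -->", "", html, flags=re.DOTALL)

# Lazy: .*? stops at the FIRST end marker, so each block is removed separately.
lazy = re.sub(r"<!-- OPTIONAL -->.*?<!-- OPTIONAL END -->", "", html, flags=re.DOTALL)

print(repr(greedy))
print(repr(lazy))
```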
Upvotes: 8
Reputation: 8349
Enable the "dotall" option so that the . in your regex matches newline characters and the pattern works across multiple lines. How to do this depends on your regex implementation, so check its manual.
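As one example of how the option can be switched on, Python (chosen here only for illustration) accepts it either as a flag argument or as an inline (?s) prefix inside the pattern. Other engines may support only one of these, or use different syntax. The sample text and names below are made up.

```python
import re

text = "start\nmiddle\nend"

# Option 1: pass the dotall flag as an argument.
by_argument = re.search(r"start(.*?)end", text, flags=re.DOTALL)

# Option 2: embed the flag in the pattern itself with (?s).
by_inline = re.search(r"(?s)start(.*?)end", text)

print(repr(by_argument.group(1)))
print(repr(by_inline.group(1)))
```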
Upvotes: 2