Reputation: 364
I tried to write a regular expression that extracts the value between <ul class=\"theatre\">
and </ul>
I wrote this regex:
<ul class=\"theatre\">(\s)*[<>/ =":\._,)(a-zA-Z0-9(\s)ĄĘŚĆŻŹŁÓĆŃąęśćżźłóćń\-]+</ul>
My question is how to modify this regular expression so that the match ends at the first </ul>
tag encountered. Here's my example:
It should end before <div class=
I know that regex shouldn't be used for HTML (I have read about that before on SO). I just have to do it to understand why the match doesn't end at the first </ul>
and how I can fix it.
Upvotes: 0
Views: 60
Reputation: 6738
You have to make the quantifier lazy with the ? modifier and set the dot-all flag so that the dot (which matches any character) also matches across line breaks. The "global" (multi-match) flag should not be set.
Try this regexp with only the dot-all checkbox ticked in your web regexp tester:
<ul\s.*?</ul>
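To see why the lazy modifier matters, here is a minimal JavaScript sketch comparing the greedy and lazy forms of the pattern above. A greedy quantifier consumes as much as it can and then backtracks only as far as the last </ul> in the input, while the lazy form stops at the first one. The sample markup is invented for illustration (the question's own example is not shown), and the `s` (dot-all) flag assumes an ES2018-capable engine.

```javascript
// Hypothetical sample input, made up for this sketch.
const html = [
  '<ul class="theatre">',
  '  <li>Hamlet</li>',
  '</ul>',
  '<div class="other">',
  '  <ul><li>unrelated</li></ul>',
  '</div>',
].join('\n');

// Greedy: .* runs to the end of the input, then backtracks to the LAST </ul>.
const greedy = /<ul\s.*<\/ul>/s;
// Lazy: .*? expands one character at a time and stops at the FIRST </ul>.
const lazy = /<ul\s.*?<\/ul>/s;

const greedyMatch = html.match(greedy)[0];
const lazyMatch = html.match(lazy)[0];

console.log(greedyMatch.includes('<div')); // true: the match ran past the first list
console.log(lazyMatch.includes('<div'));   // false: the match ended at the first </ul>
console.log(lazyMatch);
```

Without the `s` flag the dot does not match line breaks in JavaScript, so neither pattern would match this multi-line input at all.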
Upvotes: 1
Reputation: 13631
You could use
<ul class="theatre">([\s\S]+?)<\/ul>
For example, in JavaScript, you could do
var str = '<ul class="theatre"> bananas </ul>',
    m = str.match( /<ul class="theatre">([\s\S]+?)<\/ul>/ );
if ( m ) {
    console.log( m[1] ); // " bananas "
}
Upvotes: 0