Mariusz Chw

Reputation: 364

Regular expression, need to fix

I tried to write a regular expression that extracts the content between <ul class=\"theatre\"> and </ul>.

I wrote this regex:

<ul class=\"theatre\">(\s)*[<>/ =":\._,)(a-zA-Z0-9(\s)ĄĘŚĆŻŹŁÓĆŃąęśćżźłóćń\-]+</ul>

My question is: how do I modify this regular expression so that the match ends at the first </ul> tag it encounters? Here's my example:

http://regexr.com?33j92

It should end before <div class=

I know that regex shouldn't be used for HTML (I have read about that on SO before). I just need to do it this way, to understand why the match doesn't end at the first </ul> and how I can fix it.

Upvotes: 0

Views: 60

Answers (3)

Gabriel Riba

Reputation: 6738

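You have to use the lazy ? modifier, and set the dot-all flag so that the dot (which matches any character) also matches across line ends. The "global" (multi-match) flag should not be set. Your match runs past the first </ul> because your character class also contains <, / and >, so the greedy + consumes everything it can and only backtracks as far as the last </ul>.

Try this regexp with only the dot-all checkbox ticked in your web regexp tester: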

<ul\s.*?</ul>
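
To see the difference outside the web tester, here is a minimal sketch in JavaScript. The sample string in `html` is made up for illustration (it is not the input behind the regexr link), and the `s` flag is JavaScript's dot-all, which assumes an ES2018-capable engine:

```javascript
// Hypothetical sample input: the wanted list, followed by a div
// that contains a second list.
const html = [
  '<ul class="theatre">',
  '  <li>first</li>',
  '</ul>',
  '<div class="other">',
  '<ul><li>second</li></ul>',
  '</div>'
].join('\n');

// Greedy: .* runs to the end of the input, then backtracks to the LAST </ul>.
const greedy = html.match(/<ul\s.*<\/ul>/s)[0];

// Lazy: .*? grows one character at a time and stops at the FIRST </ul>.
const lazy = html.match(/<ul\s.*?<\/ul>/s)[0];

console.log(greedy.endsWith('second</li></ul>'));                      // true
console.log(lazy === '<ul class="theatre">\n  <li>first</li>\n</ul>'); // true
```

The only difference between the two patterns is the ? after the *, which is what makes the match end before the div.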

Upvotes: 1

MikeM

Reputation: 13631

You could use

 <ul class="theatre">([\s\S]+?)<\/ul>

For example, in JavaScript, you could do:

var str = '<ul class="theatre"> bananas </ul>',
    m = str.match( /<ul class="theatre">([\s\S]+?)<\/ul>/ );

if ( m ) {
    console.log( m[1] );    // " bananas "
}    

Upvotes: 0

Explosion Pills

Reputation: 191749

Try:

<ul.*?>(.*?)</ul>
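
Note that by default the dot does not match newlines, so this pattern probably needs the dot-all flag when the list spans several lines. A rough sketch in JavaScript, using a made-up input string (`sample` and its contents are assumptions for illustration only):

```javascript
// Hypothetical multi-line input: the wanted list, then a single-line list in a div.
const sample =
  '<ul class="theatre">\n  <li>Hamlet</li>\n</ul>\n' +
  '<div class="x"><ul><li>other</li></ul></div>';

const pattern = '<ul.*?>(.*?)</ul>';

// Without dot-all, . cannot cross the newlines inside the first list,
// so the first successful match is the later, single-line list.
const withoutDotAll = sample.match(new RegExp(pattern));

// With dot-all (the "s" flag), the first list is matched and captured.
const withDotAll = sample.match(new RegExp(pattern, 's'));

console.log(withoutDotAll[1] === '<li>other</li>');        // true
console.log(withDotAll[1] === '\n  <li>Hamlet</li>\n');    // true
```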

Upvotes: 0
