Jack Slingerland

Reputation: 2811

Ruby Regular Expressions: Matching if substring doesn't exist

I'm having an issue trying to capture a group on a string:

"type=gist\nYou need to gist this though\nbecause its awesome\nright now\n</code></p>\n\n<script src=\"https://gist.github.com/3931634.js\"> </script>\n\n\n<p><code>Not code</code></p>\n"

My regex currently looks like this:

/<code>([\s\S]*)<\/code>/

My goal is to get everything between the code tags. Unfortunately, the regex matches all the way up to the second closing code tag. Is there a way to match everything inside the code tags only up to the first occurrence of the closing code tag?

Upvotes: 1

Views: 522

Answers (2)

ineiti

Reputation: 414

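I just learned that, for going through multiple matches, calling scan on the string (here called html) with a block

html.scan( /<code>(.*?)<\/code>/m ){
  puts $1
}

is a very nice way of going through all occurrences of code. Note the /m flag, which lets . match newlines as well. But yes, getting a proper parser is better...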

Upvotes: 0

Martin Ender

Reputation: 44279

All repetition quantifiers in regular expressions are greedy by default (matching as many characters as possible). Make the * lazy (non-greedy) by appending a ?, so that it matches as few characters as possible:

/<code>([\s\S]*?)<\/code>/
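To see the difference, here is a minimal sketch comparing the two patterns. The sample string is a hypothetical, shortened stand-in for the one in the question (with an opening code tag added), and the variable names are illustrative only:

```ruby
# Hypothetical sample input: two <code> elements, the first spanning lines.
html = "<p><code>type=gist\nright now\n</code></p>\n<p><code>Not code</code></p>\n"

# Greedy: * grabs as much as possible, so the match runs to the LAST </code>.
greedy = html[/<code>([\s\S]*)<\/code>/, 1]

# Lazy: *? grabs as little as possible, so the match stops at the FIRST </code>.
lazy = html[/<code>([\s\S]*?)<\/code>/, 1]

puts greedy.inspect
puts lazy.inspect
```

With this input, the lazy version should capture only the contents of the first code element, while the greedy version also swallows the closing tag, the markup in between, and the contents of the second element.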

But please consider using a DOM parser instead. Regex is just not the right tool to parse HTML.

Upvotes: 4
