Reputation: 111040
Given text like:
body =
yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada
< via mobile device >
Yada Yada <[email protected]> wrote:
yada yada yada yada yada yada yada yada yada
I want to match the 2nd paragraph, so I'm doing:
body = body.split(/.* <[email protected]> wrote: .*/m).first
But that doesn't match in Ruby, even though it does in Rubular. Any ideas why? Thanks.
Upvotes: 0
Views: 741
Reputation: 75252
Try this instead:
body = body.split(/.*<[email protected]> wrote:.*/).first
The space after the first .* was useless, and (as @aef pointed out) the space before the second .* was erroneous (maybe there was a space there in your Rubular test).
Notice that I removed the m modifier, too. If I hadn't, the regex would have matched the whole string, resulting in an empty array. That's what Ruby calls multiline mode (and everyone else calls single-line or dot-all mode): the . matches anything, including newlines.
EDIT: See it on ideone.com
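To make the effect of the m modifier concrete, here is a minimal sketch. The sample text and the address sender@example.com are placeholders invented for illustration, since the real address in the question is obscured.

```ruby
# Hypothetical sample body; sender@example.com is a placeholder address.
BODY = "first paragraph line\n" \
       "< via mobile device >\n" \
       "Yada Yada <sender@example.com> wrote:\n" \
       "quoted reply text\n"

# Without /m, `.` stops at newlines, so only the "wrote:" line is consumed.
without_m = BODY.split(/.*<sender@example\.com> wrote:.*/)

# With /m, `.` also matches newlines, so the pattern swallows the whole string
# and split has nothing left to return.
with_m = BODY.split(/.*<sender@example\.com> wrote:.*/m)

puts without_m.first
p with_m # => []
```

With the m modifier, calling .first on the result returns nil, which is likely why the original attempt appeared not to match.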
Upvotes: 1
Reputation: 4698
The line
Yada Yada <[email protected]> wrote:
ends with a linebreak, not a space. So your regular expression should be:
/.* <[email protected]> wrote:\n.*/m
Note: Windows systems and some protocols, such as HTTP, use different linebreak encodings. To be safe, convert your input to Unix linebreaks first and then do the data extraction. You could use my linebreak gem for this.
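As a rough sketch of that normalization step, the snippet below uses a plain gsub as a stand-in rather than the linebreak gem, whose API is not shown here. The sample input and the address sender@example.com are hypothetical, and the pattern is written without the m modifier so that it consumes only the "wrote:" line.

```ruby
# Hypothetical input with Windows-style (CRLF) linebreaks; placeholder address.
raw = "first paragraph\r\n" \
      "Yada Yada <sender@example.com> wrote:\r\n" \
      "quoted reply\r\n"

# Convert CRLF and lone CR to Unix LF before matching.
normalized = raw.gsub(/\r\n?/, "\n")

# Split on the whole "... wrote:" line, including its trailing newline.
reply = normalized.split(/^.* <sender@example\.com> wrote:\n/).first

puts reply
```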
Upvotes: 1