craigeley

Reputation: 352

Ruby Regex: Match Until First Occurrence of Character

I have a file with lines that vary in their format, but the basic idea is like this:

- A block of text #tag @due(2014-04-20) @done(2014-04-22)

For example:

- Email John Doe #email @due(2014-04-20) @done(2014-04-22)

The issue is that the #tag and the @due date do not appear in every entry, so some look like this:

- Email John Doe @done(2014-04-22)

I'm trying to write a Ruby Regex that finds the item between the "- " and the first occurrence of EITHER a hashtag or a @done/@due tag.

I have been trying to use groups and lookahead, but I can't get it right when there are multiple instances of what I am looking ahead for. Using my second example string, this regex:

/-\s(.*)(?=[#|@])/ 

Yields this result for (.*):

Email John Doe #email @due(2014-04-20)

Is there any way I can get this right? Thanks!

Upvotes: 2

Views: 6207

Answers (2)

hwnd

Reputation: 70722

You're missing the ? quantifier to make it a non-greedy match. I would also remove | from inside your character class: within a character class it is a literal pipe rather than alternation, so [#|@] matches any one of #, | or @.

/-\s(.*?)(?=[#@])/

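As a quick check of the difference, here is a minimal sketch that runs the greedy and non-greedy versions against the sample line from the question. The constant names are only illustrative.

```ruby
# Illustrative names; the patterns are the ones discussed above.
GREEDY = /-\s(.*)(?=[#@])/
LAZY   = /-\s(.*?)(?=[#@])/
SAMPLE = "- Email John Doe #email @due(2014-04-20) @done(2014-04-22)"

# Greedy .* runs to the end of the line and backtracks to the LAST # or @.
p SAMPLE[GREEDY, 1]  # "Email John Doe #email @due(2014-04-20) "

# Lazy .*? stops at the FIRST # or @.
p SAMPLE[LAZY, 1]    # "Email John Doe "

# Inside a character class, | is a literal pipe, not alternation.
p "a|b"[/[#|@]/]     # "|"
```

Note that the captured text keeps the trailing space before the tag; calling strip on the result removes it.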

You don't really need a positive lookahead here either. Just match up to those characters and take the result from your capturing group.

/-\s(.*?)[#@]/

You could also use a negated character class in this case.

/-\s([^#@]*)/
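The three patterns behave the same on the lines from the question, but they differ when a line has no tag at all. The sketch below compares them; the helper task_text, the constant names, and the third, tag-free line are assumptions added for illustration and are not part of the original post.

```ruby
LOOKAHEAD = /-\s(.*?)(?=[#@])/
CONSUMING = /-\s(.*?)[#@]/
NEGATED   = /-\s([^#@]*)/

# Hypothetical helper: returns the captured text with surrounding
# whitespace removed, or nil when the pattern does not match.
def task_text(line, pattern)
  line[pattern, 1]&.strip
end

lines = [
  "- Email John Doe #email @due(2014-04-20) @done(2014-04-22)",
  "- Email John Doe @done(2014-04-22)",
  "- Email John Doe" # assumed case: no tags at all
]

lines.each do |line|
  [LOOKAHEAD, CONSUMING, NEGATED].each do |pattern|
    puts "#{pattern.inspect} -> #{task_text(line, pattern).inspect}"
  end
end
```

On the tag-free line the first two patterns return nil because they require a # or @ to be present, while the negated class still captures the text up to the end of the line. Which behavior is preferable depends on whether untagged lines can occur in the file.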

Upvotes: 8

Cary Swoveland

Reputation: 110675

This should do it:

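str = "- Email John Doe #email @due(2014-04-20) @done(2014-04-22)"
str[/-(.*?)(?:#|@due|@done)/,1]
  #=> " Email John Doe "

(.*?) is a capture group, with ? making .* non-greedy. The non-capturing group (?:#|@due|@done) keeps the alternation confined to the terminators, so the pattern also works on lines that have no #tag. The result of the capture is retrieved by the ,1 at the end; it keeps the surrounding spaces, so call strip on it if you want them removed.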
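One thing to watch with this approach: | binds more loosely than concatenation, so a pattern written as -(.*?)#|@due|@done is read as three separate alternatives (-(.*?)#, @due, and @done) rather than as a prefix followed by a choice of terminators. The sketch below shows the effect on the question's line without a hashtag; the constant names are illustrative only.

```ruby
UNGROUPED = /-(.*?)#|@due|@done/
GROUPED   = /-(.*?)(?:#|@due|@done)/

NO_HASHTAG = "- Email John Doe @done(2014-04-22)"

# Ungrouped: only the bare "@done" alternative matches, so group 1 is unset.
p NO_HASHTAG[UNGROUPED, 1]  # nil

# Grouped: the prefix applies to every terminator.
p NO_HASHTAG[GROUPED, 1]    # " Email John Doe "
```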

Credit to @hwnd for noticing the need to make .* non-greedy shortly before I posted, though I did not see the comment until later.

Upvotes: 2
