nodrog

Reputation: 3532

Ruby Regexp - Matching multiple result when within markup

I have the following string:

nothing to match
<-
this rocks should match as should this still and this rocks and still
->
should not match still or rocks
<- no matches here ->

I want to find all matches of 'rocks' and 'still', but only when they are within <- ->.

The purpose is to markup glossary words but be able to only mark them up in areas of text that are defined by the editor.

I currently have:

<-.*?(rocks|still).*?->

Unfortunately, this matches only the first 'rocks' and ignores all subsequent instances, as well as every 'still'.
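A small sketch of why this happens, using the sample text above (the variable names are illustrative only): each <- ... -> region is consumed by one match, and a capture group keeps only one value per match.

```ruby
text = <<~TEXT
  nothing to match
  <-
  this rocks should match as should this still and this rocks and still
  ->
  should not match still or rocks
  <- no matches here ->
TEXT

pattern = /<-.*?(rocks|still).*?->/m

# scan returns one capture per match; the whole first region is a single
# match, so only the first word found inside it is captured.
captures = text.scan(pattern).flatten
p captures
```

The second region contains neither word, so it contributes nothing, and the lone capture comes from the first region.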

I have this set up in Rubular.

The usage of this will be something like:

 # Flags go in a single options argument, combined with |
 Regexp.new( '<-.*?(' + self.all.map{ |gt| gt.name }.join("|") + ').*?->', Regexp::IGNORECASE | Regexp::MULTILINE )
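If the glossary names can contain regex metacharacters, the pattern might be built more defensively. The following is a minimal sketch, assuming the names are plain strings; the `names` array is a hypothetical stand-in for `self.all.map { |gt| gt.name }`.

```ruby
# Hypothetical glossary terms standing in for the model's names.
names = ["rocks", "still", "c++"]

# Regexp.union escapes each string, so "c++" is matched literally.
terms = Regexp.union(names)

# Options are combined with | into one argument.
pattern = Regexp.new("<-.*?(#{terms.source}).*?->",
                     Regexp::IGNORECASE | Regexp::MULTILINE)

literal_hit   = pattern.match?("<- I like C++ ->")
multiline_hit = pattern.match?("<-\nthis ROCKS\n->")
p literal_hit, multiline_hit
```

This changes only how the pattern is assembled; it still captures just one word per region.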

Thanks in advance for any help

Upvotes: 1

Views: 370

Answers (3)

Jellicle

Reputation: 30256

In Ruby, it depends on what you want to do with the regexp. You're matching a regular expression against a string, so you'll be using String methods. Some of these act on every match (e.g. gsub or scan); others act on a single match only (e.g. index and =~ find the first match, while rindex and rpartition find the last).

If you're working with one of the latter, you'll want a loop that calls the method again, starting from an offset just past the previous match. For example:
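To make the distinction concrete, here is a short sketch on a made-up string (not the question's data) comparing methods that see every match with methods that report a single one.

```ruby
s = "this rocks and still rocks"
re = /rocks|still/

all_matches = s.scan(re)        # every match, as an array of strings
replaced    = s.gsub(re, "X")   # every match replaced
first_at    = s.index(re)       # offset of the first match only
operator_at = (s =~ re)         # same offset, via the match operator
last_at     = s.rindex(re)      # offset of the last match only

p all_matches, replaced, first_at, operator_at, last_at
```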

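# Print the start index of every match
def print_match_indices(string, regex)
  # index searches forward from the given offset (rindex searches backward)
  i = string.index(regex, 0)
  until i.nil?
    puts i
    # Resume the search one character past the previous match start
    i = string.index(regex, i + 1)
  end
end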

(Yes, you can use split first, but I expect that a regex loop like the foregoing would require fewer system resources.)

Upvotes: 0

Andrew Clark

Reputation: 208635

There may be a way to do this with a single regex, but it will probably be simpler to just do it in two steps. First match all of the markups, and then search the markups for the glossary words:

text = <<END
nothing to match
<-
this rocks should match as should this still and this rocks and still
->
should not match still or rocks
<- no matches here ->
END

text.scan(/<-.*?->/m).each do |match|
  p match.scan(/rocks|still/)
end
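The same two-step idea can be extended from finding the words to marking them up. This is only a sketch: wrapping each word in square brackets is a hypothetical markup chosen for illustration, not something the question specifies.

```ruby
text = <<~TEXT
  nothing to match
  <-
  this rocks should match as should this still and this rocks and still
  ->
  should not match still or rocks
  <- no matches here ->
TEXT

# Outer gsub visits each <- ... -> region; inner gsub rewrites the
# glossary words inside that region only.
marked = text.gsub(/<-.*?->/m) do |region|
  region.gsub(/rocks|still/) { |word| "[#{word}]" }
end

puts marked
```

Text outside the regions is passed through unchanged because the outer gsub never hands it to the block.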

Also note that a regex is only a good solution here if the markup is never nested (<-...<-...->...->) and there are no escaped <- or -> sequences, whether inside or outside a marked-up region.
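A quick sketch of the nesting problem, on a made-up input: the lazy pattern stops at the first ->, so the tail of the outer region is never scanned.

```ruby
nested = "<- outer rocks <- inner still -> outer still ->"

# The match ends at the first "->", which closes the inner region.
regions = nested.scan(/<-.*?->/m)
words = regions.flat_map { |region| region.scan(/rocks|still/) }

p regions
p words
```

The final "outer still" lies inside the outer region but after the first ->, so it is missed.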

Upvotes: 1

bash-o-logist

Reputation: 6921

Don't forget your Ruby string methods. Use them first, before considering regular expressions:

$ ruby -0777 -ne '$_.split("->").each { |x| y = x.split("<-", 2)[1]; puts y if y && y[/rocks|still/] }' file
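For readers who prefer a script to a one-liner, here is a sketch that locates the regions with String#index and slicing only. The method name `marked_regions` and the sample string are hypothetical, and the word check assumes glossary words are separated by whitespace.

```ruby
# Returns the text inside every open ... close region, using only
# String#index and slicing (no regular expressions).
def marked_regions(text, open = "<-", close = "->")
  regions = []
  pos = 0
  while (start = text.index(open, pos))
    finish = text.index(close, start + open.length)
    break unless finish # unterminated region: stop scanning
    regions << text[(start + open.length)...finish]
    pos = finish + close.length
  end
  regions
end

sample = "skip rocks <- this rocks and still -> skip still <- none ->"
glossary = %w[rocks still]

# Keep only whitespace-separated tokens that are glossary words.
found = marked_regions(sample).map do |region|
  region.split.select { |word| glossary.include?(word) }
end

p found
```

Words with attached punctuation (for example "rocks,") would not be picked up by the plain split, which is where a regex becomes the more practical tool.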

Upvotes: 1
