Multiline Regex in Ruby

Question

I want to extract all text between two keywords (<<-DOC and DOC) from a file. For example, suppose the file contains the following:

abc.rb

def abc
    <<-DOC abc:
        return "hahaha"
    DOC
    puts "hahaha"
end

def efg
    <<-DOC efg:
        return "hehehe"
    DOC
    puts "hehehe"
end

I want to get two matches:

<<-DOC abc:
    return "hahaha"
DOC

and

<<-DOC efg:
    return "hehehe"
DOC

I tried File.read("abc.rb").match(/<<-DOC(.*?)DOC/m), but it returns all the text between the first occurrence of <<-DOC (inside abc) and the last occurrence of DOC (inside efg).

Kevin Schwerdtfeger · Accepted Answer

From what I can tell, your regex is correct, and the (.*?) should be a non-greedy match. I think the issue you are running into is that match in Ruby returns only the first match of the regex. For instance:

File.read("abc.rb").match(/<<-DOC(.*?)DOC/m)
=> #<MatchData "<<-DOC abc:\n        return \"hahaha\"\n    DOC" 1:" abc:\n        return \"hahaha\"\n    ">

What you really want to use is scan:

File.read("abc.rb").scan(/<<-DOC(.*?)DOC/m)
=> [[" abc:\n        return \"hahaha\"\n    "], [" efg:\n        return \"hehehe\"\n    "]]

This returns an array of arrays, where each inner array holds the captured groups for one match. See https://ruby-doc.org/core-2.2.0/String.html#method-i-scan
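Since the question asks for the whole block including the <<-DOC and DOC markers, here is a minimal sketch of the difference between scan with and without a capture group. It assumes the file looks like the example in the question; the source string is a stand-in for File.read("abc.rb"), and the variable names (source, blocks, bodies, first) are only illustrative.

```ruby
# Stand-in for File.read("abc.rb"); the contents mirror the question's example.
source = <<~'RUBY'
  def abc
      <<-DOC abc:
          return "hahaha"
      DOC
      puts "hahaha"
  end

  def efg
      <<-DOC efg:
          return "hehehe"
      DOC
      puts "hehehe"
  end
RUBY

# Without a capture group, scan returns each whole match as a string,
# so the <<-DOC and DOC markers are included.
blocks = source.scan(/<<-DOC.*?DOC/m)

# With a capture group, scan returns only the captured text,
# wrapped in one inner array per match.
bodies = source.scan(/<<-DOC(.*?)DOC/m)

# match, by contrast, stops after the first hit.
first = source.match(/<<-DOC(.*?)DOC/m)

puts blocks
```

If only the text between the markers is needed, bodies.flatten gives a flat array of strings instead of an array of arrays.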
