Reputation: 655
tl;dr: How do I replace only specific characters (i.e. line breaks) in a regex match in Ruby?
I have an array of strings. Each element consists of 2 to 4 words (a word being any sequence of characters) separated by spaces, in a specific order.
I also have a large string in which I want to find instances of those word sequences where a \n appears in place of a space. For example, I want to match an element of the array:
arr[0] = "aaa bbbb ccccc"
to a string that looks like this:
zzzzzzzzz aaa\n
bbbb ccccc yyyyyyyyy
And make it look like this:
zzzzzzzzz aaa bbbb ccccc yyyyyyyyy
The thing is, I can think of at least two ways of doing it, but they seem very cumbersome. What I would do is:
I suspect, however, that this is a rather silly way to do it. Is there a less roundabout way to do it in Ruby?
EDIT: How do I implement the answer below with Regexp.union? I have a function that generates the regex:
def generateMergeRx(arr_with_keywords)
  arr_with_keywords.delete_if{|x| (x.include? " ") == false}
  matchRegexMerge = Regexp.new("(%{keywordReplace})" % {
    keywordReplace: Regexp.union(arr_with_keywords).source
  })
end
This is what it looks like using puts regexMerge.to_s:
(?-mix:(And\.\ z\ Kobyl\.|Ban\.\ W\.|B\.\ B\.|B\.\ G\.|Biel\.\ J\.)
It corresponds to these keywords:
And. z Kobyl.
Ban. W.
B. B.
B. G.
Biel. J.
(...)
And then I call it like this:
regexMerge = generateMergeRx arr_with_keywords
some_string.gsub!(regexMerge.to_s.gsub!(/ /, "\s"), "\\1")
But what should I put instead of \1? At the moment the output is identical to the input.
Upvotes: 0
Views: 977
Reputation: 121000
▶ str = 'zzzzzzzzz aaa
▷ bbbb ccccc yyyyyyyyy'
▶ re = "aaa bbbb ccccc"
▶ str.gsub /#{re.gsub(/ +/, '\s+')}/, re
#⇒ "zzzzzzzzz aaa bbbb ccccc yyyyyyyyy"
The general idea is to match any run of whitespace between the words, including \n, and to replace the whole match with the original string.
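For the Regexp.union case in the edit, the same idea might be sketched as follows. This is only a sketch under assumptions: the names build_merge_regex and merge_broken_phrases are hypothetical, and the pattern is joined by hand instead of calling Regexp.union, because Regexp.union escapes the spaces and leaves no room for \s+. Two things are worth knowing about the call in the question: "\s" in a double-quoted Ruby string is just a space character, and gsub! with a String pattern matches that string literally instead of treating it as a regex, which would explain why input = output. The block form of gsub avoids needing \1 at all.

```ruby
# Hypothetical helper: builds one regex matching any multi-word phrase,
# with each run of spaces in the phrase allowed to be any whitespace run.
def build_merge_regex(phrases)
  # Keep only phrases that actually contain a space (without mutating the input).
  multiword = phrases.select { |phrase| phrase.include?(" ") }
  patterns = multiword.map do |phrase|
    # Escape each word so characters like "." are matched literally,
    # then join the words with \s+ (single quotes keep the backslash).
    phrase.split(/ +/).map { |word| Regexp.escape(word) }.join('\s+')
  end
  Regexp.new(patterns.join("|"))
end

# Hypothetical helper: collapses whitespace inside every matched phrase
# to a single space, leaving the rest of the text untouched.
def merge_broken_phrases(text, phrases)
  text.gsub(build_merge_regex(phrases)) { |match| match.gsub(/\s+/, " ") }
end

phrases = ["aaa bbbb ccccc", "B. B.", "single"]
text = "zzzzzzzzz aaa\nbbbb ccccc yyyyyyyyy B.\nB."
puts merge_broken_phrases(text, phrases)
# prints "zzzzzzzzz aaa bbbb ccccc yyyyyyyyy B. B."
```

Because every phrase in the array uses single spaces between words, collapsing the whitespace inside a match reproduces the original phrase, so there is no need to look up which array element matched.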
Upvotes: 2