Reputation: 2216
How would I do the following? I tried doing this with gsub, but I can't figure out an approach that stays efficient when the strings_to_highlight array is large. Cheers!
string = "Roses are red, violets are blue"
strings_to_highlight = ['red', 'blue']
# ALGORITHM HERE
resulting_string = "Roses are (red), violets are (blue)"
Upvotes: 2
Views: 994
Reputation: 160549
I'd use:
string = "Roses are red, violets are blue"
strings_to_highlight = ['red', 'blue']
string.gsub(/\b(#{Regexp.union(strings_to_highlight).source})\b/) { |s| "(#{s})" } # => "Roses are (red), violets are (blue)"
Here's how it breaks down:
/\b(#{Regexp.union(strings_to_highlight).source})\b/ # => /\b(red|blue)\b/
It's important to use source when embedding a pattern. Without it, the result is:
/\b(#{Regexp.union(strings_to_highlight)})\b/ # => /\b((?-mix:red|blue))\b/
and that (?-mix:...) part can cause problems if you don't understand what it means in regex-ese. The Regexp documentation explains the flags, but embedding a pattern without source can lead to a really hard-to-diagnose bug if you're not aware of the problem.
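As a minimal sketch of how that bug can show up, assume you also want case-insensitive matching and add an /i flag to the outer pattern (the /i flag and the variable names here are illustrative, not part of the answer above). The embedded (?-mix:...) group turns case-insensitivity back off for the alternation, so "Red" is silently skipped:

```ruby
words = ['red', 'blue']
union = Regexp.union(words)            # => /red|blue/

# Interpolating the Regexp itself embeds its flags as (?-mix:...),
# which switches case-insensitivity back off inside the group.
with_flags = /\b(#{union})\b/i

# Interpolating only the source lets the outer /i flag apply.
with_source = /\b(#{union.source})\b/i

flags_result  = "Roses are Red".gsub(with_flags)  { |s| "(#{s})" }
source_result = "Roses are Red".gsub(with_source) { |s| "(#{s})" }

puts flags_result   # Roses are Red
puts source_result  # Roses are (Red)
```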
\b tells the engine to match whole words, not substrings. Without it you could end up with:
string = "Fred, bluette"
strings_to_highlight = ['red', 'blue']
string.gsub(/(#{Regexp.union(strings_to_highlight).source})/) { |s| "(#{s})" }
# => "F(red), (blue)tte"
Using a block with gsub allows us to perform calculations on the matched values.
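For instance, the block can transform each match before wrapping it. This is only an illustrative sketch; upcasing the match is an arbitrary choice and not something the question asked for:

```ruby
string = "Roses are red, violets are blue"
words = ['red', 'blue']
pattern = /\b(#{Regexp.union(words).source})\b/

# The block receives each matched word and returns its replacement.
shouted = string.gsub(pattern) { |s| "(#{s.upcase})" }

puts shouted  # Roses are (RED), violets are (BLUE)
```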
Upvotes: 1
Reputation: 110685
I suggest using the form of String#gsub that employs a hash for making substitutions.
strings_to_highlight = ['red', 'blue']
First construct the hash.
h = strings_to_highlight.each_with_object({}) do |s,h|
h[s] = "(#{s})"
ss = "#{s[0].swapcase}#{s[1..-1]}"
h[ss] = "(#{ss})"
end
#=> {"red"=>"(red)", "Red"=>"(Red)", "blue"=>"(blue)", "Blue"=>"(Blue)"}
Next define a default proc for it:
h.default_proc = ->(h,k) { k }
so that if h does not have a key k, h[k] returns k (e.g., h["cat"] #=> "cat").
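To see why the default proc matters, here is a small sketch comparing a hash with and without one (the hash contents and variable names are made up for illustration). When a matched word is not a key, gsub substitutes the hash's default value, which for a plain hash is nil and so becomes an empty string:

```ruby
plain = { "red" => "(red)" }

# Without a default, words missing from the hash are replaced by "".
without_default = "Roses are red".gsub(/[[:alpha:]]+/, plain)

# With a default proc that returns the key, unknown words pass through.
with_default = plain.dup
with_default.default_proc = ->(_hash, key) { key }
kept = "Roses are red".gsub(/[[:alpha:]]+/, with_default)

p without_default  # "  (red)"
p kept             # "Roses are (red)"
```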
Ready to go!
string = "Roses are Red, violets are blue"
string.gsub(/[[:alpha:]]+/, h)
#=> "Roses are (Red), violets are (blue)"
This should be relatively efficient as only one pass through the string is needed and hash lookups are very fast.
Upvotes: 2
Reputation: 10004
Regexp has a helpful union function for combining regular expressions. Stick with regexp until you have a performance problem:
string = "Roses are red, violets are blue"
strings_to_highlight = ['red', 'blue']
def highlight(str, words)
matcher = Regexp.union words.map { |w| /\b(#{Regexp.escape(w)})\b/ }
str.gsub(matcher) { |word| "(#{word})" }
end
puts highlight(string, strings_to_highlight)
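The Regexp.escape call matters when a word contains regex metacharacters. A small sketch, repeating the highlight method so it runs on its own; the word "a.b" and the unescaped pattern are hypothetical inputs chosen to show the difference:

```ruby
def highlight(str, words)
  matcher = Regexp.union words.map { |w| /\b(#{Regexp.escape(w)})\b/ }
  str.gsub(matcher) { |word| "(#{word})" }
end

# Escaped: the dot is literal, so only "a.b" is highlighted.
escaped = highlight("a.b and axb", ["a.b"])

# Unescaped, for comparison: the dot matches any character.
unescaped = "a.b and axb".gsub(/\ba.b\b/) { |word| "(#{word})" }

puts escaped    # (a.b) and axb
puts unescaped  # (a.b) and (axb)
```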
Upvotes: 5
Reputation: 1885
strings_to_highlight = ['red', 'blue']
string = "Roses are red, violets are blue"
strings_to_highlight.each { |i| string.gsub!(/\b#{Regexp.escape(i)}\b/, "(#{i})") }
Upvotes: 4