itx

Reputation: 1419

Replace string with regex and replace if some words match with array

I have a string:

str = "John: hey, what's your name?.. :haha \n Stella: :foo :xx: my name is ... stella :xx:"

I want to replace all smilies in the list ary = [":haha", ":xx:", ":foo", ":bar"] and special characters (except space) with (.*) so that it becomes like this:

John(.*) hey(.*) what(.*)s your name(.*) Stella(.*) my name is (.*) stella (.*)


I tried this:

str.gsub(Regexp.new("^#{ary.join('|')}$")) { |w| "(.*)" }.gsub( /[\W ]+/, "(.*)")
# => "John(.*)hey(.*)what(.*)s(.*)your(.*)name(.*)haha(.*)Stella(.*)my(.*)name(.*)is(.*)stella(.*)"

Problem:

Upvotes: 2

Views: 685

Answers (2)

Cary Swoveland

Reputation: 110755

You can do it like so:

s = "John: hey, what's your name?.. :haha \n Stella: :foo :xx: my name is ... stella :xx:"

r = /\?\.\. :haha \n|: :foo :xx:|\.\.\.|:xx:|[^\w ]/

s.gsub(r,'(.*)')
  #=> "John(.*) hey(.*) what(.*)s your name(.*) Stella(.*) my name is (.*) stella (.*)" 

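The only tricky bit is the order of the alternatives in the regex. Ruby tries them left to right, so the catch-all class [^\w ] has to come last: it matches a lone :, and if it were tried first it would consume the colon before the three longer alternatives that contain one had a chance to match.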
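To illustrate the ordering point, here is a minimal sketch that builds the pattern from the smiley list instead of hard-coding the substrings. The helper name mask_specials is hypothetical (it is not part of the answer above), and treating a run of smilies and special characters separated only by spaces as one unit is an assumption chosen to reproduce the expected output from the question.

```ruby
# Hypothetical helper, not from the answer above: derives the regex from the
# smiley list rather than spelling out each substring by hand.
def mask_specials(str, smilies)
  smiley = Regexp.union(smilies)   # escapes each entry and joins them with |
  token  = /#{smiley}|[^\w ]/      # smilies FIRST, then any other special char
  # One token, optionally followed by more tokens separated only by spaces,
  # is replaced as a single unit so no consecutive placeholders appear.
  run    = /(?:#{token})(?: *(?:#{token}))*/
  str.gsub(run, '(.*)')
end

# Why the order matters: with the catch-all first, the colon is consumed alone.
wrong = "a :haha b".gsub(/[^\w ]|:haha/, '(.*)')   #=> "a (.*)haha b"
right = "a :haha b".gsub(/:haha|[^\w ]/, '(.*)')   #=> "a (.*) b"

s   = "John: hey, what's your name?.. :haha \n Stella: :foo :xx: my name is ... stella :xx:"
ary = [":haha", ":xx:", ":foo", ":bar"]
puts mask_specials(s, ary)
  #=> John(.*) hey(.*) what(.*)s your name(.*) Stella(.*) my name is (.*) stella (.*)
```

Regexp.union escapes regex metacharacters in the list entries, so smilies such as ":)" would not need manual escaping.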

Upvotes: 0

Wiktor Stribiżew

Reputation: 627607

I tried to come up with a more generic approach and ended up with three steps. Since it seems impossible to avoid producing multiple consecutive (.*) in the first two passes, I added a third gsub as a post-processing step that collapses them:

str = "John: hey, what's your name?.. :haha \n Stella: :foo :xx: my name is ... stella :xx:"
ary = [":haha", ":xx:", ":foo", ":bar"]
print str.gsub(Regexp.new("#{ary.join('|')}")) { |w| "(.*)" }.gsub( /(?>\(\.\*\)|[^\w ]+)/, "(.*)").gsub(/\(\.\*\)(?>\s*\(\.\*\))*/,"(.*)")

Output of a sample program:

John(.*) hey(.*) what(.*)s your name(.*) Stella(.*) my name is (.*) stella (.*)
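For readers who want to see what each pass contributes, the same three regexes can be split into named steps. This is only a sketch: the method name three_step_mask and the PLACEHOLDER constant are made up for illustration, and Regexp.union is used in place of joining the list by hand, which is an equivalent choice for this particular list.

```ruby
PLACEHOLDER = '(.*)'

# Hypothetical wrapper around the three gsub calls; returns every
# intermediate string so the effect of each pass can be inspected.
def three_step_mask(str, smilies)
  # Step 1: replace each smiley from the list with the placeholder.
  step1 = str.gsub(Regexp.union(smilies), PLACEHOLDER)

  # Step 2: replace runs of special characters. The first alternative
  # re-matches placeholders already present, so their "(", ".", "*" and ")"
  # are not themselves treated as special characters.
  step2 = step1.gsub(/(?>\(\.\*\)|[^\w ]+)/, PLACEHOLDER)

  # Step 3: collapse placeholders separated only by whitespace into one.
  step3 = step2.gsub(/\(\.\*\)(?>\s*\(\.\*\))*/, PLACEHOLDER)

  [step1, step2, step3]
end

str = "John: hey, what's your name?.. :haha \n Stella: :foo :xx: my name is ... stella :xx:"
ary = [":haha", ":xx:", ":foo", ":bar"]

three_step_mask(str, ary).each_with_index do |s, i|
  puts "step #{i + 1}: #{s.inspect}"
end
```

Printing with inspect keeps the newline visible as \n in the first step, which makes it easier to see that the second pass replaces it as a special character.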

Upvotes: 1
