Reputation: 3844
I am currently doing a bunch of processing on a string using regular expressions with gsub(), but I'm chaining the calls quite heavily, which is starting to get messy. Can you help me construct a single regex for the following:
string.gsub(/\.com/,'').gsub(/\./,'').gsub(/&/,'and').gsub(' ','-').gsub("'",'').gsub(",",'').gsub(":",'').gsub("#39;",'').gsub("*",'').gsub("amp;",'')
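Basically, the above removes ".com", periods, apostrophes, commas, colons, asterisks, and the entity fragments "#39;" and "amp;". It also replaces "&" with "and" and spaces with "-".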
Is there an easier way to do this?
Upvotes: 0
Views: 166
Reputation: 14619
Building on Tim's answer:
You can pass a block to String.gsub, so you could combine them all, if you wanted:
string.gsub(/\.com|[.,:*& ']/) do |sub|
  case sub
  when '&'
    'and'
  when ' '
    '-'
  else
    ''
  end
end
Or, building off echoback's answer, you could use a translation hash in the block (you may need to call translations.default = '' to get this working):
string.gsub(/\.com|[.,:*& ']/) {|sub| translations[sub]}
The biggest perk of using a block is that you only need one call to gsub (not the fastest function ever).
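For reference, here is a minimal runnable sketch of the hash-in-a-block idea. The names TRANSLATIONS and slugify and the sample input are made up for illustration; the regex is the one from the snippets above, so it does not handle the "#39;" and "amp;" fragments from the question.

```ruby
# Hypothetical lookup table. Hash.new('') gives every unlisted match
# an empty-string replacement, i.e. the match is deleted.
TRANSLATIONS = Hash.new('')
TRANSLATIONS['&'] = 'and'
TRANSLATIONS[' '] = '-'

# Hypothetical helper: a single gsub pass, replacement looked up per match.
def slugify(string)
  string.gsub(/\.com|[.,:*& ']/) { |match| TRANSLATIONS[match] }
end

puts slugify("Tom & Jerry's site.com")  # prints "Tom-and-Jerrys-site"
```

Because ".com" comes first in the alternation, it is matched as a whole before the character class gets a chance to match the lone ".".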
Hope this helps!
Upvotes: 0
Reputation: 2869
A translation table is more scalable as you add more options:
translations = Hash.new
translations['.com'] = ''
translations['&'] = 'and'
...
# gsub returns a new string rather than modifying the original, so reassign the result
translations.each { |from, to| string = string.gsub(from, to) }
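A self-contained sketch of the table-driven loop, assuming a small made-up table (TABLE) and a hypothetical helper name (apply_table). String keys are matched literally by gsub, so '.com' needs no escaping here.

```ruby
# Hypothetical table; entries are applied in insertion order.
TABLE = {
  '.com' => '',
  '&'    => 'and',
  ' '    => '-'
}

# Hypothetical helper: runs one gsub per table entry and keeps each result.
def apply_table(string, table = TABLE)
  result = string
  table.each { |from, to| result = result.gsub(from, to) }
  result
end

puts apply_table("shop & save.com")  # prints "shop-and-save"
```

Note that this still calls gsub once per entry, and the order of entries matters whenever one replacement produces text that a later entry matches.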
Upvotes: 1
Reputation: 14154
You can combine the ones that remove characters:
string.gsub(/\.com|[.,:*]/,'')
The pipe | means "or". The right side of the "or" is a character class; it means "one of these characters".
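A quick way to see the alternation and the character class working together, using a made-up sample string and hypothetical names (REMOVE, strip_chars):

```ruby
# Either the literal ".com", or any single one of . , : *
REMOVE = /\.com|[.,:*]/

# Hypothetical helper: deletes every match of REMOVE.
def strip_chars(string)
  string.gsub(REMOVE, '')
end

puts strip_chars("example.com: a, b.*")  # prints "example a b"
```

Inside the square brackets, "." and "*" are literal characters, so they do not need escaping there.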
Upvotes: 3