Reputation: 4004
I want to replace the following reserved characters with spaces:
+ - & | ! ( ) { } [ ] ^ " ~ * ? : \
This is my code, but it doesn't work. Did I miss anything?
keyword = keyword.gsub(/\\+-&\\|!\\(\\)\\{\\}\\[\\]\\^"~\\*\\?:\\\\/, ' ')
Upvotes: 4
Views: 1349
Reputation: 160551
Here's a benchmark showing the speed difference between gsub and tr:
require 'benchmark'
require 'pp'
STR = '+ - & | ! ( ) { } [ ] ^ " ~ * ? : \\'
LONG_STR = STR * 1_000
N = 1_000
puts `ruby -v`
pp STR.gsub(/[+&|!(){}\[\]^"~*:?\\-]/, ' ')
pp STR.tr('-+&|!(){}[]^"~*?:\\', ' ')
Benchmark.bm(5) do |b|
b.report('gsub') { N.times { LONG_STR.gsub(/[+&|!(){}\[\]^"~*:?\\-]/, ' ') } }
b.report('tr') { N.times { LONG_STR.tr('-+&|!(){}[]^"~*?:\\', ' ') } }
end
And the output:
ruby 1.8.7 (2012-02-08 patchlevel 358) [universal-darwin12.0]
" "
" "
            user     system      total        real
gsub   13.300000   0.190000  13.490000 ( 13.524779)
tr      0.080000   0.010000   0.090000 (  0.090045)
ruby 1.9.3p392 (2013-02-22 revision 39386) [x86_64-darwin12.2.0]
" "
" "
            user     system      total        real
gsub   17.890000   0.040000  17.930000 ( 18.016657)
tr      0.270000   0.000000   0.270000 (  0.283021)
ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-darwin12.2.0]
" "
" "
            user     system      total        real
gsub    7.310000   0.020000   7.330000 (  7.361403)
tr      0.140000   0.010000   0.150000 (  0.145816)
It's interesting that 1.8.7 out-performed 1.9.3. I suspect it's because of the addition of multibyte character support in 1.9+.
I've done several benchmarks with 2.0 and have been very happy with the speed improvements I've seen.
Upvotes: 10
Reputation: 54984
This is what tr is for:
keyword.tr '-+&|!(){}[]^"~*?:\\', " "
#=> " "
Upvotes: 10
Reputation: 8169
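\W matches any non-word character, that is, anything other than a letter, digit, or underscore. Note that this is broader than your list: it also replaces other punctuation such as . , and @.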
>> keyword = '+ - & | ! ( ) { } [ ] ^ " ~ * ? : \\'
=> "+ - & | ! ( ) { } [ ] ^ \" ~ * ? : \\"
>> keyword.gsub!(/\W/," ")
=> " "
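Before relying on \W, it may be worth comparing it with an explicit list on input that contains other punctuation. This is only a sketch; the sample string and the LISTED_ONLY constant are assumptions made for illustration.

```ruby
# LISTED_ONLY is a hypothetical constant holding just the characters from the question.
LISTED_ONLY = /[+\-&|!(){}\[\]^"~*?:\\]/

sample = 'user@example.com'

# \W replaces '@' and '.', which are not among the reserved characters.
puts sample.gsub(/\W/, ' ')          # prints: user example com

# The explicit character class leaves them untouched.
puts sample.gsub(LISTED_ONLY, ' ')   # prints: user@example.com
```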
Upvotes: 0
Reputation: 132197
Just do this.
keyword.gsub!(/[+\-&|!(){}\[\]^"~*?:\\]/, " ")
Check:
>> keyword = '+ - & | ! ( ) { } [ ] ^ " ~ * ? : \\'
=> "+ - & | ! ( ) { } [ ] ^ \" ~ * ? : \\"
>> keyword.gsub!(/[+\-&|!(){}\[\]^"~*?:\\]/, " ")
=> " "
Character classes (enclosed by []) are easier to reason about in this case. You need to escape -, [, ], and \.
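If you'd rather not work out the escaping by hand, one option is to let Regexp.escape do it and wrap the result in brackets. This is a sketch, not the only way to do it. The names RESERVED, RESERVED_CLASS and blank_reserved_re are made up for illustration.

```ruby
# Hypothetical names for illustration: RESERVED, RESERVED_CLASS, blank_reserved_re.
# The reserved characters from the question, written as a single-quoted string
# ('\\' is one backslash).
RESERVED = '+-&|!(){}[]^"~*?:\\'

# Regexp.escape backslash-escapes the regex metacharacters (including - [ ] \ ^),
# so the escaped string can be wrapped in [...] to form a character class.
RESERVED_CLASS = Regexp.new("[#{Regexp.escape(RESERVED)}]")

def blank_reserved_re(keyword)
  # Replace each single reserved character with one space.
  keyword.gsub(RESERVED_CLASS, ' ')
end

puts blank_reserved_re('a+b')   # prints: a b
```

The pattern in the question has no brackets, so it is read as a sequence of tokens rather than a set of single characters. A character class is what makes each listed character match on its own.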
Upvotes: 4