Reputation: 384
For a string like
s = "(string1) this is text (string2) that's separated (string3)"
I need a way to remove all the parentheses and the text inside them. However, if I use the following, it returns an empty string:
s.gsub(/\(.*\)/, "")
What can I use to get the following?
" this is text that's separated "
Upvotes: 0
Views: 126
Reputation: 110755
You could do the following:
s.gsub(/\(.*?\)/,'')
# => " this is text  that's separated "
The ? in the regex makes the match "non-greedy". Without it, if:
s = "A: (string1) this is text (string2) that's separated (string3) B"
then
s.gsub(/\(.*\)/,'')
#=> "A:  B"
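To see the two behaviours side by side, here is a small sketch using the same string. The variable names (greedy, lazy, tidy) are only illustrative, and the squeeze call is an optional extra step, not part of the answer above, for collapsing the doubled spaces that the removals leave behind.

```ruby
s = "A: (string1) this is text (string2) that's separated (string3) B"

# Greedy: .* runs to the LAST ")" in the string, so everything
# between the first "(" and the last ")" is removed in one match.
greedy = s.gsub(/\(.*\)/, '')

# Non-greedy: .*? stops at the FIRST ")" it can, so each
# parenthesised group is matched and removed separately.
lazy = s.gsub(/\(.*?\)/, '')

# Optional: collapse the runs of spaces left where groups were removed.
tidy = lazy.squeeze(' ')

puts greedy.inspect
puts lazy.inspect
puts tidy.inspect
```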
Edit: I ran the following benchmarks for various methods. You will see that there is one important take-away.
require 'benchmark'

n = 10_000_000
s = "(string1) this is text (string2) that's separated (string3)"

Benchmark.bm do |bm|
  bm.report 'sawa' do
    n.times { s.gsub(/\([^()]*\)/,'') }
  end
  bm.report 'cary' do
    n.times { s.gsub(/\(.*?\)/,'') }
  end
  bm.report 'cary1' do
    n.times { s.split(/\(.*?\)/).join }
  end
  bm.report 'sawa1' do
    n.times { s.split(/\([^()]*\)/).join }
  end
  # gsub! mutates s in place, so after the first iteration
  # there is nothing left to remove
  bm.report 'sawa!' do
    n.times { s.gsub!(/\([^()]*\)/,'') }
  end
  bm.report 'user1179871' do
    n.times { s.gsub(/\([\w\s]*\)/, '') }
  end
end
                  user     system      total        real
sawa         37.110000   0.070000  37.180000 ( 37.182598)
cary         37.000000   0.060000  37.060000 ( 37.066398)
cary1        35.960000   0.050000  36.010000 ( 36.009534)
sawa1        36.450000   0.050000  36.500000 ( 36.503711)
sawa!         7.630000   0.000000   7.630000 (  7.632278)
user1179871  38.500000   0.150000  38.650000 ( 38.666955)
I ran the benchmark several times and the results varied a fair bit. In some cases sawa was slightly faster than cary.
[Edit: I added a modified version of @user1179871's method to the benchmark above, but did not change any of the text of my answer. The modification is described in a comment on @user1179871's answer. It looks to be slightly slower than sawa and cary, but that may not be the case, as the benchmark times vary from run to run, and I did a separate benchmark of the new method.]
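When reading the sawa! row, it may help to remember that gsub! modifies its receiver and returns nil when nothing was replaced, so repeated calls on the same string only do real work once. The sketch below illustrates that and shows one possible way to time the in-place variant on a fresh copy each iteration. The names original, first and second, the 'dup + gsub!' label, and the smaller n are assumptions for illustration, not part of the benchmark above, and the dup itself adds some cost.

```ruby
require 'benchmark'

original = "(string1) this is text (string2) that's separated (string3)"
pattern  = /\([^()]*\)/

s = original.dup
first  = s.gsub!(pattern, '')  # substitutions made: returns s itself
second = s.gsub!(pattern, '')  # nothing left to match: returns nil

n = 100_000
Benchmark.bm(12) do |bm|
  # Non-destructive version, as in the 'sawa' row.
  bm.report('gsub') { n.times { original.gsub(pattern, '') } }
  # In-place version applied to a fresh copy every time, so each
  # iteration still has parentheses to remove.
  bm.report('dup + gsub!') { n.times { original.dup.gsub!(pattern, '') } }
end
```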
Upvotes: 4
Reputation: 168269
Cary's answer is the simple way. This answer is the efficient way.
s.gsub(/\([^()]*\)/, "")
Keep in mind: non-greedy matching requires backtracking, and in general it is better not to use it if you can avoid it. But for such a simple task, Cary's answer is good enough.
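Beyond speed, the two patterns can also give different results. A minimal sketch follows, where the nested string is a made-up input that is not from the question: on the flat input both patterns agree, but with nested parentheses the non-greedy pattern stops at the first ")" while the negated character class only matches the innermost group.

```ruby
flat   = "(string1) this is text (string2) that's separated (string3)"
nested = "a (b (c) d) e"   # hypothetical input with nested parentheses

lazy    = /\(.*?\)/        # "(" up to the nearest ")"
negated = /\([^()]*\)/     # "(" ... ")" with no parentheses in between

flat_lazy    = flat.gsub(lazy, '')
flat_negated = flat.gsub(negated, '')

nested_lazy    = nested.gsub(lazy, '')     # removes "(b (c)"
nested_negated = nested.gsub(negated, '')  # removes only "(c)"

puts flat_lazy == flat_negated
puts nested_lazy.inspect
puts nested_negated.inspect
```

Neither pattern handles nesting fully in a single pass; which behaviour is preferable depends on the input you expect.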
Upvotes: 2