Reputation: 96484
I want 'This Is A 101 Test' to be 'This Is A Test', but I can't get the syntax right.
src = 'This Is A 101 Test'
puts "A) " + src # base => "This Is A 101 Test"
puts "B) " + src[/([a-z]+)/] # only does first word => "his"
puts "C) " + src.gsub!(/\D/, "") # Does digits, I want alphabetic => "101"
puts "D) " + src.gsub!(/\W///g) # Nothing. => ""
puts "E) " + src.gsub(/(\W|\d)/, "") # Nothing. => ""
Upvotes: 15
Views: 21658
Reputation: 7663
First off, you need to be careful with gsub and gsub!. The latter is "dangerous" (hence the !) and modifies src in place. If you're executing these statements in order, be aware that a.gsub!(/a/, "b") and a = a.gsub(/a/, "b") both leave a holding the replaced text. Part of the issue with your code is that src is being modified between steps.
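A minimal sketch of the difference. The helper names copy_replace and in_place_replace are made up for illustration and do not come from the question:

```ruby
# gsub returns a new string and leaves the receiver untouched.
def copy_replace(str)
  result = str.gsub(/\d/, "")
  [str, result]
end

# gsub! edits the receiver in place; it returns nil when nothing matched.
def in_place_replace(str)
  returned = str.gsub!(/\d/, "")
  [str, returned]
end

p copy_replace("A 101 Test".dup)      # => ["A 101 Test", "A  Test"]
p in_place_replace("A 101 Test".dup)  # => ["A  Test", "A  Test"]
p in_place_replace("no digits".dup)   # => ["no digits", nil]
```

The first element of each pair is the string that was passed in, as it looks after the call. Only the gsub! version changes it.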
The B method returns "his" but makes no changes to src:
src[/([a-z]+)/] # => "his"
src # => "This Is A 101 Test"
The C method removes every character that isn't a digit, and because it uses gsub!, it changes src itself:
src.gsub!(/\D/, "") # => "101"
src # => "101"
The D method doesn't work because the syntax is wrong. gsub takes a pattern (a regular expression or a string) to search for, followed by the string to use as the replacement; the trailing //g looks like sed or Perl syntax, which Ruby doesn't use. If you try it in IRB, it will act as though you need another / somewhere.
The E method removes all non-word characters (which includes spaces) and all digits:
src.gsub(/(\W|\d)/, "") # => "ThisIsATest" (the spaces are removed too)
src # => "This Is A 101 Test"
You point out that it's returning "". What's actually happening is that C and D as listed (with the syntax issue fixed) are destructive. (Also, if run on "101", D will actually return nil, as no substitutions are performed.) So E is just being run on "101", and since you're replacing all non-word characters and all digits with "", the result is "".
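To see the interaction, here is a sketch that replays steps C, D and E in order on a single string. It assumes D was meant to be gsub!(/\W/, ""), and replay_steps is a hypothetical name used only for this example:

```ruby
# Replays steps C, D and E from the question against one string.
# D is written as gsub!(/\W/, "") on the assumption that this was the intent.
def replay_steps(text)
  src = text.dup               # work on a copy so the caller's string survives
  c = src.gsub!(/\D/, "")      # destructive: src now holds only its digits
  d = src.gsub!(/\W/, "")      # no non-word characters remain, so this is nil
  e = src.gsub(/(\W|\d)/, "")  # removes the remaining digits
  [c, d, e]
end

p replay_steps("This Is A 101 Test")  # => ["101", nil, ""]
```

Note that the nil from D would also make "D) " + ... raise a TypeError, since nil cannot be concatenated to a String.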
The answer you're looking for would be something like:
src.gsub!(/\d\s?/, "") # => "This Is A Test"
src # => "This Is A Test"
And my favorite for dealing with all scenarios of double spaces (because squeeze is quite efficient at combining runs of the same character, strip is quite efficient at removing leading and trailing whitespace, and the ! versions of both return nil if they make no changes):
src = src.gsub(/\d+/, "").squeeze(" ").strip
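As a rough illustration of why the non-bang chain is the safer choice, the sketch below wraps the one-liner in a hypothetical helper named clean and then shows what the bang variants return on a string that needs no changes:

```ruby
# Non-bang chain: every call returns a String, so the chain never breaks.
def clean(text)
  text.gsub(/\d+/, "").squeeze(" ").strip
end

p clean("This Is A 101 Test")  # => "This Is A Test"
p clean("101 Leading digits")  # => "Leading digits"
p clean("No digits here")      # => "No digits here"

# Bang variants return nil when they change nothing, so chaining them is fragile.
already_clean = "No digits here".dup
p already_clean.squeeze!(" ")  # => nil
p already_clean.strip!         # => nil
```

Chaining something like squeeze!(" ").strip! on an already clean string would therefore raise NoMethodError on nil.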
Upvotes: 28
Reputation: 80065
No regexp:
src = 'This Is A 101 Test'
src.delete('^a-zA-Z ') # a leading ^ negates the set => "This Is A  Test" (two spaces remain)
Upvotes: 8
Reputation: 31428
Instead of listing the characters to remove, you can keep only the ones you want (letters and spaces) and then collapse runs of spaces:
src = 'This Is A 101 Test'
src.gsub(/[^a-zA-Z ]/,'').gsub(/ +/,' ')
=> "This Is A Test"
I recommend Rubular for trying out Ruby regular expressions.
Upvotes: 8
Reputation: 3217
Do you just want to delete numbers? If so, src.gsub(/\d/,"") should work. The reason it doesn't work above is that gsub! modifies the string it is called on, so after C, src = "101" and eliminating all digits leaves an empty string.
If you want to eliminate everything but alphabetic characters and whitespace (i.e. remove digits and punctuation), src.gsub(/(?=\S)(\d|\W)/,"") should work.
If you want to eliminate everything but alphabetic characters (eliminating spaces as well as digits and punctuation), src.gsub(/\d|\W/,"") should work.
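The three patterns can be compared side by side with a small sketch. The method names and the sample string (the question's text plus a trailing "!") are assumptions added for illustration:

```ruby
# Each method applies one of the three patterns; none mutates its argument.
def strip_digits(text)
  text.gsub(/\d/, "")
end

def keep_letters_and_whitespace(text)
  text.gsub(/(?=\S)(\d|\W)/, "")
end

def keep_letters_only(text)
  text.gsub(/\d|\W/, "")
end

sample = "This Is A 101 Test!"
p strip_digits(sample)                 # => "This Is A  Test!"
p keep_letters_and_whitespace(sample)  # => "This Is A  Test"
p keep_letters_only(sample)            # => "ThisIsATest"
```

The first two results still contain a double space where the number used to be; appending .squeeze(" ") collapses it. Also note that \W treats the underscore as a word character, so underscores survive all three patterns.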
Upvotes: 2
Reputation: 230306
Do you want to cut ' 101' from the string? Here's your regex:
src = 'This Is A 101 Test'
puts src.gsub(/ \d+/, '')
# => This Is A Test
Also, I don't understand why you are using the bang version of gsub. gsub! modifies the original string; gsub copies it and modifies the copy.
Upvotes: 4