Reputation: 2218
I have this string:
@string = "Hello.My email is [email protected] and my name is James."
I want to add a space specifically between periods and capital letters. I want to change @string
to:
"Hello. My email is [email protected] and my name is James."
I have the following code:
@string.scan(/.[A-Z]/)
# => [".M"]
Upvotes: 0
Views: 223
Reputation: 6098
You could use gsub
@string = "Hello.My email is [email protected] and my name is James."
@string.gsub!(/(\.)([A-Z])/, '\1 \2')
Output:
"Hello. My email is [email protected] and my name is James."
Update:
Another good way to do it is with a positive lookahead, thanks to @CarySwoveland for suggesting it:
@string = "Hello.My email is [email protected] and my name is James."
@string.gsub(/\.(?=[A-Z])/, '. ')
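As a rough sketch of how the two approaches above compare: both patterns should give the same result on the sample string. The variable names below are illustrative only and are not part of the original answer. The sketch also shows that gsub returns a new string, while gsub! modifies the receiver in place and returns nil when nothing was replaced.

```ruby
text = "Hello.My email is [email protected] and my name is James."

# Capture-group version: re-insert the dot and the capital around a space.
with_groups = text.gsub(/(\.)([A-Z])/, '\1 \2')

# Lookahead version: only the dot is consumed, the capital is just checked.
with_lookahead = text.gsub(/\.(?=[A-Z])/, '. ')

# gsub is non-destructive, so `text` itself is unchanged at this point.

# gsub! mutates its receiver; work on a copy to keep `text` intact.
copy = text.dup
result = copy.gsub!(/\.(?=[A-Z])/, '. ')

# gsub! returns nil when no substitution was performed.
no_change = "no match here".dup.gsub!(/\.(?=[A-Z])/, '. ')
```

If the goal is to update @string itself, either use gsub! or assign the result of gsub back to the variable.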
Upvotes: 1
Reputation: 627087
To match a literal . you need to use an escaped dot (\.). You also need to use gsub, not scan, as you need to perform a replace operation.
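A small sketch of the difference, using a made-up sample string (not the one from the question) chosen so that the escaped and unescaped patterns disagree:

```ruby
# Hypothetical sample text, for illustration only.
sample = "Hi.There Bob"

# Unescaped dot: matches ANY character followed by a capital letter.
loose = sample.scan(/.[A-Z]/)    # => [".T", " B"]

# Escaped dot: matches only a literal period followed by a capital letter.
strict = sample.scan(/\.[A-Z]/)  # => [".T"]

# scan only reports the matches; gsub returns a new string with them replaced.
fixed = sample.gsub(/\.([A-Z])/, '. \1')  # => "Hi. There Bob"
```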
Use
s = "Hello.My email is [email protected] and my name is James."
s = s.gsub(/\.\K(?=[[:upper:]])/, ' ')
See the Ruby demo. A capturing group variation that still allows consecutive matches:
s = s.gsub(/(\.)(?=[[:upper:]])/, '\1 ')
Or a lookbehind-based one:
s = s.gsub(/(?<=\.)(?=[[:upper:]])/, ' ')
Details
\. - a literal dot
\K - a match reset operator ((?<=\.) is equal to \.\K in functionality)
(?=[[:upper:]]) - a positive lookahead that requires the presence of an uppercase letter immediately to the right of the current location.
In the capturing group based pattern, (\.) forms Group 1 and \1 inserts the value back when replacing.
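To make the equivalence concrete, here is a minimal sketch that runs the three variants on the sample string and collects the results. The constant and variable names are illustrative assumptions, not part of the answer.

```ruby
input = "Hello.My email is [email protected] and my name is James."

# Each entry pairs a pattern from the answer with its replacement string.
VARIANTS = {
  match_reset:   [/\.\K(?=[[:upper:]])/,     ' '],   # \K drops the dot from the match
  capture_group: [/(\.)(?=[[:upper:]])/,     '\1 '], # dot is captured and re-inserted
  lookbehind:    [/(?<=\.)(?=[[:upper:]])/,  ' ']    # zero-width match between dot and capital
}.freeze

results = VARIANTS.transform_values do |(pattern, replacement)|
  input.gsub(pattern, replacement)
end

# All three variants should produce the same string.
all_equal = results.values.uniq.size == 1
```

In all three cases the uppercase letter is only checked by the lookahead and never consumed, so it does not have to be re-inserted in the replacement.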
Here is a way to leave abbreviations like U.S. untouched:
s = "Hello.My email is [email protected] and my name is M.B.S James."
rx = /(\b[[:upper:]](?:\.[[:upper:]])+)\b|\.([[:upper:]])/
puts s.gsub(rx) { |m|
m == $~[1] ? $~[1] : ". #{$~[2]}"
}
Here,
\b([[:upper:]](?:\.[[:upper:]])+)\b - a single uppercase letter followed with 1 or more sequences of a dot and an uppercase letter, as a whole word, captured into Group 1
| - or
\.([[:upper:]]) - a dot and the uppercase letter captured into Group 2.
If Group 1 matches, $~[1] (Group 1 value) is inserted back, else a dot, a space and the Group 2 value are used for replacement. Note that $~ is the match data object currently in use inside gsub, and $~[N] is Group N value.
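The same idea can be sketched as a reusable method. The method name space_after_periods and the constant name are hypothetical, and Regexp.last_match is used here in place of $~ (they refer to the same match data object).

```ruby
# Group 1: a dotted run of single capitals (e.g. M.B.S); Group 2: a capital after a dot.
ABBREVIATION_OR_DOT = /(\b[[:upper:]](?:\.[[:upper:]])+)\b|\.([[:upper:]])/

# Hypothetical helper: adds a space after a period that precedes a capital
# letter, but keeps dotted abbreviations as they are.
def space_after_periods(text)
  text.gsub(ABBREVIATION_OR_DOT) do
    match = Regexp.last_match
    # If Group 1 took part in the match, return it unchanged;
    # otherwise rebuild the text as dot + space + capital letter.
    match[1] || ". #{match[2]}"
  end
end

with_initials = space_after_periods("Hello.My name is M.B.S James.")
with_country  = space_after_periods("Born in the U.S.Today")
untouched     = space_after_periods("nothing to change here")
```

Note that this is only a heuristic: a sentence that really does end in a one-letter word followed directly by a capital would be treated as an abbreviation.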
Upvotes: 1