Reputation: 33655
I have a string: HotelCityClass. I want to add a space before each uppercase letter (apart from the first), i.e. Hotel City Class.
I have tried using re:
re.sub(r'[A-Z]', '', str_name)
But this only replaces each uppercase letter (here with an empty string) instead of inserting a space before it. Is re the correct, fast approach?
Upvotes: 2
Views: 3439
Reputation: 91
This should do the job:
>>> import re
>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "HotelCityClass")
'Hotel City Class'
Upvotes: 0
Reputation: 22292
Here is a clear way to do this:
import re
a = 'HotelCityClass'
b = re.findall('[A-Z][a-z]*', a)
c = ' '.join(b)
print(c)
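One caveat worth illustrating: findall keeps only the pieces that match the pattern, so anything that is not an uppercase letter followed by lowercase letters is silently dropped. A minimal sketch, where split_with_findall is a hypothetical helper name and the second input is made up for illustration:

```python
import re

def split_with_findall(s):
    # Collect every "uppercase letter + following lowercase letters" chunk
    # and join the chunks with spaces. Text outside those chunks is lost.
    return ' '.join(re.findall('[A-Z][a-z]*', s))

print(split_with_findall('HotelCityClass'))   # Hotel City Class
print(split_with_findall('hotelCity2Class'))  # City Class ('hotel' and '2' are dropped)
```

So this approach fits best when the input is known to be pure CamelCase.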
Upvotes: 1
Reputation: 174776
Another option uses the non-word boundary \B, which matches between two word characters or between two non-word characters.
>>> import re
>>> s = 'HotelCityClass'
>>> re.sub(r'\B([A-Z])', r' \1', s)
'Hotel City Class'
>>> re.sub(r'\B(?=[A-Z])', r' ', s)
'Hotel City Class'
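To see why \B leaves the first letter alone, here is a small sketch. The helper name add_spaces and the second input are assumptions made for illustration only:

```python
import re

def add_spaces(s):
    # \B matches only where both neighbours are word characters (or both are
    # non-word characters), so an uppercase letter at the very start of the
    # string or right after a space is not touched.
    return re.sub(r'\B([A-Z])', r' \1', s)

print(add_spaces('HotelCityClass'))   # Hotel City Class
print(add_spaces('Hotel CityClass'))  # Hotel City Class (no double space)
```

Input that already contains some spaces is therefore not given extra ones.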
Upvotes: 3
Reputation: 627082
If you have to deal with CaMeL words, you can use the following regex:
([a-z])([A-Z])
It captures a lowercase letter and the uppercase letter that follows it; in the replacement, we use back-references to the captured groups (\1 and \2) with a space between them.
import re
p = re.compile(r'([a-z])([A-Z])')
test_str = "HotelCityClass"
result = re.sub(p, r"\1 \2", test_str)
print(result)
Note that if you just want to insert a space before any uppercase letter that is preceded by a non-whitespace character, I'd use:
p = re.compile(r'(\S)([A-Z])')
result = re.sub(p, r"\1 \2", test_str)
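The two patterns differ on input where an uppercase letter follows something other than a lowercase letter. A quick sketch of that difference, with variable names and the digit-containing input chosen here purely for illustration:

```python
import re

camel = re.compile(r'([a-z])([A-Z])')       # lowercase letter, then uppercase
non_space = re.compile(r'(\S)([A-Z])')      # any non-whitespace, then uppercase

sample = "Hotel2CityClass"  # made-up input with a digit before an uppercase letter

print(camel.sub(r"\1 \2", sample))      # Hotel2City Class
print(non_space.sub(r"\1 \2", sample))  # Hotel2 City Class
```

Which one is preferable depends on whether digits and other symbols should count as the end of a word.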
I would not use look-aheads here, since they can add matching overhead (although in this case the impact is negligible).
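If performance matters for your data, it is easy to measure rather than guess. A rough sketch using timeit, where the repeated sample string and the repetition counts are arbitrary assumptions; results will vary by machine and Python version:

```python
import re
import timeit

sample = "HotelCityClass" * 200  # arbitrary longer input for timing

with_groups = re.compile(r'([a-z])([A-Z])')
with_lookahead = re.compile(r'\B(?=[A-Z])')

# Both approaches should produce the same text for pure CamelCase input.
out_groups = with_groups.sub(r"\1 \2", sample)
out_lookahead = with_lookahead.sub(" ", sample)

t_groups = timeit.timeit(lambda: with_groups.sub(r"\1 \2", sample), number=200)
t_lookahead = timeit.timeit(lambda: with_lookahead.sub(" ", sample), number=200)

print("capture groups:", t_groups)
print("lookahead:     ", t_lookahead)
```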
Upvotes: 3
Reputation: 785541
You can use a lookahead regex:
import re
# Match the empty position before an uppercase letter, except at the start of a line
regex = re.compile(r'(?!^)(?=[A-Z])', re.MULTILINE)
s = "HotelCityClass"
result = re.sub(regex, " ", s)
print(result)
Output:
Hotel City Class
RegEx breakdown:
(?!^) # negative lookahead to assert that we are not at start
(?=[A-Z]) # positive lookahead to assert that next position is an uppercase letter
Each position where both assertions pass is replaced with a single space.
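The same breakdown can be kept inside the pattern itself with re.VERBOSE. This is only a sketch: it drops re.MULTILINE, so ^ here means the start of the whole string, and the variable name pattern is my own:

```python
import re

pattern = re.compile(r"""
    (?!^)       # not at the start of the string
    (?=[A-Z])   # the next character is an uppercase letter
""", re.VERBOSE)

# Both assertions are zero-width, so the match is an empty position
# and the substitution simply inserts a space there.
print(pattern.sub(" ", "HotelCityClass"))  # Hotel City Class
```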
Upvotes: 2