Reputation: 33655

Format string by adding a space before each uppercase letter

I have a string: HotelCityClass. I want to add a space in-between each Uppercase letter (apart from the first). i.e. Hotel City Class.

I have tried using re

re.sub(r'[A-Z]', '', str_name)

But this only replaces each uppercase. Is re the correct, fast approach?

Upvotes: 2

Answers (5)

Nsquare

Reputation: 91

This should do your job

re.sub(r"(\w)([A-Z])", r"\1 \2", "HotelCityClass")
>>> 'Hotel City Class'

Upvotes: 0

Remi Guan

Reputation: 22292

Here is a clear way to do this:

import re
a = 'HotelCityClass'
b = re.findall('[A-Z][a-z]*', a)

c = ' '.join(b)

print(c)

Upvotes: 1

Avinash Raj

Reputation: 174776

Another one through non-word boundary \B which matches between two word characters and two non-word characters.

>>> s = 'HotelCityClass'
>>> re.sub(r'\B([A-Z])', r' \1', s)
'Hotel City Class'
>>> re.sub(r'\B(?=[A-Z])', r' ', s)
'Hotel City Class'

Upvotes: 3

Wiktor Stribiżew

Reputation: 627082

If you have to deal with CaMeL words, you can use the following regex:

([a-z])([A-Z])

It captures a lowercase letter and the following uppercase one and then in the replacement, we can add the back-references to the captured groups (\1 and \2).

import re
p = re.compile(r'([a-z])([A-Z])')
test_str = "HotelCityClass"
result = re.sub(p, r"\1 \2", test_str)
print(result)

See IDEONE demo

Note that in case you want to just insert a space before any capitalized word that is not preceded with a whitespace, I'd use

p = re.compile(r'(\S)([A-Z])')
result = re.sub(p, r"\1 \2", test_str)

See another IDEONE demo

I would not use any look-aheads here since they are always hampering performance (although in this case, the impact is too small).

Upvotes: 3

anubhava

Reputation: 785541

You can use lookahead regex:

import re
regex = re.compile(ur'(?!^)(?=[A-Z])', re.MULTILINE)
str = u"HotelCityClass"

result = re.sub(regex, " ", str)

Output:

Hotel City Class

RegEx Demo

RegEx Breakup:

(?!^)      # negative lookahead to assert that we are not at start
(?=[A-Z])  # positive lookahead to assert that next position is an uppercase letter

Replacement is just by a space if above assertions pass.

Upvotes: 2

Format string by adding a space before each uppercase letter

Answers (5)

Related Questions