Reputation: 13
I want to check if the first character of first word is capitalized, if not, then capitalize it leaving rest of the string's casing intact.
I've tried the following method:
import re
p = re.compile(r'((?<=[\.\?!]\s)(\w+)|(^\w+))')
def cap(match):
return(match.group().capitalize())
p.sub(cap, 'CBH1 protein was analyzed. to evaluate the report')
>>> Output : Cbh1 protein was analyzed. To evaluate the report
While this output is nearly correct, I expect the initial Cbh1 to be CBH1. Any suggestions?
Upvotes: 1
Views: 1276
Reputation: 7880
You can use [a-z]\w*
instead of \w+
. In this way you will match only the strings that really need to be capitalized.
I'd also fix the regex in this way:
(?:(?<=[.?!]\s)|^)[a-z]\w*
Upvotes: 0
Reputation: 1728
This isn't a perfect match for any of the builtin str
methods: .title()
and .capitalize()
will lowercase all letters after the first, while .upper()
will capitalize everything.
It's not elegant, but you can just explicitly capitalize the first letter and append the rest:
def cap(match):
grp = match.group()
return grp[0].upper() + grp[1:]
>>> p.sub(cap, 'CBH1 protein was analyzed. to evaluate the report')
'CBH1 protein was analyzed. To evaluate the report'
Upvotes: 2