Mark R

Reputation: 27

Regex: how to separate strings by apostrophes in certain cases only

I am looking to capitalize the first letter of words in a string. I've managed to put together something by reading examples on here. However, I'm trying to get any names that start with O' to separate into 2 strings so that each gets capitalized. I have this so far:

\b([^\W_\d](?!')[^\s-]*) *

which skips the X' in any string X'XYZ. That works for capitalizing the part after the ', but it doesn't capitalize the X'. Furthermore, i'm becomes i'M, since the pattern isn't specific to O'. To state the goal:

o'malley should go to O'Malley
o'malley's should go to O'Malley's
don't should go to Don't
i'll should go to I'll

(As an aside, I want to omit any strings that start with numbers, like 23F; that seems to work with what I have.) How do I make it specific to the strings that start with O'? Thanks.

Upvotes: 1

Views: 53

Answers (1)

R Nar

Reputation: 5515

If you use the following pattern:

([oO])'([\w']+)|([\w']+)

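then re.findall returns one tuple per word, and you can read the pieces of each tuple (called groups here) by index:

groups[0] == 'o' and groups[1] == 'name'  # if the word is "o'name"
groups[2] == 'word'                       # if the word is "word"

Only one of the two alternatives matches for any given word, so the groups belonging to the other alternative are empty strings. For example, if the word is "word", then

groups[0] == groups[1] == ""

since there is no o' prefix.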

Test Example:

>>> import re
>>> string = "o'malley don't i'm hello world"
>>> match = re.findall(r"([oO])'([\w']+)|([\w']+)",string)
>>> match
[('o', 'malley', ''), ('', '', "don't"), ('', '', "i'm"), ('', '', 'hello'), ('', '', 'world')]

NOTE: This is for Python. It might not behave the same way in every regex engine.
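
To go from these tuples to the capitalized text the question asks for, one option is to pass the same pattern to re.sub with a replacement function. This is only a sketch: the helper names capitalize_word and capitalize_names are made up for illustration, and it assumes that only the first character of each piece should change, so a token such as 23F is left as it is.

```python
import re

# Same pattern as above: either an o' prefix plus the rest, or a plain word.
PATTERN = re.compile(r"([oO])'([\w']+)|([\w']+)")


def capitalize_word(m):
    """Replacement callback (hypothetical helper) for PATTERN.sub."""
    prefix, rest, word = m.groups()
    if prefix is not None:
        # o' name: upper-case the O and the first letter after the apostrophe.
        return "O'" + rest[:1].upper() + rest[1:]
    # Plain word: upper-case only the first character and keep the rest,
    # so "23F" stays "23F" and "don't" becomes "Don't".
    return word[:1].upper() + word[1:]


def capitalize_names(text):
    """Capitalize every word in text, treating an o' prefix specially."""
    return PATTERN.sub(capitalize_word, text)


print(capitalize_names("o'malley o'malley's don't i'll 23F"))
```

Note that inside the callback the groups that did not take part in the match are None, not the empty strings that re.findall reports, which is why the check is written as prefix is not None.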

Upvotes: 1
