Reputation: 1071
I am looking for two regexes. The first needs to match expressions of this type: P. Parker or M. Jordan or J. Guti
The second is much the same, but without the space between the initial and the surname: P.Parker or M.Jordan or S.Gohan
I came up with this solution, but it is not behaving as I expected:
re.sub("[A-Z].[A-z]+[a-z]", "Speaker",chain)
Thanks in advance
Upvotes: 1
Views: 136
Reputation: 626804
I'd suggest
r'\b[A-Z]\.\s?[A-Z][a-z]+\b'
See the regex demo.
Details
- \b - a word boundary
- [A-Z] - an uppercase letter
- \. - a dot
- \s? - an optional whitespace
- [A-Z][a-z]+ - an uppercase letter and then 1+ lowercase letters
- \b - a word boundary
See Python demo:
import re
s = " P. Parker or M. Jordan or J. Guti P.Parker or M.Jordan or S.Gohan "
print(re.findall(r"\b[A-Z]\.\s?[A-Z][a-z]+\b", s))
# => ['P. Parker', 'M. Jordan', 'J. Guti', 'P.Parker', 'M.Jordan', 'S.Gohan']
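Since the question uses re.sub, here is a minimal sketch of the same pattern applied as a replacement, plus one way to split it into the two separate regexes that were asked for. The sample string and the names NAME, WITH_SPACE and NO_SPACE are made up for illustration. Note that the attempt in the question leaves the dot unescaped, so it matches any character, and [A-z] also covers the punctuation characters that sit between Z and a in ASCII.

```python
import re

# Pattern from this answer: initial, dot, optional whitespace, capitalized surname.
NAME = re.compile(r"\b[A-Z]\.\s?[A-Z][a-z]+\b")

# Hypothetical sample input; the question calls its input string "chain".
chain = "P. Parker met M.Jordan yesterday."

# Replace every matched name with the literal text "Speaker".
result = NAME.sub("Speaker", chain)
print(result)  # Speaker met Speaker yesterday.

# If two separate patterns are wanted (assumed variants of the pattern above):
WITH_SPACE = re.compile(r"\b[A-Z]\. [A-Z][a-z]+\b")  # exactly one space after the dot
NO_SPACE = re.compile(r"\b[A-Z]\.[A-Z][a-z]+\b")     # no space after the dot

print(WITH_SPACE.findall(chain))  # ['P. Parker']
print(NO_SPACE.findall(chain))    # ['M.Jordan']
```

The combined pattern with \s? is usually enough on its own; the split versions only matter if the two forms need different handling.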
Upvotes: 1
Reputation: 4472
You can try
import re
s = " P. Parker or M. Jordan or J. Guti P.Parker or M.Jordan or S.Gohan "
print(re.findall(r"[A-Z]+\.\s?[a-zA-Z]*", s))
Output
['P. Parker', 'M. Jordan', 'J. Guti', 'P.Parker', 'M.Jordan', 'S.Gohan']
The regex [A-Z]+\.\s?[a-zA-Z]*
matches one or more uppercase letters followed by a literal .
then an optional single whitespace character, and then zero or more ASCII letters.
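One thing to be aware of: because [A-Z]+ allows several uppercase letters and [a-zA-Z]* may match lowercase words or nothing at all, this pattern is looser than one anchored with word boundaries. A small sketch comparing it with the stricter \b[A-Z]\.\s?[A-Z][a-z]+\b pattern from the other answer, on a made-up sample string:

```python
import re

loose = re.compile(r"[A-Z]+\.\s?[a-zA-Z]*")
strict = re.compile(r"\b[A-Z]\.\s?[A-Z][a-z]+\b")

# Hypothetical sample text chosen to show the difference.
text = "Made in USA. by P. Parker"

loose_hits = loose.findall(text)
strict_hits = strict.findall(text)

print(loose_hits)   # ['USA. by', 'P. Parker']
print(strict_hits)  # ['P. Parker']
```

Whether the extra matches matter depends on the input; for text that only contains names in the two expected forms, both patterns give the same result.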
Upvotes: 0