Borja_042

Reputation: 1071

Regex for matching initials

I am looking for two regexes. The first needs to match expressions of this type: P. Parker, M. Jordan, or J. Guti

The second is much the same, but without the space between the initial and the surname: P.Parker, M.Jordan, or S.Gohan

I came across this solution, but it is not behaving as I expected:

re.sub("[A-Z].[A-z]+[a-z]", "Speaker",chain)

Thanks in advance

Upvotes: 1

Views: 136

Answers (2)

Wiktor Stribiżew

Reputation: 626804

I'd suggest

r'\b[A-Z]\.\s?[A-Z][a-z]+\b'

See the regex demo.

Details

  • \b - a word boundary
  • [A-Z] - an uppercase letter
  • \. - a dot
  • \s? - an optional whitespace
  • [A-Z][a-z]+ - an uppercase letter and then 1+ lowercase letters
  • \b - a word boundary
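If two separate patterns are really needed, as the question asks, one way is to split the optional `\s?` into two variants. This is only a sketch: the names `with_space` and `without_space` and the sample string are made up for illustration, and the first variant assumes exactly one literal space.

```python
import re

# Assumed variant 1: exactly one space between the initial and the surname
with_space = re.compile(r"\b[A-Z]\. [A-Z][a-z]+\b")
# Assumed variant 2: no whitespace between the initial and the surname
without_space = re.compile(r"\b[A-Z]\.[A-Z][a-z]+\b")

s = "P. Parker or M.Jordan"  # illustrative sample string
print(with_space.findall(s))     # ['P. Parker']
print(without_space.findall(s))  # ['M.Jordan']
```

The combined pattern with `\s?` is probably simpler if both forms should be treated the same way.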

See Python demo:

import re
s = " P. Parker or M. Jordan or J. Guti P.Parker or M.Jordan or S.Gohan "
print(re.findall(r"\b[A-Z]\.\s?[A-Z][a-z]+\b", s))
# => ['P. Parker', 'M. Jordan', 'J. Guti', 'P.Parker', 'M.Jordan', 'S.Gohan']
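Since the question uses `re.sub`, the same pattern can be dropped into it. A minimal sketch, assuming the "Speaker" replacement and the `chain` variable name from the question; the sample text itself is invented:

```python
import re

# Pattern from this answer: initial, dot, optional whitespace, capitalized surname
pattern = re.compile(r"\b[A-Z]\.\s?[A-Z][a-z]+\b")

# Hypothetical input, not taken from the question
chain = "P. Parker met M.Jordan yesterday."

result = pattern.sub("Speaker", chain)
print(result)  # Speaker met Speaker yesterday.
```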

Upvotes: 1

Leo Arad

Reputation: 4472

You can try

import re

s = " P. Parker or M. Jordan or J. Guti P.Parker or M.Jordan or S.Gohan "
print(re.findall(r"[A-Z]+\.\s?[a-zA-Z]*", s)) 

Output

['P. Parker', 'M. Jordan', 'J. Guti', 'P.Parker', 'M.Jordan', 'S.Gohan']

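The regex [A-Z]+\.\s?[a-zA-Z]* matches one or more uppercase letters, then a literal dot, then an optional single whitespace character, and finally zero or more ASCII letters. Because the last part uses *, it also matches an initial and dot with nothing after them.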
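To see how permissive this pattern is, it may help to run it next to a stricter one on text that contains a capital letter followed by a dot but no surname. The sketch below uses an invented sample string; the `strict` pattern is the one from the other answer in this thread, included only for comparison.

```python
import re

loose = re.compile(r"[A-Z]+\.\s?[a-zA-Z]*")
# Stricter pattern from the other answer, shown for comparison only
strict = re.compile(r"\b[A-Z]\.\s?[A-Z][a-z]+\b")

text = "Plan A. and P.Parker"  # illustrative sample string

loose_matches = loose.findall(text)
strict_matches = strict.findall(text)
print(loose_matches)   # ['A. and', 'P.Parker']
print(strict_matches)  # ['P.Parker']
```

Whether the extra match matters depends on the input; for text that only contains names in the two formats from the question, both patterns return the same list.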

Upvotes: 0
