getintoityuh

Reputation: 51

Regular Expression with Two Names: One With Middle Initial and One Without

I'm attempting to identify the names in this string, using regex.

Example text:

Elon R. Musk (245)436-7956 Jeff Bezos (235)231-3432

What I've tried so far only seems to work for names without a middle initial:

([A-Z]{1}[a-z]+) ([A-Z]{1}[a-z]+)

Here's an example of Python code using the re module:

import re

strr = 'Elon R. Musk (245)436-7956 Jeff Bezos (235)231-3432'

def gimmethenamesdammit(strr):
    regex = re.compile("([A-Z]{1}[a-z]+) ([A-Z]{1}[a-z]+)")
    print(regex.findall(strr))

gimmethenamesdammit(strr)

How can I modify the regular expression above so that it matches both names, Elon R. Musk and Jeff Bezos?

Desired output when running gimmethenamesdammit(strr):

gimmethenamesdammit(strr)

[('Elon', 'R.', 'Musk'), ('Jeff', 'Bezos')]

Upvotes: 1

Views: 84

Answers (2)

getintoityuh

Reputation: 51

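The following regular expression finds both names, although it truncates the first one to 'Elon R': \w+ cannot consume the period after the initial, so the match stops there and 'Musk' is dropped.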

import re

strr = 'Elon R. Musk (245)436-7956 Jeff Bezos (235)231-3432'

regex = r"[A-Z]\w+\s[A-Z]?\w+"

POCs = re.findall(regex, strr)

print(f"{POCs[0]}, {POCs[-1]}")  # Elon R, Jeff Bezos
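
For the tuple output requested in the question, one possible approach is to make the middle initial an optional group and drop it when it is absent. This is only a sketch: the names NAME and find_names are illustrative, and it assumes each name part is a capital letter followed by lowercase letters, as in the question's original pattern.

```python
import re

# First name, optional middle initial such as "R.", then last name.
# The non-capturing wrapper (?: ... )? makes the initial and its
# trailing space optional as a unit.
NAME = re.compile(r"([A-Z][a-z]+) (?:([A-Z]\.) )?([A-Z][a-z]+)")

def find_names(text):
    # An unmatched optional group comes back as None, so filter it
    # out to get 2-tuples for names without a middle initial.
    return [tuple(part for part in m.groups() if part)
            for m in NAME.finditer(text)]

strr = 'Elon R. Musk (245)436-7956 Jeff Bezos (235)231-3432'
print(find_names(strr))
# [('Elon', 'R.', 'Musk'), ('Jeff', 'Bezos')]
```

finditer is used instead of findall here because findall would return an empty string in place of the missing initial, giving ('Jeff', '', 'Bezos').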

Upvotes: 1

star67

Reputation: 1832

Try this: \b([^\s*][a-zA-Z_\.\s]+)\b

Demo: https://regex101.com/r/7ul1pQ/1

  1. \b...\b -- word boundary
  2. [^\s*][a-zA-Z_\.\s]+ -- one character that is not whitespace or an asterisk, followed by one or more letters, underscores, dots or whitespace characters
  3. () -- captured group
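
To try the pattern above from Python rather than regex101, a minimal check against the question's example string might look like this. The variable names are illustrative, and each name is returned as one whole string rather than as the tuple of parts the question asks for.

```python
import re

strr = 'Elon R. Musk (245)436-7956 Jeff Bezos (235)231-3432'

# Same pattern as in the answer, written as a raw string.
pattern = re.compile(r"\b([^\s*][a-zA-Z_\.\s]+)\b")

# With a single capturing group, findall returns a list of strings.
names = pattern.findall(strr)
print(names)
# ['Elon R. Musk', 'Jeff Bezos']
```

The trailing \b forces each match to end on a word character, which is what trims the space before the opening parenthesis of the phone number.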

Upvotes: 1
