Reputation: 6209

How to extract first and last name from full name

I have a regex that, given the full name, is supposed to capture the first and last name. It should exclude the suffix, like "Jr.":

(.+)\s(.+(?!\sJr\.))

But applied to the string "Larry Farry Barry Jones Jr.", this regex gives the match:

  1.    Larry Farry Barry Jones
  2.    Jr.

Why is my negative lookahead failing to ignore the "Jr." when parsing the full name? I want match #2 to contain "Jones".

Upvotes: 1

Answers (3)

stema

Reputation: 92976

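The reason is that .+ is greedy. The first .+ takes everything up to the last whitespace, the second .+ then consumes "Jr." right to the end of the string, and only then is the lookahead checked. At that position nothing follows, so there is no " Jr." ahead (we are already at the end) and the negative lookahead succeeds: the regex matches.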
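To see this behaviour directly, here is a minimal sketch that runs the pattern from the question against the sample string. The variable names are only illustrative.

```ruby
full_name = "Larry Farry Barry Jones Jr."

# The pattern from the question: the lookahead is only checked after
# the second .+ has already consumed "Jr." up to the end of the string.
pattern = /(.+)\s(.+(?!\sJr\.))/

match = pattern.match(full_name)
first_group  = match[1]
second_group = match[2]

puts first_group   # Larry Farry Barry Jones
puts second_group  # Jr.
```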

So the problem is the pattern itself. This would be better:

\S+(?:\s(?!Jr\.)\S+)*

See it here on Regexr

This means:

\S+ matches a run of one or more non-whitespace characters.

(?:\s(?!Jr\.)\S+)* is a non-capturing group: it matches a whitespace character and then, if what follows is not "Jr.", the next run of non-whitespace characters. The whole group can be repeated 0 or more times.
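As a rough sketch of how this pattern could be used in Ruby to get the first and last name: the constant name NAME_WITHOUT_SUFFIX and the split step are my own additions for illustration, not part of the pattern itself.

```ruby
# Hypothetical constant name; the regex is the one shown above.
NAME_WITHOUT_SUFFIX = /\S+(?:\s(?!Jr\.)\S+)*/

full_name = "Larry Farry Barry Jones Jr."

# String#[] with a regex returns the matched substring (or nil).
name = full_name[NAME_WITHOUT_SUFFIX]

parts = name.split
first_name = parts.first
last_name  = parts.last

puts name        # Larry Farry Barry Jones
puts first_name  # Larry
puts last_name   # Jones
```

The pattern only knows about "Jr."; other suffixes would need to be added to the lookahead.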

Upvotes: 1

Morgan

Reputation: 20514

As a comment mentions, it is the first .+ that matches most of the string. The use of lookahead seems incorrect here, as you do not want to return that value and do not need it to be included in a further match.

The following splits the name into words without returning the 'Jr.', so you can take the first and last results.

(\w+\s)+?(?!\sJr\.)

I recommend Rubular for practicing Ruby RegExp.
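A small sketch of how the results might be collected with String#scan, assuming the sample string from the question. Each captured word keeps its trailing space, so the strip call is my addition.

```ruby
full_name = "Larry Farry Barry Jones Jr."

# scan returns one array per match, holding the captured group.
# Each capture is a word plus its trailing whitespace, e.g. "Larry ".
tokens = full_name.scan(/(\w+\s)+?(?!\sJr\.)/).flatten.map(&:strip)

first_name = tokens.first
last_name  = tokens.last

puts first_name  # Larry
puts last_name   # Jones
```

Note that a word is only captured when whitespace follows it, so with this sample "Jr." is skipped because it ends the string, and a last name at the very end of a string without a suffix would be missed too.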

Upvotes: 1

sawa

Reputation: 168091

Rather than trying to do it with a single regex, I think the following would be more maintainable code.

full_name = "Larry Farry Barry Jones Jr."
name_parts = full_name.split - ["Jr."]
first_name, last_name = name_parts[0], name_parts[-1]
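If more than one suffix has to be ignored, the same idea could be wrapped in a small method. This is only a sketch: the SUFFIXES list and the method name first_and_last are assumptions made up for the example, not a complete list of real-world suffixes.

```ruby
# Assumed, illustrative list of suffixes to ignore.
SUFFIXES = ["Jr.", "Sr.", "II", "III"].freeze

# Hypothetical helper: returns [first_name, last_name].
def first_and_last(full_name)
  # Array difference drops every word that appears in SUFFIXES.
  name_parts = full_name.split - SUFFIXES
  [name_parts.first, name_parts.last]
end

first_name, last_name = first_and_last("Larry Farry Barry Jones Jr.")
puts first_name  # Larry
puts last_name   # Jones
```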

Upvotes: 2
