match a regular expression with optional lookahead

I have the following strings:

NAME John Nash FROM California

NAME John Nash

I want a regular expression capable of extracting 'John Nash' for both strings.

Here is what I tried:

"NAME(.*)(?:FROM)"
"NAME(.*)(?:FROM)?"
"NAME(.*?)(?:FROM)?"

but none of these works for both strings.

Upvotes: 5

Answers (4)

Mayur Koshti

Reputation: 1852

You can do this without regex:

>>> myStr = "NAME John Nash FROM California"
>>> myStr.split("FROM")[0].replace("NAME","").strip()
'John Nash'
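
The same idea can be wrapped in a small function. This is only a sketch: `extract_name` is a hypothetical helper name, and it assumes every input starts with the literal keyword NAME and that FROM never occurs inside the name itself.

```python
def extract_name(text):
    """Return the text between a leading "NAME" and an optional "FROM"."""
    # partition() splits at the first "FROM"; if there is none,
    # head is simply the whole string.
    head, _, _ = text.partition("FROM")
    head = head.strip()
    # Drop only the leading keyword, not every occurrence of "NAME".
    if head.startswith("NAME"):
        head = head[len("NAME"):]
    return head.strip()


print(extract_name("NAME John Nash FROM California"))  # John Nash
print(extract_name("NAME John Nash"))                  # John Nash
```

Unlike `replace("NAME", "")`, slicing off the prefix leaves any later occurrence of NAME inside the string untouched.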

Upvotes: 0

Pedro Lobito

Reputation: 98921

Make the second part of the string optional (?: FROM.*?)?, i.e.:

NAME (.*?)(?: FROM.*?)?$

MATCH 1
1.  [5-14]  `John Nash`
MATCH 2
1.  [37-46] `John Nash`
MATCH 3
1.  [53-66] `John Doe Nash`

Regex Demo
https://regex101.com/r/bL7kI2/2
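
For reference, a minimal Python sketch of the same pattern, checked against both strings from the question. The helper name `extract_name` is an assumption made for illustration; only the pattern comes from this answer.

```python
import re

# Pattern from the answer: a lazy capture followed by an optional " FROM ..." tail,
# anchored at the end of the string.
PATTERN = re.compile(r"NAME (.*?)(?: FROM.*?)?$")


def extract_name(text):
    """Return the captured name, or None if the text does not match."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


print(extract_name("NAME John Nash FROM California"))  # John Nash
print(extract_name("NAME John Nash"))                  # John Nash
```

To process several lines held in one string, as in the demo, the pattern would presumably need the `re.MULTILINE` flag so that `$` also matches at each line end.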

Upvotes: 2

Kasravnd

Reputation: 107297

You can use alternation (logical OR) between FROM and the end-of-string anchor $:

NAME(.*?)(?:FROM|$)

See demo https://regex101.com/r/rR3gA0/1

In this case, after the name, the pattern matches either FROM or the end of the string. In your regex, by contrast, FROM is optional, so for the first string the capture group matches everything after NAME.
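
A minimal Python sketch of the alternation idea follows. It uses a lazy `.*?`, since a greedy `.*` would run to the end of the string and satisfy the `$` alternative before FROM is ever considered. The helper name `extract_name` and the `strip()` call are assumptions added for illustration.

```python
import re

# Lazy capture: stop at the first FROM, or at the end of the string if there is none.
PATTERN = re.compile(r"NAME(.*?)(?:FROM|$)")


def extract_name(text):
    """Return the name with surrounding whitespace removed, or None."""
    match = PATTERN.search(text)
    # The raw capture keeps the spaces around the name, so strip them.
    return match.group(1).strip() if match else None


print(extract_name("NAME John Nash FROM California"))  # John Nash
print(extract_name("NAME John Nash"))                  # John Nash
```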

If you want a more general regex, it is better to build the pattern around the shapes your names can take. For example, if you are sure that every name consists of two words, you can use the following regex:

NAME\s(\w+\s\w+)

Demo https://regex101.com/r/kV2eB9/2
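
A short Python sketch of the two-word pattern, including a three-word name to show where the two-word assumption stops holding. The helper name `first_two_words` is hypothetical.

```python
import re

# Two-word pattern from the answer: exactly two \w+ runs separated by one whitespace.
TWO_WORDS = re.compile(r"NAME\s(\w+\s\w+)")


def first_two_words(text):
    """Return the two words following NAME, or None if there is no match."""
    match = TWO_WORDS.search(text)
    return match.group(1) if match else None


print(first_two_words("NAME John Nash FROM California"))  # John Nash
print(first_two_words("NAME John Nash"))                  # John Nash
print(first_two_words("NAME John Doe Nash"))              # John Doe
```

The last call shows the trade-off: a name with more than two words is silently truncated.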

Upvotes: 5

LetzerWille

Reputation: 5658

r'^\w+\s+(\w+\s+\w+)' matches a word at the start of the string,
followed by one or more whitespace characters, and then captures
two words separated by at least one whitespace character.

import re

with open('data', 'r') as f:
    for line in f:
        # group(1) holds the two words that follow the leading keyword
        mo = re.search(r'^\w+\s+(\w+\s+\w+)', line)
        if mo:
            print(mo.group(1))

John Nash
John Nash

Upvotes: 0
