JJA

Reputation: 73

The regex pattern for the following string?

What is the regex pattern for the following string:

hi firstName lastName 27 Jun 2017

There should be 3 fields identified in the string: priority, name and date. So far, I have the following regex:

^(\w+)\s+(.*?)\s+

It identifies priority but not the full name. My regex identifies up to the firstName, not including the lastName.

Thanks in advance!

Upvotes: 2

Views: 67

Answers (2)

Ajax1234

Reputation: 71451

You can use re.findall():

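import re
s = "hello John Someone 27 June 2017"
# All tokens that end in a letter: ['hello', 'John', 'Someone', 'June'];
# dropping the first (priority) and last (month) leaves the name parts.
name = re.findall(r"\w+[a-zA-Z]+", s)[1:-1]
# The first word of the string is the priority.
priority = re.findall(r"^\w+", s)[0]
# Day, month and year separated by single whitespace characters.
date = re.findall(r"\d+\s\w+\s\d+", s)[0]
print(name)
print(priority)
print(date)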

Output:

['John', 'Someone']
hello
27 June 2017

Upvotes: 0

Wiktor Stribiżew

Reputation: 626825

Your regex does not extract the full name because, in the \s+(.*?)\s+ part, the first \s+ matches 1 or more whitespaces and (.*?) then captures any 0+ chars other than line break chars, as few as possible, up to the first run of 1+ whitespaces. That whitespace is found right after firstName, and since there are no more obligatory atoms to match after the final \s+, the match ends there.
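A quick way to see this, as a minimal sketch that runs the pattern from the question against its sample string:

```python
import re

s = "hi firstName lastName 27 Jun 2017"
# The original pattern: the lazy group stops at the first whitespace run.
m = re.match(r"^(\w+)\s+(.*?)\s+", s)
print(m.groups())  # ('hi', 'firstName')
```

The second group holds only firstName because nothing after the trailing \s+ forces the lazy quantifier to keep expanding.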

You may use

^(?P<priority>\w+)\s+(?P<name>.*?)\s+(?P<date>\d.*)

See the regex demo

Details

  • ^ - start of string (implicit if re.match is used)
  • (?P<priority>\w+) - Group "priority": 1+ word chars
  • \s+ - 1 or more whitespaces
  • (?P<name>.*?) - Group "name": any 0+ chars other than line break chars as few as possible
  • \s+ - 1 or more whitespaces
  • (?P<date>\d.*) - Group "date": a digit and then the rest of the line.

Python demo:

import re
rx = r"(?P<priority>\w+)\s+(?P<name>.*?)\s+(?P<date>\d.*)"
s = "hi firstName lastName 27 Jun 2017"
m = re.match(rx, s)
if m:
    print(m.group("priority")) # => hi
    print(m.group("name"))     # => firstName lastName
    print(m.group("date"))     # => 27 Jun 2017
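If several lines need parsing, the same pattern could be wrapped in a small helper. This is only a sketch: parse_entry is a hypothetical name, not part of the re module, and it assumes the name never contains a whitespace-separated token that starts with a digit, since the date group begins at the first such token.

```python
import re

# Same pattern as in the demo above, compiled once for reuse.
RX = re.compile(r"(?P<priority>\w+)\s+(?P<name>.*?)\s+(?P<date>\d.*)")

def parse_entry(line):
    """Return a dict of the three named groups, or None if the line does not match."""
    m = RX.match(line)
    return m.groupdict() if m else None

print(parse_entry("hi firstName lastName 27 Jun 2017"))
# {'priority': 'hi', 'name': 'firstName lastName', 'date': '27 Jun 2017'}
print(parse_entry("no date here"))  # None
```

groupdict() returns all named groups at once, which avoids repeating m.group(...) for each field.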

Upvotes: 3
