Brad

Reputation: 12262

Find each piece of data in a string using a regular expression

I need to extract several pieces of data from each of the lines below. Each line will be processed individually.

Here are four lines that should cover every variation of the data that needs to be analyzed:

// lines to be analyzed
Chuck Norris (M) - 12/1/2009 (5 years)
Rocky Joseph Balboa (M) - 2/26/2012 (2 years)
Mary-Jane Smith (F) - 03/12/2012 (6 years)
Patricia Howser-Silverstine (F) 5/04/2009 (11 years)

// data to be extracted
First name: Chuck Last name: Norris Gender: M Birthdate: 12/1/2009
First name: Rocky Last name: Joseph Balboa Gender: M Birthdate: 2/26/2012
First name: Mary-Jane Last name: Smith Gender: F Birthdate: 03/12/2012
First name: Patricia Last name: Howser-Silverstine Gender: F Birthdate: 5/04/2009

I want to capture the first name, last name, gender, and birthdate from each line using a regular expression. Each piece of data will be stored in a variable and later inserted into a database table. I need a regular expression (or a list of them) that finds each of these pieces of data.

Any help is appreciated.

Upvotes: 1

Views: 29

Answers (2)

Zac

Reputation: 1009

@anubhava's answer is correct and meets the OP's requirement.

If middle names need to be matched (or thrown away), this variation uses an optional capture group to do the trick:

^(?<fname>[\p{L}-]+)\h+(?:(?<mname>[\p{L}-]+)\h+)?(?<lname>[\p{L}\h-]+?)\h+\((?<gender>[MF])\)[-\h]+(?<dob>[\d/]+)

Demo: https://regex101.com/r/gB2cE3/4
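As a rough sketch of how this variation might be used from PHP: the `~` delimiters and the `u` flag are borrowed from @anubhava's answer and are an assumption here, and `parseLine` is a hypothetical helper name. When the optional middle-name group does not take part in the match, `preg_match` reports it as an empty string by default, so the sketch converts that to `null`.

```php
<?php
// Variation with an optional middle-name group; delimiters and the "u" flag are assumed.
$re = '~^(?<fname>[\p{L}-]+)\h+(?:(?<mname>[\p{L}-]+)\h+)?(?<lname>[\p{L}\h-]+?)\h+\((?<gender>[MF])\)[-\h]+(?<dob>[\d/]+)~u';

// Hypothetical helper: returns the captured fields, or null if the line does not match.
function parseLine(string $re, string $line): ?array
{
    if (preg_match($re, $line, $m) !== 1) {
        return null;
    }
    return [
        'fname'  => $m['fname'],
        // An unmatched optional group comes back as '' by default; normalize to null.
        'mname'  => $m['mname'] !== '' ? $m['mname'] : null,
        'lname'  => $m['lname'],
        'gender' => $m['gender'],
        'dob'    => $m['dob'],
    ];
}

$rocky = parseLine($re, 'Rocky Joseph Balboa (M) - 2/26/2012 (2 years)');
$chuck = parseLine($re, 'Chuck Norris (M) - 12/1/2009 (5 years)');

echo $rocky['mname'], ' / ', $rocky['lname'], "\n"; // Joseph / Balboa
var_dump($chuck['mname']);                          // NULL
```

Note that with this pattern "Joseph" is captured as a middle name, so the last name for that line becomes "Balboa" rather than "Joseph Balboa" as listed in the question.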

Upvotes: 1

anubhava

Reputation: 784998

You can use this regex to capture all these values:

$re = '~^(?<fname>[\p{L}-]+)\h+(?<lname>[\p{L}\h-]+?)\h+\((?<gender>[MF])\)[-\h]+(?<dob>[\d/]+)~mu';

RegEx Demo
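A minimal sketch of applying this pattern line by line, as the question describes. The `$lines` and `$people` variable names are illustrative assumptions, and the sample lines are the four from the question. The named groups (`fname`, `lname`, `gender`, `dob`) show up as string keys in the match array.

```php
<?php
// Pattern copied from the answer above.
$re = '~^(?<fname>[\p{L}-]+)\h+(?<lname>[\p{L}\h-]+?)\h+\((?<gender>[MF])\)[-\h]+(?<dob>[\d/]+)~mu';

// The four sample lines from the question.
$lines = [
    'Chuck Norris (M) - 12/1/2009 (5 years)',
    'Rocky Joseph Balboa (M) - 2/26/2012 (2 years)',
    'Mary-Jane Smith (F) - 03/12/2012 (6 years)',
    'Patricia Howser-Silverstine (F) 5/04/2009 (11 years)',
];

$people = [];
foreach ($lines as $line) {
    // preg_match returns 1 on a match; lines that do not match are skipped.
    if (preg_match($re, $line, $m) !== 1) {
        continue;
    }
    $people[] = [
        'fname'  => $m['fname'],
        'lname'  => $m['lname'],
        'gender' => $m['gender'],
        'dob'    => $m['dob'],
    ];
}

foreach ($people as $p) {
    printf(
        "First name: %s Last name: %s Gender: %s Birthdate: %s\n",
        $p['fname'], $p['lname'], $p['gender'], $p['dob']
    );
}
```

Each entry in `$people` could then be bound to a prepared INSERT statement. The birthdate is captured as the raw string from the line, so it would likely need converting to the database's date format first.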

Upvotes: 6
