Reputation: 245
I'm having a little trouble with a regular expression. I have strings in the following two patterns:
- "Emily Watson (abril de 1897-)"
- "Emaa William (california)"
I need to write a regex that extracts only "Emily Watson" from the first string and the whole string "Emaa William (california)" from the second.
Basically, my regex should omit the parenthesized text, along with the parentheses, if it follows the pattern "month de year". So far I have tried:
(?'NAME'[\w]+\s*[\w]+\s*\([\w]+(?![\w]+\s*de\s*\d{4}-)\))
The regex above works fine for the second string, "Emaa William (california)", but not for the first, "Emily Watson (abril de 1897-)".
For "Emily Watson (abril de 1897-)" I do not get the name, Emily Watson, at all.
Can anyone please help me exclude the parenthesized part of the first string?
Upvotes: 1
Views: 126
Reputation: 91518
Have a try with this one:
(?<NAME>.+\s\(\w+\)|.+\s(?=\(\w+\sde\s\d{4}-\)))
It returns:
Emily Watson
Emaa William (california)
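For anyone wanting to check this outside .NET, here is a minimal sketch in Python. It assumes Python's `re` module, which spells named groups `(?P<NAME>...)`; the rest of the pattern is unchanged. The helper `extract_name` is a hypothetical name used only for this illustration. Note that the second alternative consumes the whitespace before the opening parenthesis, so the sketch strips the captured value.

```python
import re

# Same pattern as above, with the named group written in Python's syntax.
PATTERN = re.compile(r"(?P<NAME>.+\s\(\w+\)|.+\s(?=\(\w+\sde\s\d{4}-\)))")

def extract_name(text):
    """Return the NAME group, or None when the pattern does not match."""
    match = PATTERN.search(text)
    if match is None:
        return None
    # The second alternative ends with \s, so the raw capture for a
    # "month de year" string carries a trailing space; strip it here.
    return match.group("NAME").strip()

for sample in ("Emily Watson (abril de 1897-)", "Emaa William (california)"):
    print(repr(extract_name(sample)))
# prints 'Emily Watson'
# prints 'Emaa William (california)'
```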
Upvotes: 2
Reputation: 4078
You should swap the negative lookahead with the match, so that the lookahead comes right after the opening parenthesis:
(?'NAME'[\w]+\s*[\w]+\s*\((?![\w]+\s*de\s*\d{4}-)[\w]+\))
This way, you first check that there is no "month de year" pattern between the parentheses, and then match what is between them. Your version first matched everything up to the closing parenthesis and only then checked that no "month de year" pattern appeared in the bit that was left.
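A rough way to try this in Python (again assuming the `re` module and its `(?P<NAME>...)` spelling for named groups) is sketched below. As written, the swapped pattern still requires a parenthesized word, so it matches the second string in full and produces no match at all for the first one. The `OPTIONAL` pattern is not part of this answer: it is a hypothetical variation that makes the parenthesized part optional, so that the bare name is captured when the lookahead rejects the "month de year" text. The helper `name_or_none` is likewise an illustrative name.

```python
import re

# The swapped pattern from this answer, in Python's named-group syntax.
SWAPPED = re.compile(r"(?P<NAME>[\w]+\s*[\w]+\s*\((?![\w]+\s*de\s*\d{4}-)[\w]+\))")

# Hypothetical variation (an assumption, not from the answer): the
# parenthesized part is optional and is skipped when the lookahead rejects it.
OPTIONAL = re.compile(r"(?P<NAME>\w+\s+\w+(?:\s*\((?!\w+\s*de\s*\d{4}-)\w+\))?)")

def name_or_none(pattern, text):
    """Return the NAME group of the first match, or None if there is none."""
    match = pattern.search(text)
    return match.group("NAME") if match else None

first = "Emily Watson (abril de 1897-)"
second = "Emaa William (california)"

print(name_or_none(SWAPPED, first))    # prints None
print(name_or_none(SWAPPED, second))   # prints Emaa William (california)
print(name_or_none(OPTIONAL, first))   # prints Emily Watson
print(name_or_none(OPTIONAL, second))  # prints Emaa William (california)
```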
Upvotes: 1