Reputation: 659
I need to match only the names (shown in bold in the original post) in this string:
author={Trainor, Sarah F and Calef, Monika and Natcher, David and Chapin, F Stuart and McGuire, A David and Huntington, Orville and Duffy, Paul and Rupp, T Scott and DeWilde, La'Ona and Kwart, Mary and others},
Is there a way to exclude the words 'and' and 'others' from the match result?
I have tried many things, but nothing works as I expect:
(?<=\{).+?(?<=and\s).+(?=\})
Upvotes: 1
Views: 78
Reputation: 163207
You could make use of \G and a capturing group to get the matches.
The values are in capturing group 1.
(?:author={|\G(?!^))([^\s,]+,(?:\h+[^\s,]+)+)\h+and\h+(?=[^{}]*\})
About the pattern:
(?:
    Non-capturing group
author={
    Match literally
|
    Or
\G(?!^)
    Assert the position at the end of the previous match, not at the start of the string
)
    Close the non-capturing group
(
    Capture group 1
[^\s,]+,
    Match 1+ chars other than a whitespace char or a comma, then match a comma
(?:\h+[^\s,]+)+
    Repeat 1+ times: 1+ horizontal whitespace chars followed by 1+ chars other than a whitespace char or a comma
)
    Close group 1
\h+and\h+
    Match the word and with 1+ horizontal whitespace chars on either side
(?=[^{}]*\})
    Assert that a closing } follows on the right, with no other braces in between
Upvotes: 0
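Note that \G and \h are not available in every regex flavor; Python's built-in re module, for example, supports neither. As a rough alternative (a different technique from the \G pattern above, not a port of it), the same names can be collected in two steps: grab the text between the braces, then split on the word "and" and drop the trailing "others". The function name author_names and the variable names are only illustrative.

```python
import re

def author_names(entry):
    """Return the names inside author={...}, without 'and' / 'others'."""
    # Step 1: capture everything between the braces of the author field.
    m = re.search(r"author=\{([^{}]*)\}", entry)
    if m is None:
        return []
    # Step 2: split on the word "and" surrounded by whitespace,
    # so names that merely contain "and" are left intact.
    parts = re.split(r"\s+and\s+", m.group(1).strip())
    # Step 3: drop the BibTeX-style "others" placeholder.
    return [p for p in parts if p != "others"]

entry = ("author={Trainor, Sarah F and Calef, Monika and Natcher, David and "
         "Chapin, F Stuart and McGuire, A David and Huntington, Orville and "
         "Duffy, Paul and Rupp, T Scott and DeWilde, La'Ona and Kwart, Mary "
         "and others},")

names = author_names(entry)
print(names[0])    # Trainor, Sarah F
print(len(names))  # 10
```

This assumes a single author={...} field with no nested braces inside it.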
Reputation: 20737
Instead of excluding words, you may be better off with a pattern that expects a specific name format, matching the examples you've provided:
([A-Z]+[A-Za-z]*('[A-Za-z]+)*, [A-Z]? ?[A-Z]+[A-Za-z]*('[A-Za-z]+)*( [A-Z])?)
https://regex101.com/r/9LGqn3/3
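For what it's worth, this pattern uses no flavor-specific syntax, so it should also work as-is in Python's re module. A small sketch of how it might be applied to the string from the question (NAME_PATTERN and match_names are just illustrative names):

```python
import re

# The pattern from above, unchanged. Group 1 wraps the whole name.
NAME_PATTERN = re.compile(
    r"([A-Z]+[A-Za-z]*('[A-Za-z]+)*, "
    r"[A-Z]? ?[A-Z]+[A-Za-z]*('[A-Za-z]+)*( [A-Z])?)"
)

def match_names(text):
    # finditer + group(1) avoids the tuples that findall would
    # return for a pattern with several capturing groups.
    return [m.group(1) for m in NAME_PATTERN.finditer(text)]

text = ("author={Trainor, Sarah F and Calef, Monika and Natcher, David and "
        "Chapin, F Stuart and McGuire, A David and Huntington, Orville and "
        "Duffy, Paul and Rupp, T Scott and DeWilde, La'Ona and Kwart, Mary "
        "and others},")

found = match_names(text)
for name in found:
    print(name)
```

Because the pattern only accepts capitalised name parts, the lowercase words "and" and "others" never match. The trade-off is that names outside this format (hyphenated or accented ones, for instance) would need the character classes extended.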
Upvotes: 1