Reputation: 7920

Regex match multiple initial tokens with common following token

Given a string like:

ASSUME @pete, @grey and @matt_c ARE really tall

is there a way I can use regex to extract:

MATCH 1
1.  `@pete`
2.  `really tall`
MATCH 2
1.  `@grey`
2.  `really tall`
MATCH 3
1.  `@matt_c`
2.  `really tall`

Further, is there a way I can do it with the @ being optional for each of them?

Constraints: The syntax must be of the form ASSUME [names] ARE [statement] where:

[names] consists of one or more [name]s separated by `,`, a space, `&` or `and`
[name] consists of alphanumeric + underscores or dashes

Happy to answer any questions relating to setup. A starting point with the example strings I'm trying to make work can be found here: http://regex101.com/r/fS9oK5/4

Upvotes: 0

Answers (3)

alpha bravo

Reputation: 7948

A slight variation on the accepted answer: here the first sub-pattern (the name) is actually consumed by the match instead of being captured inside a lookahead.

(@[\w-]+)(?=.*ARE\s(.+))

Demo
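
For illustration, here is a minimal sketch of running that pattern with Python's built-in `re` module. The sample string is the one from the question; the names `PATTERN`, `text` and `matches` are mine, not part of the answer. Because the name is consumed and only the `ARE ...` part sits in the lookahead, each match advances past one name while group 2 repeats the shared statement.

```python
import re

# Group 1 is consumed (the @name); group 2 is captured inside the lookahead,
# so the text after "ARE" can be reused by every match.
PATTERN = re.compile(r'(@[\w-]+)(?=.*ARE\s(.+))')

text = "ASSUME @pete, @grey and @matt_c ARE really tall"

# findall returns one (group 1, group 2) tuple per match.
matches = PATTERN.findall(text)

for name, statement in matches:
    print(name, "->", statement)
# @pete -> really tall
# @grey -> really tall
# @matt_c -> really tall
```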

To also require the leading ASSUME, provided your engine supports the \G anchor:

(?:^ASSUME\s*|\G[^@]*)(@[\w-]+)(?=.*ARE\s(.+))

Demo
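
Python's built-in `re` module does not support `\G`, so the pattern above will not compile there as written. As a rough approximation only, the chaining idea can be imitated by resuming each match exactly where the previous one ended, using the `pos` argument of `Pattern.match`. The function name `chained_matches` and the split into `FIRST` and `NEXT` are my own assumptions, and this sketch is stricter than the original in that it always insists on a leading ASSUME.

```python
import re

LOOKAHEAD = r'(?=.*ARE\s(.+))'
# First link of the chain: ASSUME followed directly by a name.
FIRST = re.compile(r'ASSUME\s*(@[\w-]+)' + LOOKAHEAD)
# Later links: skip non-@ characters, then take the next name.
NEXT = re.compile(r'[^@]*(@[\w-]+)' + LOOKAHEAD)

def chained_matches(text):
    """Return (name, statement) pairs, each match starting where the last ended."""
    m = FIRST.match(text)
    results = []
    while m is not None:
        results.append((m.group(1), m.group(2)))
        # match(text, pos) only succeeds at pos, which stands in for \G here.
        m = NEXT.match(text, m.end())
    return results

print(chained_matches("ASSUME @pete, @grey and @matt_c ARE really tall"))
print(chained_matches("SAY @pete ARE really tall"))  # no leading ASSUME, so []
```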

Upvotes: 0

hwnd

Reputation: 70732

You would need to use a Positive Lookahead to capture the overlapping matches.

(?=(@[\w-]+).*ARE\s*(.+))

Live Demo
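
A small Python sketch of how this behaves, assuming the built-in `re` module and the sample string from the question (the names `OVERLAP`, `text` and `results` are only for illustration). Since the whole pattern is a lookahead, every match is zero-width; the useful text lives entirely in the two capture groups.

```python
import re

# The entire pattern is a lookahead, so nothing is consumed; each match
# is an empty string located just before an @name.
OVERLAP = re.compile(r'(?=(@[\w-]+).*ARE\s*(.+))')

text = "ASSUME @pete, @grey and @matt_c ARE really tall"

results = [(m.group(1), m.group(2)) for m in OVERLAP.finditer(text)]
widths = [m.end() - m.start() for m in OVERLAP.finditer(text)]

print(results)
# [('@pete', 'really tall'), ('@grey', 'really tall'), ('@matt_c', 'really tall')]
print(widths)
# [0, 0, 0]
```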

Upvotes: 1

Avinash Raj

Reputation: 174816

I think you want something like this,

ASSUME (@\w+(?:(?:,?\s@\w+)*\s*and\s*@\w+)?)\sARE\s(.+)

DEMO
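
Note that this pattern returns the whole list of names as a single group, so getting one result per name takes a second step. Below is a minimal Python sketch of that two-step idea. `pairs` uses the pattern above unchanged; `pairs_loose` is a hypothetical variant of my own that tries to cover the optional `@`, dashes and `&` from the question, and it assumes no one is literally named `and`.

```python
import re

# Pattern from the answer: group 1 is the full name list, group 2 the statement.
WHOLE = re.compile(r'ASSUME (@\w+(?:(?:,?\s@\w+)*\s*and\s*@\w+)?)\sARE\s(.+)')

def pairs(text):
    """Match once, then split group 1 into individual @names."""
    m = WHOLE.search(text)
    if m is None:
        return []
    names, statement = m.group(1), m.group(2)
    return [(name, statement) for name in re.findall(r'@\w+', names)]

# Hypothetical looser outer pattern: anything between ASSUME and ARE.
LOOSE = re.compile(r'^ASSUME\s+(.+?)\s+ARE\s+(.+)$')

def pairs_loose(text):
    """Same idea, but '@' is optional and names may contain dashes."""
    m = LOOSE.match(text)
    if m is None:
        return []
    names, statement = m.group(1), m.group(2)
    tokens = re.findall(r'@?[\w-]+', names)  # ',' and '&' are simply skipped
    return [(t, statement) for t in tokens if t != 'and']

print(pairs("ASSUME @pete, @grey and @matt_c ARE really tall"))
print(pairs_loose("ASSUME pete, @grey & matt-c ARE really tall"))
```

Doing the split in code rather than in one regex keeps each pattern short, at the cost of no longer being a single-expression answer.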

Upvotes: 1
