Steven Foster

Reputation: 13

RegEx required to capture full word, space and • symbol in all instances within a variable selection of text

I need to be able to select variable elements from the following ingredient list example.

I wish to collect the 'full word, space & •' in all instances.

INGREDIENTS: ALCOHOL DENAT. • FRAGRANCE (PARFUM) • WATER\AQUA\EAU • HYDROXYCITRONELLAL • LIMONENE • BENZYL BENZOATE • CITRONELLOL • GERANIOL • COUMARIN • FARNESOL • CITRAL • BENZYL ALCOHOL • CINNAMYL ALCOHOL • LINALOOL • ALCOHOL • DIPROPYLENE GLYCOL • ETHYLHEXYL METHOXYCINNAMATE • BUTYL METHOXYDIBENZOYLMETHANE • ETHYLHEXYL SALICYLATE • TRIS(TETRAMETHYLHYDROXYPIPERIDINOL) CITRATE • DILAURYL THIODIPROPIONATE • TOCOPHEROL • BHT • BENZOIC ACID • RED 4 (CI 14700) • EXT. VIOLET 2 (CI 60730) • YELLOW 6 (CI 15985) <ILN46472>

I have \b\w+\s• but this only selects 'EAU •' within the copy, whereas I need all instances within the list:

DENAT. •
(PARFUM) •
EAU •
HYDROXYCITRONELLAL •
LIMONENE •
BENZOATE •
CITRONELLOL •
GERANIOL •
COUMARIN •
FARNESOL •
CITRAL •
ALCOHOL •
ALCOHOL •
LINALOOL •
ALCOHOL •
GLYCOL •
METHOXYCINNAMATE •
METHOXYDIBENZOYLMETHANE •
SALICYLATE •
CITRATE •
THIODIPROPIONATE •
TOCOPHEROL •
BHT •
ACID •
(CI 14700) •
(CI 60730) •

Upvotes: 1

Views: 72

Answers (1)

The fourth bird

Reputation: 163577

To get those matches, you might use:

(?:\([^()]*\)|\w+\.?)\s•

The pattern matches

  • (?: Non capture group
    • \([^()]*\) Match from ( to ) with no other parentheses in between
    • | Or
    • \w+\.? Match 1+ word chars followed by an optional .
  • ) Close the non capture group
  • \s• Match a whitespace char and •

See a regex demo
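
As a rough illustration, the pattern can be tried in Python with re.findall, which returns every non-overlapping match. The shortened sample text and the variable names below are assumptions made for this sketch, and the comparison with the pattern from the question is only one possible explanation of why a single 'EAU •' showed up (a tool returning just the first match).

```python
import re

# Shortened sample of the ingredient list from the question (assumption:
# the real input has the same "NAME • NAME • ..." shape).
text = (
    "INGREDIENTS: ALCOHOL DENAT. • FRAGRANCE (PARFUM) • WATER\\AQUA\\EAU • "
    "LIMONENE • BENZYL BENZOATE • RED 4 (CI 14700) • YELLOW 6 (CI 15985)"
)

# Either a parenthesised group or a word with an optional trailing dot,
# followed by one whitespace character and the bullet.
pattern = re.compile(r"(?:\([^()]*\)|\w+\.?)\s•")

# findall returns all matches; there is no capture group, so each item
# is the full matched text.
matches = pattern.findall(text)

# The pattern from the question skips items ending in "." or ")".
original = re.findall(r"\b\w+\s•", text)

# A single search stops at the first match.
first_only = re.search(r"\b\w+\s•", text).group()

print(matches)
print(original)
print(first_only)
```

The last item, YELLOW 6 (CI 15985), is not returned because no • follows it, which is consistent with the expected output in the question.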

If there has to be at least one word character between the parentheses:

(?:\([^\w()]*\w[^()]*\)|\w+\.?)\s•

See another regex demo.
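
The difference between the two patterns only shows up when the parentheses contain no word character. A minimal sketch in Python, using a made-up sample string (it does not come from the question) with empty and punctuation-only parentheses:

```python
import re

loose = re.compile(r"(?:\([^()]*\)|\w+\.?)\s•")
strict = re.compile(r"(?:\([^\w()]*\w[^()]*\)|\w+\.?)\s•")

# Hypothetical sample: "()" and "(--)" contain no word character.
sample = "A () • B (--) • C (CI 1) • D •"

# The first pattern accepts any parenthesised content, even empty.
loose_matches = loose.findall(sample)

# The second pattern requires at least one \w inside the parentheses.
strict_matches = strict.findall(sample)

print(loose_matches)
print(strict_matches)
```

For the ingredient list in the question, both patterns should give the same matches, since every parenthesised group there contains letters or digits.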

Note that \w can also match an underscore (_).
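
If underscores should not count as part of a word, one option is to swap \w for an explicit character class. The sketch below assumes ASCII letters and digits are enough, which may not hold for every ingredient list; the sample string is invented for the example.

```python
import re

# Hypothetical sample containing underscores.
sample = "FOO_BAR • _ •"

# \w includes the underscore, so both items match in full.
with_w = re.findall(r"\w+\.?\s•", sample)

# An explicit class (assumption: ASCII letters and digits only)
# leaves the underscore out.
without_underscore = re.findall(r"[A-Za-z0-9]+\.?\s•", sample)

print(with_w)
print(without_underscore)
```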

Upvotes: 0
