Reputation: 751
I need a regex that will get all the text occurrences between parentheses, keeping in mind that all the content is delimited by the word BEGIN at the start and the chars ---- at the end.
Input example:
BEGIN ) Tj\nET37.66 533 Td\n( Td\n(I NEED THIS TEXT ) Tj\nET\nBT\n37.334 Td\n(AND ALSO NEED THIS TEXT ) Tj\nET\nBT\n37.55 Td\n(------------
Expected matches:
I NEED THIS TEXT
AND ALSO NEED THIS TEXT
I already have something like (?<=BEGIN).*(?=\(--)
for the outer pattern, but I couldn't figure out how to get all the text occurrences inside parentheses within that span.
Upvotes: 1
Views: 38
Reputation: 195418
Try:
\(((?:(?!BEGIN).)*?)\)(?=.*---)
\(((?:(?!BEGIN).)*?)\) - match everything between ( and ), but not BEGIN
(?=.*---) - .*--- must follow after this match
Upvotes: 1
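To try the lookahead pattern above, here is a minimal sketch using Python's standard re module. The sample string is a simplified, single-line stand-in for the input (an assumption, not the original data), because without the DOTALL flag . does not cross line breaks. The captured text keeps the space before the closing ), so it is stripped here.

```python
import re

# Simplified single-line sample (hypothetical; the real input spans several lines).
sample = "BEGIN ) Tj ET (I NEED THIS TEXT ) Tj ET BT (AND ALSO NEED THIS TEXT ) Tj (------------"

PATTERN = re.compile(r"\(((?:(?!BEGIN).)*?)\)(?=.*---)")

# findall returns the content of the single capture group for each match.
lookahead_matches = [m.strip() for m in PATTERN.findall(sample)]
print(lookahead_matches)
```

On input where an unclosed ( appears before the wanted text, the lazy group can start at that earlier (, so the result may need further checking.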
Reputation: 626747
With the Python PyPI regex library, you can use
(?s)(?:\G(?!^)\)|BEGIN)(?:(?!\(--).)*?\((?!--)\K[^()]*
See the regex demo
Details:
(?s) - a DOTALL inline modifier making . match line break chars
(?:\G(?!^)\)|BEGIN) - either BEGIN or the end of the previous successful match and a ) right after
(?:(?!\(--).)*? - any char, zero or more but as few as possible occurrences, that does not start a (-- char sequence
\( - a ( char
(?!--) - right after (, there should be no --
\K - match reset operator: what was matched before is discarded from the overall match memory buffer
[^()]* - zero or more chars other than ( and )
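If installing the third-party regex package is not an option, a two-step alternative with the standard re module is sketched below. This is a different technique from the single \G / \K pattern above: it first isolates the span with the outer pattern from the question, then collects the parenthesized groups inside it. It assumes the \n sequences in the input are real line breaks and that the wanted text never contains nested parentheses. The function name extract_between is made up for illustration.

```python
import re

def extract_between(text):
    """Return the parenthesized texts found between BEGIN and the last '(--'."""
    # Step 1: outer pattern from the question; re.DOTALL lets . cross line breaks.
    outer = re.search(r"(?<=BEGIN).*(?=\(--)", text, re.DOTALL)
    if outer is None:
        return []
    # Step 2: keep only ( ... ) pairs that contain no parentheses themselves.
    return [m.strip() for m in re.findall(r"\(([^()]*)\)", outer.group(0))]

data = (
    "BEGIN ) Tj\nET37.66 533 Td\n( Td\n(I NEED THIS TEXT ) Tj\nET\nBT\n"
    "37.334 Td\n(AND ALSO NEED THIS TEXT ) Tj\nET\nBT\n37.55 Td\n(------------"
)
print(extract_between(data))
```

The inner pattern skips the stray unclosed ( before the first wanted text, because [^()]* cannot cross another parenthesis.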
Upvotes: 2