wvstealth

Reputation: 21

Concatenating two capture groups

I have a string that can be split into 3 parts (Keep1 | Ignore | Keep2). The objective is to remove the middle sub-string and concatenate the other two. To achieve this I created two regular expressions, one to create a capture group for Keep1 and another for Keep2.

Example text:

First String.<ref> IGNORE </ref> Second String.

First regular expression:

.*(?=<ref>)    

Output:

First String.

Second regular expression:

(?<=<\/ref>).*

Output:

Second String.   

Desired Output:

First String. Second String.

So far I've been unable to figure out a way to concatenate both strings. Is such a thing possible in flex?

Upvotes: 1

Views: 499

Answers (1)

rici

Reputation: 241791

(F)lex does not implement capture groups, nor does it implement lookahead assertions. In general terms, it only implements constructs which meet the mathematical definition of "regular expression", and can therefore be implemented with a simple finite state machine working in linear time and constant space.

The (short and complete) documentation of its regular expression syntax is found in the Flex manual.

(The "f" in "flex" stands for "fast", but the original "lex" was also pretty snappy, basically because of this design decision.)

You have two choices, depending on the precise nature of your tokens:

  1. If you can definitely recognise the token from the first part, then you could use a start condition to recognise the rest of the token.

  2. Otherwise, you could recognise the entire token in one regular expression, and then rescan it to figure out which part you want to keep. You might or might not be able to do the second scan with flex; again, you could use a start condition to apply different rules for the rescan, but it will depend on the precise nature of your pattern. You could also rescan with a regular expression library, either the POSIX standard library or some more flexible library such as PCRE.
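As a rough illustration of the second choice, here is a minimal sketch of the rescan step using the POSIX `<regex.h>` functions. The helper name `strip_ref` and the pattern are assumptions made for this example, not anything flex provides, and the sketch assumes the text contains exactly one `<ref>...</ref>` span (POSIX matching is greedy, so with several spans the first group would run up to the last `<ref>`).

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy Keep1 followed by Keep2 into `out`.
 * Returns 0 on success, -1 if the pattern does not match, the
 * pattern fails to compile, or `out` is too small. */
int strip_ref(const char *text, char *out, size_t outsize)
{
    regex_t re;
    regmatch_t m[3];   /* m[0] = whole match, m[1] = Keep1, m[2] = Keep2 */
    int rc = -1;

    assert(text != NULL && out != NULL);

    /* POSIX extended syntax: real capture groups, but no lookaround. */
    if (regcomp(&re, "^(.*)<ref>.*</ref>(.*)$", REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, text, 3, m, 0) == 0) {
        size_t len1 = (size_t)(m[1].rm_eo - m[1].rm_so);
        size_t len2 = (size_t)(m[2].rm_eo - m[2].rm_so);

        if (len1 + len2 < outsize) {
            /* Concatenate the two captured pieces. */
            memcpy(out, text + m[1].rm_so, len1);
            memcpy(out + len1, text + m[2].rm_so, len2);
            out[len1 + len2] = '\0';
            rc = 0;
        }
    }

    regfree(&re);
    return rc;
}
```

In a flex action this could be called on `yytext` once a rule has matched the whole token; compiling the pattern once up front, rather than on every call, would be the obvious refinement.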

Note that (f)lex also does not implement non-greedy repetition, so if you want to implement "the shortest string starting with X and ending with Y", you need to use a technique like the one shown in the last example in the Flex manual chapter on start conditions.
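The idea behind the start-condition technique can be sketched in plain C as a two-state scanner. This is a hand-written analogue, not flex code or flex output: the state names only echo flex's `INITIAL` start condition, and `drop_refs` is a hypothetical helper. Because the scanner leaves the "inside" state at the first `</ref>` it sees, it effectively matches the shortest span, and it copes with several spans in one input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two states, loosely mirroring a flex scanner with one extra start condition. */
enum state { INITIAL, IN_REF };

/* Hypothetical helper: copy `in` to `out`, skipping every <ref>...</ref> span.
 * Returns 0 on success, -1 if `out` is too small. */
int drop_refs(const char *in, char *out, size_t outsize)
{
    enum state st = INITIAL;
    size_t n = 0;

    assert(in != NULL && out != NULL);

    while (*in != '\0') {
        if (st == INITIAL && strncmp(in, "<ref>", 5) == 0) {
            st = IN_REF;            /* opening tag: start ignoring */
            in += 5;
        } else if (st == IN_REF && strncmp(in, "</ref>", 6) == 0) {
            st = INITIAL;           /* first closing tag: resume copying */
            in += 6;
        } else {
            if (st == INITIAL) {
                if (n + 1 >= outsize)
                    return -1;      /* keep room for the terminator */
                out[n++] = *in;
            }
            in++;                   /* inside a span the character is dropped */
        }
    }

    if (n >= outsize)
        return -1;
    out[n] = '\0';
    return 0;
}
```

In an actual flex scanner the two branches that change `st` would be rules that call `BEGIN`, and the copy and skip branches would be rules active in the respective start conditions.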

Upvotes: 1
