SkyDrive

Reputation: 1455

Match all text following an asterisk

I'm writing a simple interpreter for our compiler course. This is not a homework-type question.

Anything that follows an asterisk is considered a comment, provided the asterisk is not part of a string. My interpreter uses brackets as the escape characters.

Here is some sample syntax for my interpreter:

* hello world
OUTPUT: "This is asterisk [*]" * outputs string
OUTPUT: "This is asterisk *"  * outputs string produces syntax error
x = "[*]" & "hello" & "[*]*]" this is already comment which produces syntax error

When I run this regex

[^\[]\*.*

It matches the following:

* hello world
 * outputs string
 *"  * outputs string produces syntax error
]*]" this is already comment which produces syntax error

My question is: why does the regex "eat" the character before the asterisk? What I actually need is:

* hello world
* outputs string
*"  * outputs string produces syntax error
*]" this is already comment which produces syntax error

Upvotes: 1

Views: 605

Answers (2)

DhruvPathak

Reputation: 43235

You need a zero-width assertion, so that the condition is checked but the character it tests is not consumed ("eaten"):

(?<=[^\[])\*.*

(?<=REGEX_CONDITION) is a lookbehind: it ensures the condition matches, but the matched part (a character other than "[" in your case) is not included in the result.

Demo : http://regexr.com?32b99

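Edit: the lookbehind needs a character before the asterisk, so it misses a comment that starts the input. To make it work fully, I added an alternative for that case: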

(?<=[^\[])\*.*|^\*.*
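
As a rough check, the pattern above can be run against the sample lines from the question. This is only a sketch: it assumes Python's re module (the question does not say which regex flavor the interpreter uses), and the names COMMENT_RE, find_comment and SAMPLE_LINES are made up for illustration.

```python
import re

# Pattern from the answer: an asterisk preceded by anything other than "[",
# or an asterisk at the very start of the line.
COMMENT_RE = re.compile(r'(?<=[^\[])\*.*|^\*.*')

def find_comment(line):
    """Return the comment part of a line, or None if there is none."""
    m = COMMENT_RE.search(line)
    return m.group(0) if m else None

# Sample lines copied from the question (hypothetical variable name).
SAMPLE_LINES = [
    '* hello world',
    'OUTPUT: "This is asterisk [*]" * outputs string',
    'OUTPUT: "This is asterisk *"  * outputs string produces syntax error',
    'x = "[*]" & "hello" & "[*]*]" this is already comment which produces syntax error',
]

for line in SAMPLE_LINES:
    # Each printed value starts at the asterisk, not one character before it.
    print(find_comment(line))
```

Because the lookbehind only inspects the preceding character without consuming it, every match begins at the asterisk itself, which is what the question asks for.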

Upvotes: 1

i_han

Reputation: 21

Try a capturing group and use the group's value instead of the whole match, as in [^\[](\*.*)
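
A minimal sketch of the capturing-group approach, again assuming Python's re module; GROUP_RE and comment_via_group are hypothetical names. The whole match still includes the character before the asterisk, but group 1 does not.

```python
import re

# The character before the asterisk is still consumed by the match,
# but only the parenthesized part is captured in group 1.
GROUP_RE = re.compile(r'[^\[](\*.*)')

def comment_via_group(line):
    """Return group 1 (the comment) if the pattern matches, else None."""
    m = GROUP_RE.search(line)
    return m.group(1) if m else None

line = 'OUTPUT: "This is asterisk [*]" * outputs string'
print(repr(GROUP_RE.search(line).group(0)))  # whole match, with the leading space
print(repr(comment_via_group(line)))         # group 1, starting at the asterisk

# When each line is matched on its own, a comment at the very start has no
# preceding character, so this pattern does not match it.
print(comment_via_group('* hello world'))
```

Note the limitation shown by the last call: this pattern always needs one character in front of the asterisk, so a line that begins with a comment has to be handled separately.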

Upvotes: 0
