Reputation: 1455
I'm writing a simple interpreter for our compiler course. This is not a homework question.
Anything that follows an asterisk is considered a comment, provided the asterisk is not part of a string. My interpreter uses brackets as the escape sequence, so [*] inside a string is a literal asterisk.
Here is some sample syntax for my interpreter:
* hello world
OUTPUT: "This is asterisk [*]" * outputs string
OUTPUT: "This is asterisk *" * outputs string produces syntax error
x = "[*]" & "hello" & "[*]*]" this is already comment which produces syntax error
When I run this regex:
[^\[]\*.*
it matches the following:
* hello world
* outputs string
*" * outputs string produces syntax error
]*]" this is already comment which produces syntax error
My question is: why does the regex "eat" the character before the asterisk? What I need it to match is:
* hello world
* outputs string
*" * outputs string produces syntax error
*]" this is already comment which produces syntax error
Upvotes: 1
Views: 605
Reputation: 43235
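Your pattern consumes the character before the asterisk because [^\[] is an ordinary character class: it has to match one character, and that character becomes part of the result. Use a zero-width assertion (a lookbehind) to test the preceding character without "eating" it: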
(?<=[^\[])\*.*
(?<=REGEX_CONDITION)
checks that the condition matches immediately before the current position, but the text it matches (a character other than "[" in your case) is not included in the result.
Demo : http://regexr.com?32b99
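Edit: to also match a comment that starts at the very beginning of the input, where there is no preceding character for the lookbehind to test, I added an alternative: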
(?<=[^\[])\*.*|^\*.*
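To see the difference between consuming the preceding character and only asserting on it, here is a minimal sketch. It assumes Python's re module (the question does not say which regex engine is in use), and the names CONSUMING, LOOKBEHIND, NEG_LOOKBEHIND and find_comment are made up for illustration.

```python
import re

# Original pattern: [^\[] matches (and therefore consumes) one character
# before the asterisk, so that character shows up in the result.
CONSUMING = re.compile(r'[^\[]\*.*')

# Pattern from this answer: the lookbehind only inspects the previous
# character; the second alternative covers an asterisk at the very start.
LOOKBEHIND = re.compile(r'(?<=[^\[])\*.*|^\*.*')

# Possible shorter variant (an assumption, not part of the original answer):
# a negative lookbehind also succeeds when there is no previous character.
NEG_LOOKBEHIND = re.compile(r'(?<!\[)\*.*')


def find_comment(pattern, line):
    """Return the first match of pattern in line, or None if nothing matches."""
    m = pattern.search(line)
    return m.group(0) if m else None


line = 'OUTPUT: "This is asterisk [*]" * outputs string'
print(repr(find_comment(CONSUMING, line)))       # ' * outputs string'
print(repr(find_comment(LOOKBEHIND, line)))      # '* outputs string'
print(repr(find_comment(NEG_LOOKBEHIND, line)))  # '* outputs string'
```

The first result carries the leading space that [^\[] consumed; the other two start exactly at the asterisk. The negative lookbehind (?<!\[) should behave the same as the two-part pattern here, because it only requires that the previous character is not "[" and does not require a previous character to exist.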
Upvotes: 1