superl

Reputation: 3

Regex - how to match a block comment

When it comes to regex I'm always lost. I have an editor written in C# for working with Papyrus scripts. Users have asked me to add styling for block comments, which are delimited by ";/" and "/;". Styling already works for single-line comments, which start with ";".

Here is the code I have so far:
var inputData = @"comment test and this line not suppose to show
;/
comment line 1
comment line 2
comment line 3
/;
Not suppose to show";

PapyrusCommentRegex1 = new Regex(@"(;/\*.*?\/;)|(.*\/;)", RegexOptions.Singleline);

foreach (Match match in PapyrusCommentRegex1.Matches(inputData))
{
    if (match.Success)
    {
        textBox1.AppendText(match.Value + Environment.NewLine);
    }
}

The result I get is

comment test and this line not suppose to show
;/
comment line 1
comment line 2
comment line 3
/;

Everything before the ";/" is shown as well. My question is: what am I doing wrong in my regex? Thanks in advance to all.

Edit: To make it clearer, I need a regex pattern in C# that finds all block comments that start with ";/" and end with "/;", and the match needs to include the ";/" and "/;" delimiters.

Upvotes: 0

Views: 226

Answers (2)

xanatos

Reputation: 111860

Try this:

(;/.*?/;)|(;.*?(?=$|[\r\n]))

Note that I'm still using the Singleline mode (RegexOptions.Singleline).

The part before the | matches multiline comments; the part after the | matches single-line comments (comments that end when they reach the end of the text, $, or a newline, \r or \n). Note that the regex won't capture the end-of-line at the end of a single-line comment, so in

;xyz\n

the \n won't be captured. To capture it:

(;/.*?/;)|(;.*?(?:$|\r\n?|\n))
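For reference, here is a minimal sketch of how the first pattern (the one with the lookahead) might be wired up in C#. The class name PapyrusComments and the method FindComments are hypothetical names chosen for illustration; they are not part of the question's editor.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class PapyrusComments
{
    // Group 1: block comment (";/" ... "/;"), group 2: single-line comment (";" to end of line).
    // Singleline makes "." match newlines, so the lazy ".*?" can span several lines.
    private static readonly Regex CommentRegex = new Regex(
        @"(;/.*?/;)|(;.*?(?=$|[\r\n]))",
        RegexOptions.Singleline);

    // Returns the text of every comment found, in order of appearance.
    public static List<string> FindComments(string source)
    {
        var comments = new List<string>();
        foreach (Match match in CommentRegex.Matches(source))
        {
            comments.Add(match.Value);
        }
        return comments;
    }
}
```

With the inputData from the question, FindComments should return a single entry that runs from ";/" to "/;", without the line before the block or the line after it.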

Upvotes: 0

wp78de

Reputation: 18950

Since you said you need to do this with a regex in .NET, I guess you may want one that uses balancing groups to match the block comment:

(?x)  # ignore spaces and comments
(
;/                 # open block comment
(?:
  (?<open> ;/ )*   # open++
  .+
  (?<-open> /; )*  # open--
)+
/;                 # close
(?(open)(?!))      # fail if unbalanced: open > 0
)

This should give you what you want. Regex Demo
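
As a rough sketch, the pattern could be used from C# like this. The class and method names (NestedBlockComments, Find) are hypothetical. RegexOptions.Singleline is an assumption on my side: without it, "." would not cross line breaks and a multi-line comment would not match. Two caveats: the ".+" is greedy, so two separate block comments in one input come back as a single span from the first ";/" to the last "/;", and the nested quantifiers can backtrack heavily on long inputs, which is why a match timeout is set as a precaution.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class NestedBlockComments
{
    // Same pattern as above. IgnorePatternWhitespace replaces the inline (?x);
    // Singleline is an assumption so that "." also matches newlines.
    private static readonly Regex BlockRegex = new Regex(@"
        (
        ;/                 # open block comment
        (?:
          (?<open> ;/ )*   # open++
          .+
          (?<-open> /; )*  # open--
        )+
        /;                 # close
        (?(open)(?!))      # fail if unbalanced: open > 0
        )",
        RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline,
        TimeSpan.FromSeconds(2)); // guard against runaway backtracking

    // Returns every matched block comment, delimiters included.
    public static List<string> Find(string source)
    {
        var blocks = new List<string>();
        foreach (Match match in BlockRegex.Matches(source))
        {
            blocks.Add(match.Value);
        }
        return blocks;
    }
}
```

If the timeout is exceeded, Matches throws a RegexMatchTimeoutException, so an editor would probably want to catch that and skip styling for that pass.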


Some mentioned the problem of block comments in strings (and vice versa?!). This makes things a lot harder, especially since the (*SKIP)(*FAIL) backtracking verbs and \K are not available in .NET's regex engine. I would try to match and capture what you need, but only match (without capturing) what you do not need:

This matches your block comments and "..." strings. The trick is to only look at the blockcomment capture group:

(?x)  # ignore spaces and comments
(
;/                 # open block comment
(?:
  (?<open> ;/ )*   # open++
  .+
  (?<-open> /; )*  # open--
)+
/;                 # close
(?(open)(?!))      # fail if unbalanced: open > 0
)
|
(?:(?<openq>")
  [^"]*?
  (?<-openq>")*
)+(?(openq)(?!))

Demo Code

I hope you can apply this in your code.
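
A minimal sketch of the "only look at the capture group" idea in C#, under a few assumptions: the class and method names are hypothetical, RegexOptions.Singleline is added so that "." crosses line breaks, and the block comment is read from group 1 (the first unnamed group in the pattern as written here). Keep in mind that the ".+" is greedy, so a "/;" inside a string that comes after a block comment would still be swallowed into the comment match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class CommentsOutsideStrings
{
    // Same pattern as above; quotes are doubled because this is a verbatim string.
    private static readonly Regex CommentOrString = new Regex(@"
        (
        ;/                 # open block comment
        (?:
          (?<open> ;/ )*   # open++
          .+
          (?<-open> /; )*  # open--
        )+
        /;                 # close
        (?(open)(?!))      # fail if unbalanced: open > 0
        )
        |
        (?:(?<openq>"")
          [^""]*?
          (?<-openq>"")*
        )+(?(openq)(?!))",
        RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline,
        TimeSpan.FromSeconds(2)); // guard against runaway backtracking

    // Strings are matched (and thereby skipped) but ignored;
    // only matches where group 1 took part are reported as comments.
    public static List<string> Find(string source)
    {
        var blocks = new List<string>();
        foreach (Match match in CommentOrString.Matches(source))
        {
            if (match.Groups[1].Success)
            {
                blocks.Add(match.Groups[1].Value);
            }
        }
        return blocks;
    }
}
```

Because the string alternative consumes the whole "..." literal, a ";/" inside a string that appears before the real comment never gets a chance to start a block-comment match.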

Upvotes: 1
