Mark B.

Reputation: 41

How to match a block of text with C# Regex

I have a text file with hundreds of lines that follow this pattern:

[Part 1.SubPart 2.A 1]
Variable=value
(...)
LastVariable1=value
[Part 1.SubPart 2.B 2]
Variable=value
(...)
LastVariable2=value
[Part 1.SubPart 2.C 3]
Variable=value
(...)
LastVariable3=value
[Part 1.SubPart 3.A 1]
(...)

I need to extract each block that starts with [Part...A *] and ends before the next "A" block starts.

The very last variable "LastVariable3" has a constant name in all the Parts and can be ignored for my purposes.

I've tried using the following expressions based on other posts here, but they are not working.

var pattern = new Regex(@"\[Part.*A..\])(.*)(^LastVariable3)",RegexOptions.Singleline);

var pattern = new Regex(@"\[Part.*A..\])(.|\n)*(^LastVariable3)",RegexOptions.Singleline);

...they always match all the Part blocks in the WHOLE file at once instead of one at a time.

I've also tried (\[Part.*A..\]\n)(.*(\n)){"number of lines"}, but the number of variables is not constant, so this won't work.

Hope this made sense! Any ideas on what I'm doing wrong? I'm new to Regex.

Upvotes: 2

Views: 1631

Answers (2)

Ωmega

Reputation: 43673

Use this pattern with RegexOptions.Singleline:

(\[Part\s[^\]]+\s\d+\.A\s\d+\].*?)(?=(?:[\n\r]\[Part\s[^\]]+\s\d+\.A\s\d+\]|\Z))
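To show how this pattern might be applied, here is a minimal sketch. The sample text, the class name BlockExtractor and the method name ExtractBlocks are made up for illustration; only the pattern and RegexOptions.Singleline come from the answer itself.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class BlockExtractor
{
    // Pattern from the answer. Singleline lets . match newlines,
    // so the lazy .*? can span the lines of a block.
    static readonly Regex BlockPattern = new Regex(
        @"(\[Part\s[^\]]+\s\d+\.A\s\d+\].*?)(?=(?:[\n\r]\[Part\s[^\]]+\s\d+\.A\s\d+\]|\Z))",
        RegexOptions.Singleline);

    // Hypothetical sample shaped like the file in the question.
    public static readonly string Sample = string.Join("\n", new[]
    {
        "[Part 1.SubPart 2.A 1]",
        "Variable=value",
        "LastVariable1=value",
        "[Part 1.SubPart 2.B 2]",
        "Variable=value",
        "LastVariable2=value",
        "[Part 1.SubPart 2.C 3]",
        "Variable=value",
        "LastVariable3=value",
        "[Part 1.SubPart 3.A 1]",
        "Variable=value"
    });

    // Returns one string per block that starts at an "A" header.
    public static List<string> ExtractBlocks(string text)
    {
        var blocks = new List<string>();
        foreach (Match m in BlockPattern.Matches(text))
        {
            blocks.Add(m.Groups[1].Value);
        }
        return blocks;
    }

    static void Main()
    {
        foreach (string block in ExtractBlocks(Sample))
        {
            Console.WriteLine("--- block ---");
            Console.WriteLine(block);
        }
    }
}
```

On this sample the loop yields two blocks: the first runs from [Part 1.SubPart 2.A 1] through the LastVariable3 line, and the second starts at [Part 1.SubPart 3.A 1]. The lookahead only checks for the next "A" header, so that header is not consumed and becomes the start of the next match.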

Upvotes: 1

Cristian Lupascu

Reputation: 40526

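Your second attempt is very close. It just has an unmatched closing parenthesis after \]. Also, you need to use RegexOptions.Multiline instead of RegexOptions.Singleline: with Multiline, ^ matches at the start of every line, and without Singleline, . no longer matches newlines, so the .* in the header part stays on one line.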

I've tried this pattern and it worked:

\[Part.*A..\](.|\n)*(^LastVariable3)
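One caveat: (.|\n)* is greedy, so if the file contains several groups that each end in LastVariable3, a single match can run from the first "A" header to the last LastVariable3. A lazy quantifier (*?) is one way to stop at the nearest one. The sketch below is not the exact pattern tested above; the class name GroupExtractor, the method ExtractGroups and the sample text are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class GroupExtractor
{
    // Hypothetical sample: two groups, each ending in LastVariable3.
    public static readonly string Sample = string.Join("\n", new[]
    {
        "[Part 1.SubPart 2.A 1]",
        "Variable=value",
        "LastVariable1=value",
        "[Part 1.SubPart 2.B 2]",
        "LastVariable2=value",
        "[Part 1.SubPart 2.C 3]",
        "LastVariable3=value",
        "[Part 1.SubPart 3.A 1]",
        "Variable=value",
        "LastVariable1=value",
        "[Part 1.SubPart 3.B 2]",
        "LastVariable2=value",
        "[Part 1.SubPart 3.C 3]",
        "LastVariable3=value"
    });

    // lazy = true uses *? so each match stops at the nearest LastVariable3;
    // lazy = false keeps the greedy * for comparison.
    public static List<string> ExtractGroups(string text, bool lazy)
    {
        string quantifier = lazy ? "*?" : "*";
        var pattern = new Regex(
            @"\[Part.*A..\](?:.|\n)" + quantifier + @"^LastVariable3",
            RegexOptions.Multiline); // ^ matches at the start of each line

        var groups = new List<string>();
        foreach (Match m in pattern.Matches(text))
        {
            groups.Add(m.Value);
        }
        return groups;
    }

    static void Main()
    {
        Console.WriteLine("greedy matches: " + ExtractGroups(Sample, false).Count);
        Console.WriteLine("lazy matches:   " + ExtractGroups(Sample, true).Count);
    }
}
```

On this sample the greedy form returns one match spanning both groups, while the lazy form returns one match per group. Note that A..\] only fits a header whose "A" is followed by exactly two characters, as in the question's example.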

Upvotes: 1
