fnatt

Reputation: 68

Creating a Regex with specific condition

I want to create a regex for extracting blocks from a text file. Each block must sit between two known values and contain a specific word.

What I'm using right now is this:

  Regex.Matches (fileContent, $"START_BLOCK SOMEWORD[^#]+?END_BLOCK")
                    .OfType<Match> ().Select (m => m.Value).ToList ();

which only returns the matches that start with START_BLOCK and have only one space between the start and SOMEWORD. I know that between the start and the word can be only spaces or control characters.

.....
PRG
PROGRAM PRG
VAR
END_VAR
0A
TRUE
ANDA
TRUE
OR
RESULTd
TEST_F
.....

From this, I want to extract the part beginning with PROGRAM PRG and ending with RESULTd. In other words, the block between PRG and TEST_F that contains the keyword PROGRAM directly after PRG, where the two may be separated by more than just a single space (spaces or carriage returns).

Note that the file can contain more than one PROGRAM, but each one has a unique name.

Upvotes: 1

Views: 55

Answers (1)

The fourth bird

Reputation: 163447

You could match the line that contains PROGRAM and then match every following line up to the first occurrence of RESULTd:

.*\bPROGRAM\b.*(?:\r?\n(?!RESULTd\b).*)*\r?\nRESULTd\b
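
As a rough sketch of how this first pattern could be applied in C#: the sample text below is a hypothetical input modelled on the excerpt in the question, and fileContent is assumed to hold the whole file as a single string.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical sample input modelled on the question's excerpt.
string fileContent =
    "PRG\nPROGRAM PRG\nVAR\nEND_VAR\n0A\nTRUE\nANDA\nTRUE\nOR\nRESULTd\nTEST_F\n";

// Match the line containing PROGRAM, then every following line that does
// not start with RESULTd, then the RESULTd line itself.
string pattern = @".*\bPROGRAM\b.*(?:\r?\n(?!RESULTd\b).*)*\r?\nRESULTd\b";

var blocks = Regex.Matches(fileContent, pattern)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToList();

// Print each extracted block, from the PROGRAM line through RESULTd.
foreach (var block in blocks)
    Console.WriteLine(block);
```

Because . does not match a newline by default in .NET, each .* stays on a single line, and the negative lookahead (?!RESULTd\b) stops the repetition at the first line that starts with RESULTd.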


If the words PRG and TEST_F should be there, and there can be whitespace between PRG and PROGRAM (a newline followed by zero or more whitespace chars, \s*), you can use a capturing group:

PRG\r?\n\s*(PROGRAM\b.*(?:\r?\n(?!RESULTd\b).*)*\r?\nRESULTd)\r?\nTEST_F


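Your code might look like this:

// Requires: using System.Linq; using System.Text.RegularExpressions;
// Group 1 holds the block from PROGRAM through RESULTd, without the PRG and TEST_F delimiters.
string pattern = @"PRG\r?\n\s*(PROGRAM\b.*(?:\r?\n(?!RESULTd\b).*)*\r?\nRESULTd)\r?\nTEST_F";
var items = Regex.Matches(fileContent, pattern)
    .OfType<Match>()
    .Select(m => m.Groups[1].Value)
    .ToList();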

Upvotes: 1
