Reputation: 68
I want to create a regex for extracting blocks from a text file. Each block must lie between two known values and contain a specific word.
What I'm using right now is this:
Regex.Matches (fileContent, $"START_BLOCK SOMEWORD[^#]+?END_BLOCK")
.OfType<Match> ().Select (m => m.Value).ToList ();
which only returns the matches that start with START_BLOCK and have exactly one space between the start marker and SOMEWORD. I know that only spaces or control characters can appear between the start marker and the word.
.....
PRG
PROGRAM PRG
VAR
END_VAR
0A
TRUE
ANDA
TRUE
OR
RESULTd
TEST_F
.....
From this, I want to extract the part beginning with PROGRAM PRG and ending with RESULTd. That is, the block between PRG and TEST_F that contains the keyword PROGRAM (directly after PRG, though possibly separated from it by several spaces or carriage returns).
Note that the file can contain more than one PROGRAM, but each one has a unique name.
Upvotes: 1
Views: 55
Reputation: 163447
You could match the line that contains PROGRAM and then match until the first occurrence of RESULTd:
.*\bPROGRAM\b.*(?:\r?\n(?!RESULTd\b).*)*\r?\nRESULTd\b
If the words PRG and TEST_F should also be present, and there can be zero or more whitespace chars (\s*) after the newline that follows PRG, you can use a capturing group:
PRG\r?\n\s*(PROGRAM\b.*(?:\r?\n(?!RESULTd\b).*)*\r?\nRESULTd)\r?\nTEST_F
Your code might look like this:
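// Requires: using System.Linq; using System.Text.RegularExpressions;
string pattern = @"PRG\r?\n\s*(PROGRAM\b.*(?:\r?\n(?!RESULTd\b).*)*\r?\nRESULTd)\r?\nTEST_F";
// Group 1 holds the block from PROGRAM through RESULTd.
var items = Regex.Matches(fileContent, pattern)
    .OfType<Match>()
    .Select(m => m.Groups[1].Value)
    .ToList();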
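To try the capturing-group pattern end to end, here is a minimal, self-contained sketch. The class name BlockExtractor, the method Extract, and the in-memory sample text are assumptions made for illustration; the sample is modelled on the excerpt in the question, and a real file may differ (for example in its line endings, which the pattern allows for with \r?\n).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original question or answer.
public static class BlockExtractor
{
    // Group 1 captures everything from PROGRAM through RESULTd;
    // the surrounding PRG and TEST_F lines must be present but are not captured.
    private const string Pattern =
        @"PRG\r?\n\s*(PROGRAM\b.*(?:\r?\n(?!RESULTd\b).*)*\r?\nRESULTd)\r?\nTEST_F";

    public static List<string> Extract(string fileContent)
    {
        return Regex.Matches(fileContent, Pattern)
            .OfType<Match>()
            .Select(m => m.Groups[1].Value)
            .ToList();
    }

    public static void Main()
    {
        // Sample input modelled on the question; "....." stands for unrelated content.
        string fileContent = string.Join("\n", new[]
        {
            ".....", "PRG", "PROGRAM PRG", "VAR", "END_VAR", "0A", "TRUE",
            "ANDA", "TRUE", "OR", "RESULTd", "TEST_F", "....."
        });

        foreach (string block in Extract(fileContent))
        {
            Console.WriteLine(block);
            Console.WriteLine("--- end of block ---");
        }
    }
}
```

With this sample, the single extracted block starts at PROGRAM PRG and ends at RESULTd. If the file contains several programs, each one that is framed the same way should produce its own entry in the list.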
Upvotes: 1