Reputation: 71
I have the data below in a log file, and I want to extract the lines between the two phrases "Process Started" and "Process Completed", including the beginning of the line that contains the first phrase and the end of the line that contains the second.
2016-11-28 12:18:59.5286 | 14 | Info | Process Started -ABC *****
....
..
2016-11-28 12:18:59.5286 | 14 | Info | Process Completed -ABC, Status: Failed***
2016-11-28 13:18:59.5286 | 14 | Info | Process Started -DEF
....
..
2016-11-28 13:18:59.5286 | 14 | Info | Process Completed -DEF Status: Passed***
Using the regex below I'm able to extract the lines, but the beginning of the first matched line and the end of the last matched line are missing.
Regex r = new Regex("^*?Process Started -"+process.Name+"(.*?)Process Completed: "+process.Name+".*?", RegexOptions.Singleline);
The above regex returns this:
Process Started -ABC *****
....
..
2016-11-28 12:18:59.5286 | 14 | Info | Process Completed
But I need this:
2016-11-28 12:18:59.5286 | 14 | Info | Process Started -ABC *****
....
..
2016-11-28 12:18:59.5286 | 14 | Info | Process Completed -ABC, Status: Failed***
Upvotes: 1
Views: 989
Reputation: 5261
You're close, but the lazy quantifier at the end is the problem: it will match the least it has to, which is nothing in this case.
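To see the effect in isolation, here is a minimal sketch that applies a trailing lazy quantifier and a greedy negated class to a single line from the sample log. The process name is hard-coded as ABC for illustration, and the code assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// One line from the sample log in the question.
string line = "2016-11-28 12:18:59.5286 | 14 | Info | Process Completed -ABC, Status: Failed***";

// A trailing lazy ".*?" is satisfied by zero characters,
// so the match stops right after "ABC".
string lazy = Regex.Match(line, "Process Completed -ABC.*?").Value;

// A greedy "[^\n]*" keeps consuming characters up to the end of the line.
string greedy = Regex.Match(line, "Process Completed -ABC[^\n]*").Value;

Console.WriteLine(lazy);   // Process Completed -ABC
Console.WriteLine(greedy); // Process Completed -ABC, Status: Failed***
```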
Here's a revision of your regex that works:
Regex r = new Regex("[^\n]*?Process Started -"
+ process.Name + "(.*?)Process Completed -"
+ process.Name + "[^\n]*", RegexOptions.Singleline);
Changes I made:
[^\n]* at the beginning and end prevents matching newlines, but still gets the rest of the line.
Extra Info:
I'm not sure how you plan on using this in the context of your code, but if you need to extract all such sections, rather than for one specific process name, you can grab them all at once with this variation:
Regex r = new Regex(@"[^\n]*?Process Started -(\w+)(.*?)Process Completed -\1[^\n]*", RegexOptions.Singleline);
The \1 is a backreference to whatever process name was matched by (\w+). You will end up with a collection of matches, one for each process name.
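As a rough illustration of how that collection might be consumed, the sketch below runs the backreference pattern over the sample log from the question. It assumes the log uses '\n' line endings and that the code is compiled with C# 9 or later (top-level statements). The pattern is written as a verbatim string so that \w and \1 reach the regex engine unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

// Sample log taken from the question, joined with '\n'.
string log = string.Join("\n", new[]
{
    "2016-11-28 12:18:59.5286 | 14 | Info | Process Started -ABC *****",
    "....",
    "..",
    "2016-11-28 12:18:59.5286 | 14 | Info | Process Completed -ABC, Status: Failed***",
    "2016-11-28 13:18:59.5286 | 14 | Info | Process Started -DEF",
    "....",
    "..",
    "2016-11-28 13:18:59.5286 | 14 | Info | Process Completed -DEF Status: Passed***",
});

var r = new Regex(
    @"[^\n]*?Process Started -(\w+)(.*?)Process Completed -\1[^\n]*",
    RegexOptions.Singleline);

MatchCollection matches = r.Matches(log);
foreach (Match m in matches)
{
    // Groups[1] is the process name captured by (\w+).
    Console.WriteLine("Process: " + m.Groups[1].Value);
    // m.Value is the whole block, from the start of the "Started" line
    // to the end of the "Completed" line.
    Console.WriteLine(m.Value);
    Console.WriteLine();
}
```

If the file uses Windows line endings, each matched block will likely end with a stray '\r', which can be trimmed afterwards.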
Upvotes: 2
Reputation: 45135
You'd need to use the Multiline option and then you could do something like this:
var reg = new Regex(@"^.*Process Started -ABC(.*)$(\n^.*$)*?\n(^.*Process Completed -ABC.*)$",
RegexOptions.Multiline);
But it's kind of ugly. As @blaze_125 suggested in the comments, your best bet is probably to divide it into lines and iterate, looking for the Started and Completed strings and then grabbing all the lines in between.
You could do something like:
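var lines = str.Split('\n');
var q = new Queue<string>();
var inBlock = false;
foreach (var l in lines)
{
    // start collecting at the "Process Started" line
    if (l.Contains("Process Started"))
    {
        q.Clear();
        inBlock = true;
    }
    if (!inBlock) continue; // skip lines outside a Started..Completed block
    q.Enqueue(l);
    if (l.Contains("Process Completed")) // you could use a regex here if you want more
                                         // complex matching
    {
        // the queue now holds every line from Started through Completed
        while (q.Count > 0)
        {
            Console.WriteLine(q.Dequeue());
        }
        inBlock = false;
    }
}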
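If the blocks are needed as values for one particular process rather than printed as they are found, the same line-by-line idea could be wrapped in a helper along these lines. This is only a sketch: ExtractBlocks and its parameters are hypothetical names, the marker strings are copied from the sample log, and the code assumes C# 9 or later for top-level statements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: returns one string per Started..Completed block
// for the given process name. Contains is a plain substring test, so
// "ABC" would also match a process named "ABCD".
static List<string> ExtractBlocks(string text, string name)
{
    var blocks = new List<string>();
    var current = new List<string>();
    var inBlock = false;
    foreach (var raw in text.Split('\n'))
    {
        var line = raw.TrimEnd('\r'); // tolerate Windows line endings
        if (line.Contains("Process Started -" + name))
        {
            current.Clear();
            inBlock = true;
        }
        if (!inBlock) continue;
        current.Add(line);
        if (line.Contains("Process Completed -" + name))
        {
            blocks.Add(string.Join("\n", current));
            inBlock = false;
        }
    }
    return blocks;
}

// Sample log taken from the question.
string log = string.Join("\n", new[]
{
    "2016-11-28 12:18:59.5286 | 14 | Info | Process Started -ABC *****",
    "....",
    "..",
    "2016-11-28 12:18:59.5286 | 14 | Info | Process Completed -ABC, Status: Failed***",
    "2016-11-28 13:18:59.5286 | 14 | Info | Process Started -DEF",
    "....",
    "..",
    "2016-11-28 13:18:59.5286 | 14 | Info | Process Completed -DEF Status: Passed***",
});

foreach (var block in ExtractBlocks(log, "ABC"))
{
    Console.WriteLine(block);
}
```

Because every collected line is kept whole, the first and last lines of each block include their timestamps and trailing status text.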
Upvotes: 0