Reputation: 23
I have this pattern in a log file:
event y:
event x: specific data A
event y:
event z: count = 1 (or 2, 3, etc)
event y:
event x: specific data B
event y:
event z: count = 0
event y:
The event names represented by x y z are static.
I want to extract the "specific data" that occurs prior to "count = 0". It's close enough for me to extract these lines:
event x: specific data B
event y:
event z: count = 0
The best I can do is this (with the multiline option enabled in EditPad Pro):
event x.+?count = 0
But that gives me too much:
event x: specific data A
event y:
event z: count = 1 (or 2, 3, etc)
event y:
event x: specific data B
event y:
event z: count = 0
Even though it's non-greedy, the match is going back "too far".
How can I get just the following lines?
event x: specific data B
event y:
event z: count = 0
Upvotes: 2
Views: 69
Reputation: 3716
If using grep is an option, it has a -B n argument that tells it to include n lines before each line matching the string/expression you gave it, so grep -B 2 "count = 0" should do it.
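Where grep is not available, the same "n lines of leading context" idea can be sketched in a few lines of Python. This is only an illustration, not grep's implementation: the function name lines_before_match and the sample log are made up for this example, and unlike grep it does not merge overlapping context blocks.

```python
from collections import deque


def lines_before_match(lines, needle, before=2):
    """Yield each line containing `needle`, preceded by up to `before` earlier lines."""
    window = deque(maxlen=before)  # rolling buffer holding the most recent lines
    for line in lines:
        if needle in line:
            yield list(window) + [line]
        window.append(line)


# Sample log modelled on the question (hypothetical data).
log = """event y:
event x: specific data A
event y:
event z: count = 1
event y:
event x: specific data B
event y:
event z: count = 0
event y:""".splitlines()

blocks = list(lines_before_match(log, "count = 0"))
for block in blocks:
    print("\n".join(block))
```

With the sample above, the only block found is the "specific data B" line, the "event y:" line after it, and the "count = 0" line.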
Alternatively, if you want to just use regex, try this:
(?:^.*$\s){2}^.*count = 0
This can be broken up into two parts: (?:^.*$\s){2} and ^.*count = 0
The second part is the regex for "the line containing 'count = 0'".
The first part is the regex for "include the two lines before that", where ^.*$\s is the regex for "a line": the start of a line, followed by any number of characters up to the end of the line, followed by one whitespace character (which by necessity is the line break \n). This relies on ^ and $ matching at line boundaries and on the dot not matching line breaks.
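As a quick check of the pattern, here is a small sketch using Python's re module (an assumption on my part, since the question uses EditPad Pro). The re.MULTILINE flag makes ^ and $ match at line boundaries, and the dot does not match line breaks by default. The sample log is made up to mirror the question.

```python
import re

# Sample log modelled on the question (hypothetical data).
log_text = """event y:
event x: specific data A
event y:
event z: count = 1
event y:
event x: specific data B
event y:
event z: count = 0
event y:"""

# Two whole lines, then the line that contains "count = 0".
two_lines_before = re.compile(r"(?:^.*$\s){2}^.*count = 0", re.MULTILINE)

found = two_lines_before.search(log_text)
result = found.group(0) if found else None
print(result)
```

Because each .* is confined to a single line, the match cannot start earlier than two lines above the "count = 0" line.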
Upvotes: 2
Reputation: 89547
You need to be more explicit. For example:
event x:(?>[^ec]++|\B[ec]|e(?!vent x:)|c(?!ount = 0))++count = 0
pattern details:
event x:
(?> # open an atomic group
[^ec]++ # all characters except e and c one or more times
| # OR
\B[ec] # e or c not preceded by a word boundary
| # OR
e(?!vent x:) # e not followed by "vent x:"
| # OR
c(?!ount = 0) # c not followed by "ount = 0"
)++ # repeat the atomic group one or more times
count = 0
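For engines without atomic groups or possessive quantifiers (Python's re module only gained them in version 3.11), a simpler and slower variant of the same idea can be sketched with a negative lookahead at every position. Note that this is a swapped-in technique, not the pattern above: it only forbids a second "event x:" between the start and "count = 0". The sample log is made up to mirror the question.

```python
import re

# Sample log modelled on the question (hypothetical data).
sample_log = """event y:
event x: specific data A
event y:
event z: count = 1
event y:
event x: specific data B
event y:
event z: count = 0
event y:"""

# After "event x:", consume characters one at a time (including line breaks),
# but never step onto the start of another "event x:", until "count = 0".
no_second_event_x = re.compile(r"event x:(?:(?!event x:)[\s\S])*?count = 0")

hit = no_second_event_x.search(sample_log)
extracted = hit.group(0) if hit else None
print(extracted)
```

An attempt that starts at the first "event x:" fails as soon as it reaches the second one, so the engine moves on and the match begins at "event x: specific data B".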
Upvotes: 2