user3170654

Reputation: 23

regex to pull sections out of log file

Given this pattern in a log file:

event y:  
event x: specific data A  
event y:  
event z: count = 1 (or 2, 3, etc)  
event y:  
event x: specific data B  
event y:  
event z: count = 0  
event y:  

The event names represented by x, y, and z are static.

I want to extract the "specific data" that occurs just before "count = 0". It's close enough for me to extract these lines:

event x: specific data B  
event y:  
event z: count = 0 

The best I can do is this (with the multiline option enabled in EditPad Pro):

event x.+?count = 0

But that gives me too much:

event x: specific data A  
event y:  
event z: count = 1 (or 2, 3, etc)  
event y:  
event x: specific data B  
event y:  
event z: count = 0 

Even though it's non-greedy, the match goes back "too far".

How can I get just the following lines?

event x: specific data B  
event y:  
event z: count = 0 

Upvotes: 2

Views: 69

Answers (2)

amnn

Reputation: 3716

If using grep is an option, it has a -B n argument that tells it to include the n lines before each line matching the string or expression you gave it, so grep -B 2 "count = 0" should do it.
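
If grep isn't available, the same "matching line plus the two lines before it" idea can be sketched in a few lines of Python. This is only a rough stand-in for grep -B 2, not a reimplementation: the function name lines_before_match and the LOG sample are made up for illustration, and unlike grep it does not merge overlapping context blocks.

```python
# Sample log, copied from the question (LOG is a hypothetical name).
LOG = """\
event y:
event x: specific data A
event y:
event z: count = 1
event y:
event x: specific data B
event y:
event z: count = 0
event y:
"""


def lines_before_match(text, needle, before=2):
    """Return, for each line containing `needle`, that line plus the
    `before` lines that precede it (a rough stand-in for grep -B)."""
    lines = text.splitlines()
    blocks = []
    for i, line in enumerate(lines):
        if needle in line:
            # max() keeps the slice valid when the match is near the top.
            blocks.append(lines[max(0, i - before): i + 1])
    return blocks


blocks = lines_before_match(LOG, "count = 0")
for block in blocks:
    print("\n".join(block))
```

Note that this relies on "event x" always sitting exactly two lines above the "count = 0" line, the same assumption grep -B 2 makes.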

Alternatively, if you want to just use regex, try this:

(?:^.*$\s){2}^.*count = 0

This can be broken up into two bits: (?:^.*$\s){2} and ^.*count = 0

The second part is quite obviously regex for "the line containing 'count = 0'".

The first part is the regex for "include two lines before that", where ^.*$\s is the regex for "a line": specifically, the start of a line, followed by any number of characters up to the end of that line, followed by one whitespace character (which by necessity would be the line break, \n). This assumes ^ and $ match at line breaks and that the dot does not match a line break.
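
For anyone who wants to try the pattern outside an editor, here is a minimal sketch using Python's re module. It assumes re.MULTILINE so that ^ and $ match at line breaks; by default the dot does not match a newline in Python, which is what the pattern needs. The LOG sample is copied from the question.

```python
import re

# Sample log from the question.
LOG = """\
event y:
event x: specific data A
event y:
event z: count = 1
event y:
event x: specific data B
event y:
event z: count = 0
event y:
"""

# Two whole lines, then a line containing "count = 0".
# re.MULTILINE makes ^ and $ match at the start and end of every line.
pattern = re.compile(r"(?:^.*$\s){2}^.*count = 0", re.MULTILINE)

match = pattern.search(LOG)
if match:
    print(match.group(0))
```

On this sample the match starts at "event x: specific data B" and ends at "count = 0".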

Upvotes: 2

Casimir et Hippolyte

Reputation: 89547

You need to be more explicit. For example:

event x:(?>[^ec]++|\B[ec]|e(?!vent x:)|c(?!ount = 0))++count = 0

pattern details:

event x: 
(?>                # open an atomic group
    [^ec]++        # all characters except e and c one or more times
  |                # OR
    \B[ec]         # e or c not preceded by a word boundary
  |                # OR
    e(?!vent x:)   # e not followed by "vent x:"
  |                # OR
    c(?!ount = 0)  # c not followed by "ount = 0"
)++                # repeat the atomic group one or more times
count = 0          
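
The pattern above uses possessive quantifiers and an atomic group, which Python's re module only accepts from version 3.11 onwards. As a sketch of the same idea (do not let the match run across another "event x:"), here is a simpler and slower variant that swaps in a negative lookahead in front of every character. This is a different technique from the one above, not a translation of it, and the LOG sample is copied from the question.

```python
import re

# Sample log from the question.
LOG = """\
event y:
event x: specific data A
event y:
event z: count = 1
event y:
event x: specific data B
event y:
event z: count = 0
event y:
"""

# After "event x:", consume one character at a time, but only while the
# current position is not the start of another "event x:".
# re.DOTALL lets the dot cross line breaks.
pattern = re.compile(r"event x:(?:(?!event x:).)*?count = 0", re.DOTALL)

matches = pattern.findall(LOG)
for m in matches:
    print(m)
```

Starting from the first "event x:" the attempt fails, because it would have to cross the second "event x:" to reach "count = 0", so the only match begins at "event x: specific data B".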

Upvotes: 2
