PraveenKS

Reputation: 1175

Extract multiple lines only if all patterns match in the same order

I am running into a difficulty similar to the one asked about here.

My Linux log file (sample below) contains entries like the following. I'd like to grep the lines 'Total Action Failed :' and 'Total Action Processed:', but only if these two lines are preceded by a line that contains the string '> Processing file: R'.

INF----BusinessLog:08/06/19 20:44:33 > Processing file:  R1111111.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:33 > Data
    =========
    Overview:
        Total Action          : 100
        Total Action Failed   : 0
        Total Action Processed: 100

INF----BusinessLog:08/06/19 20:44:35 > Processing file:  R333333333.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:35 > Data
    =========
    Overview:
        Total Action          : 50
        Total Action Failed   : 0
        Total Action Processed: 50

I tried the pcregrep solution given in the earlier question, as below:

/opt/pdag/bin/pcregrep -M  '> Processing file:  R.*(\n|.)*Total Action Failed   :.*(\n|.)*Total Action Processed:' "$log_path/LogFile.log"

I have two problems with it:

(1) The command returns all the lines that lie between the pattern lines, which I do not want.

(2) If the log file contains entries as below (> Processing file: Z) instead of (> Processing file: R), the pcregrep command does not give an accurate result.

INF----BusinessLog:08/06/19 20:44:33 > Processing file:  R1111111.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:33 > Data
    =========
    Overview:
        Total Action          : 100
        Total Action Failed   : 0
        Total Action Processed: 100

INF----BusinessLog:08/06/19 20:44:35 > Processing file:  Z333333333.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:35 > Data
    =========
    Overview:
        Total Action          : 50
        Total Action Failed   : 0
        Total Action Processed: 50
INF----BusinessLog:08/06/19 20:44:45 > Processing file:  R555555555.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:54 > Data
    =========
    Overview:
        Total Action          : 300
        Total Action Failed   : 45
        Total Action Processed: 300

Can someone help me find a solution to this issue?

I need just the three lines below when all the patterns match in the same order. Note that the number of lines between the first pattern (> Processing file: R) and the second pattern (Total Action Failed :) varies; it will not always be 3 lines.

INF----BusinessLog:08/06/19 20:44:33 > Processing file:  R1111111.R222222222.TEST0107, and creates the reports.
        Total Action Failed   : 0
        Total Action Processed: 50
INF----BusinessLog:08/06/19 20:44:45 > Processing file:  R555555555.R222222222.TEST0107
            Total Action Failed   : 45
            Total Action Processed: 300

Upvotes: 1

Views: 54

Answers (1)

Ed Morton

Reputation: 203189

I think you're getting too hung up on trying to create a regexp that satisfies your requirements when in fact all you really want to do is print the first line and last 2 lines of every block that starts with a line including > Processing file: R. Given that, with any awk in any shell on every UNIX box:

$ awk -v OFS='\n' '
    /> Processing file:[[:space:]]*R/ { if (h) print h, y, z; h=$0 }
    NF { y=z; z=$0 }
    END { print h, y, z }
' file
INF----BusinessLog:08/06/19 20:44:33 > Processing file:  R1111111.R222222222.TEST0107, and creates the reports.
        Total Action Failed   : 0
        Total Action Processed: 50
INF----BusinessLog:08/06/19 20:44:45 > Processing file:  R555555555.R222222222.TEST0107, and creates the reports.
        Total Action Failed   : 45
        Total Action Processed: 300

If that's not what you want then update your question to clarify your requirements and provide an example that the above does not work for and we can post the trivial, portable awk solution for whatever that is instead.
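
If the totals must come from the same block as the matching header, a stricter variant could be sketched as follows. This is only a sketch under assumptions the question does not confirm: the function name extract_r_blocks is made up for illustration, every block is assumed to start with a "> Processing file:" line, and the separator after "file:" is assumed to be plain spaces. Unlike the script above, it would print "Total Action Processed: 100" for the R1111111 block, because the totals of the Z block are discarded rather than carried over.

```shell
# Hypothetical helper (name is illustrative): print the header and the two
# totals lines, but only for blocks whose file name starts with R.
extract_r_blocks() {
    awk '
        # Every "Processing file" header starts a new block. Remember the
        # header only when the file name starts with R; otherwise clear it
        # so the totals of that block are ignored.
        /> Processing file:/ {
            h = ($0 ~ /> Processing file: *R/) ? $0 : ""
            f = ""
            next
        }
        # Inside an R block, hold the Failed line until Processed appears.
        h != "" && /Total Action Failed *:/ { f = $0; next }
        # Print all three lines only once they were seen in this order.
        h != "" && f != "" && /Total Action Processed *:/ {
            print h; print f; print $0
            h = ""; f = ""
        }
    ' "$@"
}
```

It could be called as extract_r_blocks "$log_path/LogFile.log", or fed the log on standard input.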

Upvotes: 1
