Reputation: 1175
I am running into a difficulty similar to the one asked about here.
My Linux log file contains entries like the sample below. I'd like to grep the lines 'Total Action Failed :' and 'Total Action Processed:' only if these two lines are preceded by a line that contains the string '> Processing file: R'.
INF----BusinessLog:08/06/19 20:44:33 > Processing file: R1111111.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:33 > Data
=========
Overview:
Total Action : 100
Total Action Failed : 0
Total Action Processed: 100
INF----BusinessLog:08/06/19 20:44:35 > Processing file: R333333333.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:35 > Data
=========
Overview:
Total Action : 50
Total Action Failed : 0
Total Action Processed: 50
I tried the pcregrep solution given in the earlier question, as below:
/opt/pdag/bin/pcregrep -M '> Processing file: R.*(\n|.)*Total Action Failed :.*(\n|.)*Total Action Processed:' "$log_path/LogFile.log"
I have two problems with it:
(1) The command returns all the lines between the pattern lines, which is not what I want.
(2) If the log file contains entries with '> Processing file: Z' instead of '> Processing file: R', as below, the pcregrep command does not give an accurate result.
INF----BusinessLog:08/06/19 20:44:33 > Processing file: R1111111.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:33 > Data
=========
Overview:
Total Action : 100
Total Action Failed : 0
Total Action Processed: 100
INF----BusinessLog:08/06/19 20:44:35 > Processing file: Z333333333.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:35 > Data
=========
Overview:
Total Action : 50
Total Action Failed : 0
Total Action Processed: 50
INF----BusinessLog:08/06/19 20:44:45 > Processing file: R555555555.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:54 > Data
=========
Overview:
Total Action : 300
Total Action Failed : 45
Total Action Processed: 300
Can someone help me find a solution to this issue?
I need just the three lines below, and only when all the patterns match in the same order. Also, the number of lines between the first pattern '> Processing file: R' and the second pattern 'Total Action Failed :' varies; it will not always be 3 lines.
INF----BusinessLog:08/06/19 20:44:33 > Processing file: R1111111.R222222222.TEST0107, and creates the reports.
Total Action Failed : 0
Total Action Processed: 50
INF----BusinessLog:08/06/19 20:44:45 > Processing file: R555555555.R222222222.TEST0107
Total Action Failed : 45
Total Action Processed: 300
Upvotes: 1
Views: 54
Reputation: 203189
I think you're getting too hung up on trying to create a regexp that satisfies your requirements when in fact all you really want to do is print the first line and last 2 lines of every block that starts with a line including '> Processing file: R'. Given that, with any awk in any shell on every UNIX box:
$ awk -v OFS='\n' '
/> Processing file:[[:space:]]*R/ { if (h) print h, y, z; h=$0 }
NF { y=z; z=$0 }
END { print h, y, z }
' file
INF----BusinessLog:08/06/19 20:44:33 > Processing file: R1111111.R222222222.TEST0107, and creates the reports.
Total Action Failed : 0
Total Action Processed: 50
INF----BusinessLog:08/06/19 20:44:45 > Processing file: R555555555.R222222222.TEST0107, and creates the reports.
Total Action Failed : 45
Total Action Processed: 300
If that's not what you want then update your question to clarify your requirements and provide an example that the above does not work for and we can post the trivial, portable awk solution for whatever that is instead.
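For instance, if the totals should come from the same block as the R header (so a Z block in between is skipped entirely, rather than supplying the "last 2 lines"), a variant could look something like the sketch below. This is only a guess at the requirement, not something stated in the question. The function name r_block_totals is hypothetical, and the sketch assumes the two totals lines always carry the labels 'Total Action Failed' and 'Total Action Processed' in that order.

```shell
# Hypothetical helper (name is an assumption): for each block whose
# "Processing file:" name starts with R, print the header line plus the
# "Total Action Failed" and "Total Action Processed" lines of that same block.
# Usage: r_block_totals LogFile.log
r_block_totals() {
    awk '
        # Any "Processing file:" line starts a new block. Remember the header
        # only when the file name starts with R; otherwise clear it so the
        # totals of that block are ignored.
        /> Processing file:/ {
            hdr = ($0 ~ /> Processing file:[[:space:]]*R/) ? $0 : ""
            failed = ""
            next
        }
        # Inside an R block, hold on to the "Failed" line.
        hdr != "" && /Total Action Failed[[:space:]]*:/ { failed = $0; next }
        # Print all three lines only once "Processed" shows up after "Failed".
        hdr != "" && failed != "" && /Total Action Processed[[:space:]]*:/ {
            print hdr
            print failed
            print $0
            hdr = ""; failed = ""
        }
    ' "$@"
}
```

Run against the second sample in the question, this would report 'Total Action Processed: 100' for the R1111111 block instead of the 50 that belongs to the Z block, so it differs from the expected output as currently posted.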
Upvotes: 1