Reputation: 64874
A log file contains a number of Python tracebacks. I only care about tracebacks raised because of a KevinCustomError. There may be more than one of this class of error in the file.
How can I use grep, another popular Unix command, or a combination thereof to dump the entire traceback for my specific error?
Here's an example log file. I would like lines 1-3 from this file. In the real log file the tracebacks are much longer.
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KevinCustomError: integer division or modulo by zero
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ZeroDivisionError: integer division or modulo by zero
Upvotes: 8
Views: 1722
Reputation: 28416
If I have correctly understood the structure of the Python log file, the following is a tiny sed solution, which is less cryptic than it seems:
#!/usr/bin/sed -f
/^Traceback/{
:here
N
/\nKevinCustomError/b
s/.*\n\(Traceback\)/\1/
b here
}
In summary, the script takes action only on lines starting with Traceback; on these lines, it keeps appending a new line (N) and then checks whether the newly added line starts with KevinCustomError; if so, the script branches to the end and prints the multiline pattern space; if not, it drops everything before the most recent Traceback-starting line from the pattern space (so an unfinished traceback for another error is discarded when a new one begins), branches back to :here, appends another line (N), and so on.
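One caveat worth checking before relying on this: without the -n option, sed also auto-prints every line that falls outside a traceback, and GNU sed prints whatever is left in the pattern space when N runs out of input, so a trailing traceback for a different error can leak into the output. Below is a minimal sketch of a variant that suppresses automatic printing and prints explicitly on a match. It is not the script as posted; the function name kevin_tracebacks and the sample log are made up for illustration.

```shell
# Hypothetical helper: print only the tracebacks that end in KevinCustomError.
kevin_tracebacks() {
    sed -n '
/^Traceback/{
:here
N
/\nKevinCustomError/{
p
b
}
s/.*\n\(Traceback\)/\1/
b here
}' "$1"
}

# Made-up sample: a noise line, a matching traceback, a non-matching traceback.
log=$(mktemp)
printf '%s\n' \
    'some unrelated log line' \
    'Traceback (most recent call last):' \
    '  File "<stdin>", line 1, in ?' \
    'KevinCustomError: integer division or modulo by zero' \
    'Traceback (most recent call last):' \
    '  File "<stdin>", line 1, in ?' \
    'ZeroDivisionError: integer division or modulo by zero' > "$log"

out=$(kevin_tracebacks "$log")
rm -f "$log"
printf '%s\n' "$out"
```

With -n in effect, the only output comes from the explicit p, so the noise line and the unfinished ZeroDivisionError traceback should both be dropped.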
In detail, it works as follows:
1. #!/usr/bin/sed -f is the shebang line, which tells the system to use /usr/bin/sed as the interpreter and to pass the script file to it through the -f option (this allows executing ./script file instead of sed -f script file);
2. /^Traceback/ only matches the lines that start with Traceback;
3. {…} groups the commands that are executed only for the lines matched at step 2:
   1. :here is not a command, but just a label marking the line we can come back to by means of a t (test) or b (branch) command;
   2. N reads the following line of text and appends it to the current pattern space, inserting a newline \n in between, which makes the pattern space multiline;
   3. /\nKevinCustomError/b branches to the end of the script if the pattern space contains KevinCustomError preceded by a \n, thus printing the multiline pattern space, which starts with Traceback, contains (at least) one \n, and contains KevinCustomError just after the (last) \n;
   4. s/.*\n\(Traceback\)/\1/ (we get here if the pattern in sub-step 3 did not match) deletes the leading part of the pattern space up to and including the last \n that is followed by Traceback;
   5. b here branches to :here, and no printing occurs at this point.
Upvotes: 1
Reputation: 5186
I've used something like this before for a similar issue. As a bonus, if you have multiple occurrences of KevinCustomError it will extract the traceback for each one.
#!/bin/bash
INPUT=$1
TOP='Traceback'
BOTTOM='KevinCustomError'
grep -n "$BOTTOM" "$INPUT" | while read -r match
do
    # This gets just the line number from the grep command
    END=${match%%:*}
    # Gets just the part of the file up to END, flips it,
    # then gets the line number of the first (nearest) TOP
    TEMP=$(head -n "$END" "$INPUT" | tac | grep -n -m 1 "$TOP")
    # TEMP is really the number of lines from END to Traceback
    START=$((END - ${TEMP%%:*} + 1))
    echo "$START $END"
    sed -n "${START},${END}p" < "$INPUT"
done
Output when run on your test data with the tracebacks flipped (because it is more interesting):
4 6
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KevinCustomError: integer division or modulo by zero
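The two least obvious steps in the script are the ${match%%:*} expansion, which removes everything from the first colon onward and so keeps only the line number that grep -n prints, and the reversed search, where the position of the first Traceback in the flipped text is its distance back from END. Here is a small sketch of just that arithmetic, with the values the flipped test data would produce hard-coded as assumptions rather than read from a file:

```shell
# grep -n prefixes each match with "<line number>:".
match='6:KevinCustomError: integer division or modulo by zero'
END=${match%%:*}    # strip the longest suffix that starts at a colon

# Assume the reversed text had Traceback on its 3rd line.
TEMP='3:Traceback (most recent call last):'
START=$((END - ${TEMP%%:*} + 1))

echo "$START $END"
```

This should print 4 6, matching the first line of the output above.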
Upvotes: 0
Reputation: 77105
Here is another way with awk:
awk '
/^KevinCustomError/ { for(;x<length(arry);) print arry[++x]; print $0 }
/^Traceback/ { delete arry; i=x=0 }
{ arry[++i]=$0 }
' logFile
Upvotes: 0
Reputation: 70562
Here's an AWK script I tried whipping together.
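awk '{a[NR]=$0}; /KevinCustomError/ {for(i=0; i<NR-1 && a[NR-i] !~ /Traceback/; i++) {} i++; while(i-- > 0) {print a[NR-i]}}' logfile
Or, in file form.
# Remember every line read so far, indexed by line number.
{ a[NR] = $0 }

$0 ~ /KevinCustomError/ {
    # Search backwards for the nearest "Traceback" line,
    # stopping at line 1 if there is none.
    for (i = 0; i < NR - 1 && a[NR-i] !~ /Traceback/; i++) {}
    i++
    # Print from the Traceback line down to the current line.
    # (Testing i-- > 0 rather than >= 0 avoids printing a[NR+1],
    # which would add an empty line after each traceback.)
    while (i-- > 0) {
        print a[NR-i]
    }
}
Used like: awk -f logscript.awk logfile.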
Not too familiar with AWK, so any criticism is welcome. Basically, it keeps track of all lines read so far, and just searches backwards to find a "Traceback" token (which you can replace if you'd like), and then prints everything in between (in the correct order).
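Since the start and end tokens are meant to be replaceable, one option is to pass them in as awk variables instead of editing the script. The following is only a sketch of that idea: the function name extract_blocks and its argument order are made up here, and it walks backwards with an explicit stop at line 1 rather than reusing the loop above.

```shell
# Hypothetical helper. usage: extract_blocks FILE TOP_REGEX BOTTOM_REGEX
# Prints every block that ends at a BOTTOM line, starting from the nearest
# preceding TOP line.
extract_blocks() {
    awk -v top="$2" -v bottom="$3" '
        { a[NR] = $0 }                  # remember every line read so far
        $0 ~ bottom {
            start = NR
            # walk backwards to the nearest TOP line, stopping at line 1
            while (start > 1 && a[start] !~ top) start--
            for (n = start; n <= NR; n++) print a[n]
        }
    ' "$1"
}

# Made-up sample with the tracebacks flipped, as in the earlier answer.
log=$(mktemp)
printf '%s\n' \
    'Traceback (most recent call last):' \
    '  File "<stdin>", line 1, in ?' \
    'ZeroDivisionError: integer division or modulo by zero' \
    'Traceback (most recent call last):' \
    '  File "<stdin>", line 1, in ?' \
    'KevinCustomError: integer division or modulo by zero' > "$log"

out=$(extract_blocks "$log" '^Traceback' '^KevinCustomError')
rm -f "$log"
printf '%s\n' "$out"
```

Anchoring both patterns with ^ keeps a message that merely mentions one of the tokens from being mistaken for the start or end of a traceback.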
Upvotes: 2