Reputation: 2673
I recently asked for help to parse out Java error stacks from a group of log files and got a very nice solution at the link below (using awk).
Pull out Java error stacks from log files
I marked the question answered and after some debugging and studying I found a few potential issues and since they are unrelated to my initial question but rather due to my limited understanding of awk and regular expressions, I thought it might be better to ask a new question.
Here is the solution:
BEGIN{ OFS="," }
/[[:space:]]+*<Error / {
split("",n2v)
while ( match($0,/[^[:space:]]+="[^"]+/) ) {
name = value = substr($0,RSTART,RLENGTH)
sub(/=.*/,"",name)
sub(/^[^=]+="/,"",value)
$0 = substr($0,RSTART+RLENGTH)
n2v[name] = value
print name value
}
code = n2v["ErrorCode"]
desc[code] = n2v["ErrorDescription"]
count[code]++
if (!seen[code,FILENAME]++) {
fnames[code] = (code in fnames ? fnames[code] ", " : "") FILENAME
}
}
END {
print "Count", "ErrorCode", "ErrorDescription", "Files"
for (code in desc) {
print count[code], code, desc[code], fnames[code]
}
}
One issue I am having with it is that not all ErrorDescriptions are being captured. For example, this error description appears in the output of this script:
ErrorDescription="Database Error."
But this error description does not appear in the results (description copied from actual log file):
ErrorDescription="Operation not allowed for reason code "7" on table "SCHEMA.TABLE".. SQLCODE=-668, SQLSTATE=57016, DRIVER=4.13.127"
Nor does this one:
ErrorDescription="Cannot Find Person For Given Order."
It seems that most error descriptions are not being returned by this script but do exist in the log file. I don't see why some error descriptions would appear and some not. Does anyone have any ideas?
EDIT 1:
Here is a sample of the XML I am parsing:
<Errors>
<Error ErrorCode="ERR_0139"
ErrorDescription="Cannot Find Person For Given Order." ErrorMoreInfo="">
...
...
</Error>
</Errors>
Upvotes: 0
Views: 327
Reputation: 54593
The pattern in the script will not match your data:
/[[:space:]]+*<Error / {
Details:
This would be a better pattern:
/^[[:space:]]*ErrorDescription[[:space:]]*=[[:space:]]*".*"/
Upvotes: 1
Reputation: 1944
This regex would only match the error description.
ErrorDescription="(.+?)"
It uses a capturing group to remember your error description.
Upvotes: 1