Matt

Reputation: 2673

awk/regex: parsing error logs does not always return the error description

I recently asked for help to parse out Java error stacks from a group of log files and got a very nice solution at the link below (using awk).

Pull out Java error stacks from log files

I marked that question as answered, but after some debugging and study I found a few potential issues. Since they stem from my limited understanding of awk and regular expressions rather than from my original question, I thought it better to ask a new one.

Here is the solution:

BEGIN{ OFS="," }
/[[:space:]]+*<Error / {
    split("",n2v)
    while ( match($0,/[^[:space:]]+="[^"]+/) ) {
        name = value = substr($0,RSTART,RLENGTH)
        sub(/=.*/,"",name)
        sub(/^[^=]+="/,"",value)
        $0 = substr($0,RSTART+RLENGTH)
        n2v[name] = value
        print name value
    }
    code = n2v["ErrorCode"]
    desc[code] = n2v["ErrorDescription"]
    count[code]++
    if (!seen[code,FILENAME]++) {
        fnames[code] = (code in fnames ? fnames[code] ", " : "") FILENAME
    }
}
END {
    print "Count", "ErrorCode", "ErrorDescription", "Files"
    for (code in desc) {
        print count[code], code, desc[code], fnames[code]
    }
}

One issue I am having with it is that not all ErrorDescriptions are being captured. For example, this error description appears in the output of this script:

ErrorDescription="Database Error."

But this error description does not appear in the results (description copied from actual log file):

ErrorDescription="Operation not allowed for reason code &quot;7&quot; on table &quot;SCHEMA.TABLE&quot;.. SQLCODE=-668, SQLSTATE=57016, DRIVER=4.13.127"

Nor does this one:

ErrorDescription="Cannot Find Person For Given Order."

It seems that most error descriptions are not being returned by this script but do exist in the log file. I don't see why some error descriptions would appear and some not. Does anyone have any ideas?

EDIT 1:

Here is a sample of the XML I am parsing:

    <Errors>
        <Error ErrorCode="ERR_0139"
            ErrorDescription="Cannot Find Person For Given Order." ErrorMoreInfo="">
            ...
            ...
        </Error>
    </Errors>

Upvotes: 0

Views: 327

Answers (2)

Thomas Dickey

Reputation: 54593

The pattern in the script will not match your data:

/[[:space:]]+*<Error / {

Details:

  • The "+" tells it to match at least one space.
  • The space after "Error" tells it to match another space - but your data has no space before the "=".
  • The "<" is unnecessary (but not part of the problem).

This would be a better pattern:

/^[[:space:]]*ErrorDescription[[:space:]]*=[[:space:]]*".*"/
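As a minimal sketch of how that pattern could be used, assuming (as in the sample from EDIT 1) that ErrorDescription starts its own line: awk reads one line at a time, so a rule that fires on the `<Error` line never sees an attribute that sits on the following line. The function name `extract_descriptions` is made up for illustration and is not part of the original script.

```shell
# Hypothetical helper: print the value of every ErrorDescription attribute
# that begins a line of the input read from stdin.
extract_descriptions() {
    awk '
    /^[[:space:]]*ErrorDescription[[:space:]]*=[[:space:]]*".*"/ {
        line = $0
        sub(/^[^"]*"/, "", line)   # drop everything up to and including the opening quote
        sub(/".*$/, "", line)      # drop the closing quote and anything after it
        print line
    }'
}
```

Fed the two attribute lines from the sample XML, this prints `Cannot Find Person For Given Order.`. It will not find a description that shares a line with `<Error`, so it is only a starting point.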

Upvotes: 1

Amen Jlili

Reputation: 1944

This regex would only match the error description.

ErrorDescription="(.+?)"

It uses a capturing group to remember your error description.
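One caveat if you stay in awk: POSIX extended regular expressions have no lazy quantifier, so `.+?` does not behave there as it does in PCRE-style engines, and POSIX awk's `match()` does not expose capture groups. A rough equivalent, sketched under those assumptions, uses `[^"]*` to stop at the closing quote and `substr()` with `RSTART`/`RLENGTH` to pull out the value. The function name `capture_description` is hypothetical.

```shell
# Hypothetical helper: print the ErrorDescription value from every line
# that contains one, wherever it appears on the line.
capture_description() {
    awk '
    match($0, /ErrorDescription="[^"]*"/) {
        n = length("ErrorDescription=\"")   # length of the attribute prefix
        # skip the prefix and leave off the closing quote
        print substr($0, RSTART + n, RLENGTH - n - 1)
    }'
}
```

Because the log writes embedded quotes as `&quot;`, the `[^"]*` class should still capture the whole SQLCODE description in your example; this only finds the first ErrorDescription on a line.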

Demo here. (Tested against a combination of your edit and your previous question error log.)

Upvotes: 1
