Matt

Reputation: 2673

Grepping and grouping for errors in log files

I'm on Linux (and sometimes on AIX) and have a bunch of log files in a folder. I have a grep command that filters out all of the ERROR lines, producing output in the following format.

CreateOrder_hostname_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_CreateOrder: [1443555726715] Error1  [system]: Class1
CreateOrder_hostname_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-15_CreateOrder: [1443555726715] Error1  [system]: Class1
CreateOrder_hostname_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-28_CreateOrder: [1443555726715] Error2  [system]: Class2
ScheduleOrder_hostname_tee.log:2015-09-30 03:55:05,011:ERROR  :Thread-5_ScheduleOrder: [1443599705009] Error3  [system]: Class3

Is it possible using some combination of grep/awk/sed to get the above data in a format like this?

API: Error: Count
CreateOrder: Error1: 50
CreateOrder: Error2: 50
ScheduleOrder: Error3: 50

If not, would it be possible to get output in a format like this? Then I could use wc or similar to count the distinct errors.

API: Date: Error
CreateOrder: 2015-09-29 15:42:06,715: Error1
CreateOrder: 2015-09-29 15:42:06,715: Error2
ScheduleOrder: 2015-09-29 15:42:06,715: Error3

EDIT 1:

The error could be any string (including spaces). Basically, anything between the brackets in the example below should be displayed.

[1443555726715] Error1: This is an error with description.  [system]: Class1
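For reference, a minimal awk sketch that should produce the first format (grouped counts) is shown below. It assumes the grep output above is piped in, that the API name is the file-name part before the first underscore, and that the error text is everything between the "] " after the epoch stamp and the next bracketed field; your-grep-command is just a placeholder for the existing grep command, and the summary lines come out in no particular order:

your-grep-command | awk '
{
    api = $0
    sub(/_.*/, "", api)                 # API = file-name part before the first "_"
    err = $0
    sub(/^[^]]*\] */, "", err)          # drop everything up to the "] " after the epoch stamp
    sub(/ *\[[^]]*\]:.*$/, "", err)     # drop the trailing " [system]: ClassN" part
    count[api ": " err]++
}
END {
    print "API: Error: Count"
    for (k in count) print k ": " count[k]
}'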

Upvotes: 1

Views: 1532

Answers (3)

Chris Koknat

Reputation: 3451

This solution sorts the output alphabetically by API.
At the beginning, it prints the header line.
Looping over each line, it searches for a /regular expression/.
If a match is found, it stores the result in a hash.
At the end, it sorts the keys of the hash and prints the results.

perl -lane 'BEGIN{print "API: Error: Count"} if(/^([^_]+).*\]\s*(Error[^\[]+)\[/){$h{"$1: $2:"}++} END{for $k (sort keys %h){ print "$k $h{$k}"}}' log

input:

CreateOrder_hostname_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-26_CreateOrder: [1443555726715] Error1  [system]: Class1
CreateOrder_hostname_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-15_CreateOrder: [1443555726715] Error1  [system]: Class1
CreateOrder_hostname_tee.log:2015-09-29 15:42:06,715:ERROR  :Thread-28_CreateOrder: [1443555726715] Error2  [system]: Class2
ScheduleOrder_hostname_tee.log:2015-09-30 03:55:05,011:ERROR  :Thread-5_ScheduleOrder: [1443599705009] Error3  [system]: Class3
ScheduleOrder_hostname_tee.log:2015-09-30 03:55:05,011:ERROR  :Thread-5_ScheduleOrder: [1443555726715] Error1: This is an error with description.  [system]: Class1

output:

API: Error: Count
CreateOrder: Error1  : 2
CreateOrder: Error2  : 1
ScheduleOrder: Error1: This is an error with description.  : 1
ScheduleOrder: Error3  : 1
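The file argument is not required; the output of the existing grep command (your-grep-command below is just a placeholder for it) can also be piped straight into the one-liner:

your-grep-command | perl -lane 'BEGIN{print "API: Error: Count"} if(/^([^_]+).*\]\s*(Error[^\[]+)\[/){$h{"$1: $2:"}++} END{for $k (sort keys %h){ print "$k $h{$k}"}}'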

Upvotes: 0

Alfe

Reputation: 59586

# output of the grep command from the question
input=$(your grep command)
# Reduce every line to "API: error text" (the file-name part before the first "_",
# then whatever sits between the "] " after the timestamp and the next "[").
formatted=$(
  echo "$input" |
    sed 's/^\([^_]*\).*[0-9]*\] \([^[]*[^\[ ]\).*/\1: \2/'
)
# one line per distinct "API: error" combination
kinds=$(echo "$formatted" | sort -u)
# count how often each combination occurs (-F -x: match the whole line literally)
while IFS= read -r kind
do
  count=$(echo "$formatted" | grep -Fxc -- "$kind")
  echo "$kind: $count"
done <<< "$kinds"

For the input given in your question, this produces the following output:

CreateOrder: Error1: 2
CreateOrder: Error2: 1
ScheduleOrder: Error3: 1

Everything is done in memory, so it might not be feasible for very large inputs (dozens or hundreds of megabytes). In that case you can use temporary files instead of shell variables (e.g. echo "$input" | sed … > formatted.tmp and sort -u formatted.tmp > kinds.tmp etc.).
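A minimal sketch of that temp-file variant (the .tmp file names are just examples) could look like this:

your grep command > errors.tmp
sed 's/^\([^_]*\).*[0-9]*\] \([^[]*[^\[ ]\).*/\1: \2/' errors.tmp > formatted.tmp
sort -u formatted.tmp > kinds.tmp
while IFS= read -r kind
do
  count=$(grep -Fxc -- "$kind" formatted.tmp)
  echo "$kind: $count"
done < kinds.tmp
rm -f errors.tmp formatted.tmp kinds.tmp   # clean up the temporary files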

Upvotes: 2

rkachach

Reputation: 17375

The following is a simple bash script to which you can easily add new patterns. The usage is:

myscript.sh logfile

Script code:

#!/bin/bash

# Each pattern is an (API, error) pair.
PATTERN_1=(CreateOrder Error1)
PATTERN_2=(CreateOrder Error2)
PATTERN_3=(ScheduleOrder Error3)

# $1 = API, $2 = error, $3 = log file; counts the lines containing both, in that order.
function get_pattern_count {
    COUNT=$(grep -Ec ".+$1.+$2.+" "$3")
    echo "$1 : $2 : $COUNT"
}

echo "API: Error: Count"
get_pattern_count "${PATTERN_1[0]}" "${PATTERN_1[1]}" "$1"
get_pattern_count "${PATTERN_2[0]}" "${PATTERN_2[1]}" "$1"
get_pattern_count "${PATTERN_3[0]}" "${PATTERN_3[1]}" "$1"
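To track another error, add one more array and a matching call, e.g. for a hypothetical Error4 on ScheduleOrder:

PATTERN_4=(ScheduleOrder Error4)                              # next to the other PATTERN_* arrays
get_pattern_count "${PATTERN_4[0]}" "${PATTERN_4[1]}" "$1"    # next to the other calls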

Upvotes: 0
