Reputation: 9839

Extracting unique entries from log

I have a log file which prints out lines in the following format:

ERROR [10 Dec 2013 03:57:07] ........ Project ID: [88000317019]......

I want to count the number of unique project IDs which have errored.
Each Project ID may emit an error multiple times.

How do I do it?

Upvotes: 1

Answers (4)

fedorqui

Reputation: 290525

You can use:

awk -F[][] '/ERROR/ {a[$4]++} END{for (i in a) print i, a[i]}' file

Explanation

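-F[][] sets [ and ] as field separators (quote it as -F'[][]' if your shell might expand the brackets as a glob).
/ERROR/ {a[$4]++} builds an array of the form a[key1]=number_of_occurrences_of_key1, a[key2]=number_of_occurrences_of_key2, and so on. /ERROR/ restricts this to lines containing the text ERROR. $4 is the key because, with [ and ] as separators, the project ID inside the second pair of brackets is the 4th field.
END{for (i in a) print i, a[i]} prints each ID with its error count. The number of lines printed is the number of unique project IDs; the order in which for (i in a) visits them is unspecified.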
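
To see why the ID lands in $4, it can help to print every field awk produces for one line. This is only a sketch: it reuses the sample line from the question, and the variable names line and fourth are illustrative.

```shell
# One line in the format shown in the question.
line='ERROR [10 Dec 2013 03:57:07] ........ Project ID: [88000317019]......'

# Split on [ and ] and print each field with its index.
printf '%s\n' "$line" |
  awk -F'[][]' '{ for (i = 1; i <= NF; i++) printf "%d: <%s>\n", i, $i }'

# Capture just the 4th field for later use.
fourth=$(printf '%s\n' "$line" | awk -F'[][]' '{ print $4 }')
echo "$fourth"
```

Field 2 is the timestamp and field 4 is the project ID; the text outside the brackets ends up in fields 1, 3 and 5.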

Test

$ cat a
ERROR [10 Dec 2013 03:57:07] ........ Project ID: [88000317019]......
ERROR [10 Dec 2013 03:57:07] ........ Project ID: [88000317019]......
WARNING [10 Dec 2013 03:57:07] ........ Project ID: [88000317019]......
ERROR [10 Dec 2013 03:57:07] ........ Project ID: [88000317013]......
WARNING [10 Dec 2013 03:57:07] ........ Project ID: [88000317010]......

$ awk -F[][] '/ERROR/ {a[$4]++} END{for (i in a) print i, a[i]}' a
88000317019 2
88000317013 1
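
If only the number of distinct IDs is wanted, as the question asks, the same field splitting can feed a counter instead of a per-ID listing. A minimal sketch, assuming the log format above; the file name sample.log and the names seen, n and unique_count are illustrative.

```shell
# Sample data mirroring the test file above (sample.log is an illustrative name).
cat > sample.log <<'EOF'
ERROR [10 Dec 2013 03:57:07] ........ Project ID: [88000317019]......
ERROR [10 Dec 2013 03:57:07] ........ Project ID: [88000317019]......
WARNING [10 Dec 2013 03:57:07] ........ Project ID: [88000317019]......
ERROR [10 Dec 2013 03:57:07] ........ Project ID: [88000317013]......
WARNING [10 Dec 2013 03:57:07] ........ Project ID: [88000317010]......
EOF

# Increment n only the first time an ID appears on an ERROR line.
# n+0 makes awk print 0 rather than an empty string when nothing matches.
unique_count=$(awk -F'[][]' '/ERROR/ && !seen[$4]++ { n++ } END { print n+0 }' sample.log)
echo "$unique_count"
```

For the sample data this prints 2, matching the two lines in the output above.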

Upvotes: 3

BMW

Reputation: 45353

Another way:

sed -n '/ERROR/{s/.*\[//;s/\].*//p;}' infile | sort | uniq -c | sort -n

The braces make both substitutions apply only to lines matching ERROR.

Upvotes: 0

Chris Seymour

Reputation: 85913

This works whatever text surrounds the match, and it counts only lines where ERROR directly follows the closing bracket of the project ID, as in the sample below:

$ cat file
.............Project ID: [xyz] ERROR...........
.............Project ID: [abc] INFO............
.............Project ID: [abc] ERROR...........
.............Project ID: [xyz] WARNING.........
.............Project ID: [xyz] ERROR...........

$ grep -Po '(?<=Project ID: [[])[^]]+(?=[]] ERROR)' file | sort | uniq -c
      1 abc
      2 xyz

Note: The -P option (Perl-compatible regular expressions) requires GNU grep.
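
Where GNU grep is not available, a similar extraction can be sketched with plain sed, and sort -u | wc -l then gives the number of distinct IDs rather than a per-ID tally. This assumes the same layout as the sample file above, with ERROR directly after the closing bracket; sample.log and unique_count are illustrative names.

```shell
# Sample data in the same layout as the file above (sample.log is an illustrative name).
cat > sample.log <<'EOF'
.............Project ID: [xyz] ERROR...........
.............Project ID: [abc] INFO............
.............Project ID: [abc] ERROR...........
.............Project ID: [xyz] WARNING.........
.............Project ID: [xyz] ERROR...........
EOF

# Keep only the bracketed ID on lines where "] ERROR" follows it,
# then collapse duplicates and count the remaining lines.
unique_count=$(sed -n 's/.*Project ID: \[\([^]]*\)] ERROR.*/\1/p' sample.log | sort -u | wc -l)
echo "$unique_count"
```

Some wc implementations pad the number with leading spaces, so compare it numerically (for example with -eq) rather than as a string.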

Upvotes: 2

Håkon Hægland

Reputation: 40778

You can try:

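awk '
# Only ERROR lines; capture the text inside the brackets after "Project ID: "
/ERROR/ && match($0, /Project ID: \[([^]]+)\]/, a) {
   id[a[1]]++
}
END {
   # Count the distinct keys collected above
   for (i in id)
      q++
   print "Number of unique ids: " (q + 0)
}' log.file

Note: The three-argument form of match() requires GNU awk.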

Upvotes: 0
