Reputation: 417

How to sort and count wrt a string

This is my input file.

yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *2222222222 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *3333333333 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]
yyyy-mm-dd hh:mm:ss string *6666666666 [AAAAA]

Assume the above input is stored in input.gz. How can I count the lines that contain *9999999999 and have [AAAAA] as the last column?

I need a script using SED or AWK or GREP.

Expected output should be:

What if the last column of some lines continues onto a new line, like this:

yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA  
zzzzzzzzzzzz xxxxxxxx yy]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *2222222222 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *3333333333 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]
yyyy-mm-dd hh:mm:ss string *6666666666 [AAAAA]

In the above case, won't it be difficult to use AWK? How can this be handled with SED?

I'm sorry for editing it again. What if the 10-digit number is unknown? That is, if *9999999999 is not known in advance, can we find out how many times each *NNNNNNNNNN occurs with [AAAAA] as the last column?

Upvotes: 1

Answers (4)

F. Knorr

Reputation: 3055

Try this:

 awk '$NF ~ /\[A+\]/ && $(NF-1) ~ /\*9+/' input | wc -l

For the sake of simplicity, I use the wc command to do the counting. Of course, this could be done in awk, too:

 awk '$NF ~ /\[A+\]/ && $(NF-1) ~ /\*9+/ {counter++} END {print counter}' input

Update: How to list the number of occurrences for each number

 awk '$NF ~ /\[A+\]/{ar[$(NF-1)]++}END{for(key in ar){print key,ar[key]}}' input

Output:

*2222222222 1
*6666666666 1
*5555555555 1
*3333333333 1
*9999999999 5
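The order in which awk walks an array with `for (key in ar)` is unspecified, so the listing above can come out in any order. Since the title asks about sorting, here is a minimal sketch that pipes the counts through `sort`. The file name `sample.log` and the function name `count_sorted` are made up for illustration, and the last field is compared with the exact string `[AAAAA]` rather than with a regex.

```shell
# Sample data copied from the question (sample.log is a hypothetical file name).
cat > sample.log <<'EOF'
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *2222222222 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *3333333333 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]
yyyy-mm-dd hh:mm:ss string *6666666666 [AAAAA]
EOF

# Hypothetical helper: count every number whose last field is exactly [AAAAA],
# then sort by count (highest first) and by number to break ties.
count_sorted() {
  awk '$NF == "[AAAAA]" { n[$(NF-1)]++ }
       END { for (k in n) print k, n[k] }' "$1" |
    sort -k2,2nr -k1,1
}

count_sorted sample.log
```

With the question's data, the `*9999999999 5` line comes first and the four numbers that occur once follow it.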

Upvotes: 1

karakfa

Reputation: 67467

awk to the rescue!

$ awk -v key='*9999999999' '$NF=="[AAAAA]" && $(NF-1)==key {c++} END{print c}' file
5

If the last field is split across two lines, then by definition it won't be equal to "[AAAAA]", so those records are not counted.
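If wrapped records should be counted as well, one possible approach is to glue lines together until a closing bracket shows up and only then test the record. This is a sketch under two assumptions that the question does not confirm: every record ends with `]` at the end of a line, and a wrapped record counts when its bracketed text starts with `AAAAA`. The names `wrapped.log` and `count_wrapped` are made up.

```shell
# Small sample with one wrapped record (wrapped.log is a hypothetical file name).
cat > wrapped.log <<'EOF'
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA
zzzzzzzzzzzz xxxxxxxx yy]
yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]
EOF

# Hypothetical helper: $1 is the number to look for, $2 is the file.
count_wrapped() {
  awk -v key="$1" '
    # Append the current line to the pending record.
    { rec = (rec == "" ? $0 : rec " " $0) }

    # A line ending in "]" completes the record (assumption, see above).
    /\][ \t]*$/ {
      # index() does a plain substring search, so "*" needs no escaping.
      if (index(rec, " " key " [AAAAA")) c++
      rec = ""
    }

    END { print c + 0 }   # print 0 rather than an empty line when nothing matched
  ' "$2"
}

count_wrapped '*9999999999' wrapped.log
```

On this sample the function prints 2: the first line and the wrapped record match, while the `[BBBBB]` line does not.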

Upvotes: 0

Walter A

Reputation: 19982

Just with one grep:

grep -c "\*9999999999.*\[AAAAA\]$" inputfile

When a record is sometimes split over two lines but [AAAAA is still on the first line, you can try:

grep -c "\*9999999999.*\[AAAAA" inputfile

Upvotes: 0

Drew Varner

Reputation: 56

cat input_file | grep '[*]9999999999 \[AAAAA\]$' | wc -l
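The question says the data lives in input.gz, so the file has to be decompressed before grep sees it. A minimal sketch of that, assuming `gzip` is installed: `gzip -dc` writes the decompressed text to stdout, and `grep -c` does the counting without a separate `wc -l`. The file name `sample_input.gz` and the function name `count_gz` are invented for the example.

```shell
# Build a small compressed sample (sample_input.gz is a hypothetical file name).
printf '%s\n' \
  'yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]' \
  'yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]' \
  'yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]' \
  'yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]' |
  gzip -c > sample_input.gz

# Hypothetical helper: decompress to stdout and count the matching lines.
count_gz() {
  gzip -dc "$1" | grep -c '[*]9999999999 \[AAAAA\]$'
}

count_gz sample_input.gz
```

On this sample the function prints 2, because the `[BBBBB]` line and the `*5555555555` line are skipped.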

Upvotes: 2
