Reputation: 1386
I have two text files: the first contains 3rd party names, and the second contains message statuses such as sent, failed, technical errors, etc.
I want to search a log file for each 3rd party name (from the first file) and get a count of each message status (listed in the second file).
Example of the first file.txt (3rd party names):
BNF_IPL
one97
pajwok
RadioAzadi
SPICDIGITAL
U2OPIA
UNIFUN
UNIFUNRS
vectracom
VNTAF
YRMP
INFOTT
second file.txt (message status):
success
partial
failed
Error absentSubscriber
UnknownSubscriber
smDeliveryFailure
userSpecificReason
CallBarred
systemFailure
My goal is to produce a report containing the total count of each status for each 3rd party, something like:
sent | failed | TechError | Absent | subscriber
IBM someValue someValue someValue someValue someValue
Microsoft someValue someValue someValue someValue someValue
Oracle someValue someValue someValue someValue someValue
google someValue someValue someValue someValue someValue
To get the values, I will grep for those names and statuses in a log file and get the totals. For that I am trying to use a nested loop, but with no luck. Something like:
for ((i = 0; i < wc -l 3rdPList.txt ; i++)); do
for ((j = i; j < wc -l status.txt ; j++)); do
grep 3rdPList.txt logFile | grep status.txt | wc -l > outputFile.txt
echo $st[j]
done
done
example of the log file:
2018-10-30 00:07:19,640 DEBUG [org.mobicents.smsc.library.CdrGenerator] 2018-10-29 14:42:45,789 +0430,588,5,0,93706315646,1,1,temp_failed,BNF_IPL,26674477,0702700006,412012004908984,null,ایید.,Error absentSubscriber after MtForwardSM Request: MAPErrorMessageAbsentSubscriber []
2018-10-30 00:07:41,034 DEBUG [org.mobicents.smsc.library.CdrGenerator] 2018-10-29 16:21:27,260 +0430,588,5,0,0700375593,1,1,temp_failed,BNF_IPL,27008401,null,null,null,عدد1 را به588 ارسال ,AbsentSubscriber response from HLR: MAPErrorMessageAbsentSubscriber []
Upvotes: 0
Views: 70
Reputation: 207425
This does pretty much what you ask, but I didn't work too much on pretty formatting!
{ sed 's/^/1,/' 1.txt; sed 's/^/2,/' 2.txt; cat log.txt; } | awk -F, '$1==1{c=substr($0,3);cc[c]++;next} $1==2{s=substr($0,3); ss[s]++;next} {s=$10;c=$11;res[c SEP s]++} END{for(s in ss){printf("%s ",s)};printf("\n");for(c in cc){printf("%s ",c);for(s in ss){printf("%d ",res[c SEP s]+0)}printf("\n")}}'
Sample Output
systemFailure temp_failed CallBarred userSpecificReason smDeliveryFailure UnknownSubscriber Error absentSubscriber partial success
pajwok 0 0 0 0 0 0 0 0 0
SPICDIGITAL 0 0 0 0 0 0 0 0 0
YRMP 0 0 0 0 0 0 0 0 0
UNIFUN 0 0 3 0 0 0 0 0 0
U2OPIA 0 0 0 0 0 0 0 0 0
UNIFUNRS 0 0 0 0 0 0 0 0 0
RadioAzadi 0 0 0 0 0 0 0 0 0
one97 0 0 0 0 0 0 0 0 0
BNF_IPL 0 2 0 0 0 0 0 0 0
VNTAF 0 0 0 0 0 0 0 0 0
INFOTT 0 0 0 0 0 0 0 0 0
vectracom 0 0 0 0 0 0 0 0 0
If you want to understand it, try running the parts separately. So, for the first part, I prefix all the company names with a 1 so that awk can differentiate them from status codes and log lines:
sed 's/^/1,/' 1.txt
Output
1,BNF_IPL
1,one97
1,pajwok
1,RadioAzadi
1,SPICDIGITAL
1,U2OPIA
1,UNIFUN
1,UNIFUNRS
1,vectracom
1,VNTAF
1,YRMP
1,INFOTT
Then, I prefix all the status messages with a 2 so that awk can differentiate those from company names and log lines:
sed 's/^/2,/' 2.txt
Output
2,success
2,partial
2,temp_failed
2,Error absentSubscriber
2,UnknownSubscriber
2,smDeliveryFailure
2,userSpecificReason
2,CallBarred
2,systemFailure
Then I cat the log file into awk:
cat log.txt
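To see why the script reads the status from field 10 and the company from field 11, a single log line can be split on commas. This is only a small sketch: it uses one line from the question, with the message text shortened to the placeholder "msg", and the variable names are made up for illustration.

```shell
# One log line copied from the question (message text replaced by the placeholder "msg").
line='2018-10-30 00:07:19,640 DEBUG [org.mobicents.smsc.library.CdrGenerator] 2018-10-29 14:42:45,789 +0430,588,5,0,93706315646,1,1,temp_failed,BNF_IPL,26674477,0702700006,412012004908984,null,msg,Error absentSubscriber after MtForwardSM Request'

# Split on commas and print the status (field 10) and the company (field 11).
fields=$(printf '%s\n' "$line" | awk -F, '{print $10, $11}')
echo "$fields"    # prints: temp_failed BNF_IPL
```

Note that the commas inside the two timestamps count as separators too, which is why the status ends up as late as field 10.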
The awk script can be written across multiple lines and commented:
{ sed ...; sed ...; cat ...; } | awk -F, '
$1==1 {c=substr($0,3); cc[c]++; next}   # Process company name from "1.txt": "c" holds the name, "cc[]" is an array of names
$1==2 {s=substr($0,3); ss[s]++; next}   # Process status code from "2.txt": "s" holds the status, "ss[]" is an array of statuses
      {s=$10; c=$11; res[c SEP s]++}    # Process line from log: status is field 10, company is field 11. Increment results array "res[]"
END {
   # Print header line of status codes
   for(s in ss){printf("%s ",s)}
   printf("\n")
   # Print one line per company: its name, then the count for each status
   for(c in cc){
      printf("%s ",c)
      for(s in ss){printf("%d ",res[c SEP s]+0)}
      printf("\n")
   }
}'
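SEP is just a separator to fake 2-D arrays; it is never assigned here, so it is the empty string and the key is the company name immediately followed by the status.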
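As a variation, here is a minimal sketch that lets awk read the three files directly, telling them apart with FILENAME and ARGV instead of the sed prefixes, and that indexes the counts with awk's built-in SUBSEP by writing res[company, status]. The sample data, the temporary directory and the "company" header label are invented for illustration. Tab-separated output is my assumption, chosen because a status such as "Error absentSubscriber" contains a space. Companies and statuses are printed in file order rather than in the arbitrary order of for (x in array).

```shell
#!/bin/sh
# Hypothetical sample data in a temporary directory (file names follow the answer).
tmp=$(mktemp -d)
printf '%s\n' BNF_IPL UNIFUN > "$tmp/1.txt"
printf '%s\n' temp_failed CallBarred > "$tmp/2.txt"
cat > "$tmp/log.txt" <<'EOF'
t,a,b,588,5,0,111,1,1,temp_failed,BNF_IPL,x
t,a,b,588,5,0,222,1,1,temp_failed,BNF_IPL,x
t,a,b,588,5,0,333,1,1,CallBarred,UNIFUN,x
EOF

report=$(awk -F, '
    FILENAME == ARGV[1] { companies[++nc] = $0; next }   # 1.txt: company names, kept in file order
    FILENAME == ARGV[2] { statuses[++ns] = $0; next }    # 2.txt: status codes, kept in file order
    { res[$11, $10]++ }                                  # log: key is company SUBSEP status
    END {
        # Header row: a label, then every status, tab-separated.
        printf "company"
        for (j = 1; j <= ns; j++) printf "\t%s", statuses[j]
        printf "\n"
        # One row per company; pairs never seen in the log print as 0.
        for (i = 1; i <= nc; i++) {
            printf "%s", companies[i]
            for (j = 1; j <= ns; j++) printf "\t%d", res[companies[i], statuses[j]] + 0
            printf "\n"
        }
    }' "$tmp/1.txt" "$tmp/2.txt" "$tmp/log.txt")

echo "$report"
rm -rf "$tmp"
```

With the sample data above, BNF_IPL gets a count of 2 under temp_failed and UNIFUN a count of 1 under CallBarred. Because SUBSEP is a character that does not normally appear in text, a key built from one company and status cannot be confused with a key built from another pair.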
Upvotes: 1