Reputation: 341
I need help with the following:
Input file:
abc message=sent session:111,x,y,z
pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z
pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z
abc message=sent session:589,x,y,z
pqr message=receive session:589,4,5,7
Output file:
abc message=sent session:111,x,y,z, pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z, pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z, NOMATCH
abc message=sent session:589,x,y,z, pqr message=receive session:589,4,5,7
Notes:
In the source file, every "sent" message has a matching "receive" message,
except session 342, which has no receive.
The session number is unknown in advance, so it can't be hardcoded.
So merge only those sent and receive lines that have a matching session number.
Upvotes: 3
Views: 608
Reputation: 16974
Another way:
awk -F "[:,]" '/=sent/{a[$2]=$0;}/=receive/{print a[$2], $0;delete a[$2];}END{for(i in a)print a[i],"NO MATCH";}' file
Results:
abc message=sent session:111,x,y,z pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z pqr message=receive session:123,4,5,7
abc message=sent session:589,x,y,z pqr message=receive session:589,4,5,7
abc message=sent session:342,x,y,z NO MATCH
When a sent record is encountered, it is stored in the array with the session id as the index. When a receive record is encountered, the sent record is fetched from the array and printed along with the receive record. Sent records are also removed from the array as their receive records arrive. At the END, all records remaining in the array are printed with NO MATCH.
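For what it's worth, here is a minimal sketch of the same array-by-session-id idea, adjusted to produce the comma-separated NOMATCH format from the question and to keep unmatched sent lines in input order. The function name merge_sessions and the array names sent, recv and order are my own, and it assumes each session id appears on at most one sent line:

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): pairs sent/receive lines by session id.
# Assumes each session id occurs on at most one "sent" line.
merge_sessions() {
  awk -F '[:,]' '
    # $2 is the session id because fields are split on ":" and ","
    /=sent/    { sent[$2] = $0; order[++n] = $2 }   # store sent line, remember arrival order
    /=receive/ { if ($2 in sent) recv[$2] = $0 }    # keep a receive only if its sent line exists
    END {
      for (k = 1; k <= n; k++) {
        id = order[k]
        print sent[id] ", " ((id in recv) ? recv[id] : "NOMATCH")
      }
    }
  ' "$@"
}
```

Run it as merge_sessions file. Unlike the one-liner above, it holds everything until END, so it trades a bit of memory for output in the original order.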
Upvotes: 1
Reputation: 54392
Here's one way using awk. Run like:
awk -f script.awk file
Contents of script.awk:
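{
    x = $0                                   # keep the original line
    gsub(/[^:]*:|,.*/,"")                    # reduce $0 to the session id
    a[$0] = (a[$0] ? a[$0] "," FS : "") x    # append the line to that session's entry
    b[$0]++                                  # count lines seen for the session
}
END {
    for (i in a) {
        # a session with two lines is a sent/receive pair; otherwise flag it
        print (b[i] == 2 ? a[i] : a[i] "," FS "NOMATCH") | "sort"
    }
}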
Results:
abc message=sent session:111,x,y,z, pqr message=receive session:111,4,5,7
abc message=sent session:123,x,y,z, pqr message=receive session:123,4,5,7
abc message=sent session:342,x,y,z, NOMATCH
abc message=sent session:589,x,y,z, pqr message=receive session:589,4,5,7
Alternatively, here's the one-liner:
awk '{ x = $0; gsub(/[^:]*:|,.*/,""); a[$0] = (a[$0] ? a[$0] "," FS : "") x; b[$0]++ } END { for (i in a) print (b[i] == 2 ? a[i] : a[i] "," FS "NOMATCH") | "sort" }' file
Note that you can drop the pipe to sort if you don't care about sorted output. HTH.
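One thing to keep in mind about the sort step: a plain sort compares whole lines as text, so session 10 would come before session 9. If the ids can differ in length, a rough sketch like the one below moves the sort outside awk and sorts numerically on the id. The wrapper name merge_sorted is made up for illustration, and LC_ALL=C is my assumption to keep the comma after the id from being read as a thousands separator:

```shell
#!/bin/sh
# Hypothetical wrapper (name is illustrative): same pairing logic as the one-liner,
# but sorted numerically on the session id that follows the first ":".
merge_sorted() {
  awk '{ x = $0; gsub(/[^:]*:|,.*/, ""); a[$0] = (a[$0] ? a[$0] "," FS : "") x; b[$0]++ }
       END { for (i in a) print (b[i] == 2 ? a[i] : a[i] "," FS "NOMATCH") }' "$@" |
    LC_ALL=C sort -t: -k2,2n   # field 2 starts with the session id; -n reads its leading digits
}
```

Run it as merge_sorted file.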
Upvotes: 1