I have an instrumented log file in which each first-column value (ID) is duplicated across six lines, as below.
//SC001@1/1/1@1/1,get,ClientStart,1363178707755
//SC001@1/1/1@1/1,get,TalkToSocketStart,1363178707760
//SC001@1/1/1@1/1,get,DecodeRequest,1363178707765
//SC001@1/1/1@1/1,get-reply,EncodeReponse,1363178707767
//SC001@1/1/1@1/2,get,DecodeRequest,1363178708765
//SC001@1/1/1@1/2,get-reply,EncodeReponse,1363178708767
//SC001@1/1/1@1/2,get,TalkToSocketEnd,1363178708770
//SC001@1/1/1@1/2,get,ClientEnd,1363178708775
//SC001@1/1/1@1/1,get,TalkToSocketEnd,1363178707770
//SC001@1/1/1@1/1,get,ClientEnd,1363178707775
//SC001@1/1/1@1/2,get,ClientStart,1363178708755
//SC001@1/1/1@1/2,get,TalkToSocketStart,1363178708760
Note: the comma (,) is the delimiter here.
Likewise, there are many duplicated first-column values (IDs) in the log file (the example above has only two: //SC001@1/1/1@1/1 and //SC001@1/1/1@1/2). I need to consolidate the log records into the format below.
ID,ClientStart,TalkToSocketStart,DecodeRequest,EncodeReponse,TalkToSocketEnd,ClientEnd
//SC001@1/1/1@1/1,1363178707755,1363178707760,1363178707765,1363178707767,1363178707770,1363178707775
//SC001@1/1/1@1/2,1363178708755,1363178708760,1363178708765,1363178708767,1363178708770,1363178708775
I suppose I need a bash script for this exercise and would appreciate expert support with it. I hope there is a sed or awk solution, which would be more efficient.
Thanks much
One way:
sort -t, -k4n,4 file | awk -F, '{a[$1]=a[$1]?a[$1] FS $NF:$NF;}END{for(i in a){print i","a[i];}}'
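For readability, the same pipeline can be written with the awk body expanded (behavior unchanged; file is a placeholder for your log file name):

sort -t, -k4n,4 file | awk -F, '
{
    # Append the timestamp (last field) to the entry for this ID ($1),
    # joining the values with FS, i.e. a comma.
    a[$1] = a[$1] ? a[$1] FS $NF : $NF
}
END {
    # Emit one consolidated line per ID. Note that "for (i in a)"
    # visits the keys in no guaranteed order.
    for (i in a)
        print i "," a[i]
}'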
The sort command orders the file numerically on the last (4th) column, the timestamp, so the events for each ID come out in chronological order. awk then reads the sorted input and builds an array keyed on the 1st field (the ID), appending each line's last field (the timestamp) to that key's value; the END block prints every ID followed by its accumulated timestamps.
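Note that this prints only the data rows. If you also want the header line from your desired output, one way (assuming the fixed set of six event names shown in the question; consolidated.csv is a placeholder output name) is to prepend it:

{
  echo 'ID,ClientStart,TalkToSocketStart,DecodeRequest,EncodeReponse,TalkToSocketEnd,ClientEnd'
  sort -t, -k4n,4 file | awk -F, '{a[$1]=a[$1]?a[$1] FS $NF:$NF} END{for(i in a) print i","a[i]}'
} > consolidated.csv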