Reputation: 2545
In the example below, I want to sum the scores for the rows where both Target and miRNA are the same. Please see below.
Target miRNA Score
NM_198900 hsa-miR-423-5p -0.244
NM_198900 hsa-miR-423-5p -0.6112
NM_1989230 hsa-miR-413-5p -0.644
NM_1989230 hsa-miR-413-5p -0.912
Desired output:
NM_198900 hsa-miR-423-5p -0.8552
NM_1989230 hsa-miR-413-5p -1.556
Upvotes: 2
Views: 72
Reputation: 207650
Like this:
awk '{x[$1 " " $2]+=$3} END{for (r in x)print r,x[r]}' file
As it reads each line, it adds the third field ($3) into an array x[], indexed by joining fields 1 and 2 with a space between them. At the end, it prints all elements of x[].
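To check this against the sample data from the question, here is a minimal sketch. The file name sample.txt is an assumption, and the NR>1 guard and the sort are additions that are not in the one-liner above: without the guard the header line is summed as its own group (with a total of 0), and awk does not guarantee the order in which for (r in x) visits the keys.

```shell
# Hypothetical sample file holding the data from the question.
printf '%s\n' \
  'Target miRNA Score' \
  'NM_198900 hsa-miR-423-5p -0.244' \
  'NM_198900 hsa-miR-423-5p -0.6112' \
  'NM_1989230 hsa-miR-413-5p -0.644' \
  'NM_1989230 hsa-miR-413-5p -0.912' > sample.txt

# NR>1 skips the header line; x[] is keyed on "Target miRNA".
# LC_ALL=C sort only makes the output order deterministic.
result=$(awk 'NR>1 {x[$1 " " $2]+=$3} END{for (r in x) print r, x[r]}' sample.txt | LC_ALL=C sort)
printf '%s\n' "$result"
```

The two printed sums should match the desired output in the question.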
Following @jaypal's suggestion, you may prefer this version, which keeps your header line (NR==1) and uses TABs as the Output Field Separator (OFS):
awk 'NR==1{OFS="\t";print;next} {x[$1 OFS $2]+=$3} END{for (r in x)print r,x[r]}' file
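If the rows should come out in the order they first appear in the input, one possible variation is sketched below. It is not part of the suggestion above: the array order[], the counter n, and the file name sample.txt are made up for illustration, and $1=$1 is added so that the header is rebuilt with TABs as well.

```shell
# Hypothetical sample file holding the data from the question.
printf '%s\n' \
  'Target miRNA Score' \
  'NM_198900 hsa-miR-423-5p -0.244' \
  'NM_198900 hsa-miR-423-5p -0.6112' \
  'NM_1989230 hsa-miR-413-5p -0.644' \
  'NM_1989230 hsa-miR-413-5p -0.912' > sample.txt

result=$(awk 'BEGIN { OFS = "\t" }
# Header: assigning $1 to itself rebuilds the line using OFS.
NR == 1 { $1 = $1; print; next }
{
    k = $1 OFS $2                    # group key: Target and miRNA
    if (!(k in x)) order[++n] = k    # remember first-seen order of keys
    x[k] += $3                       # running sum of Score per key
}
END { for (i = 1; i <= n; i++) print order[i], x[order[i]] }' sample.txt)
printf '%s\n' "$result"
```

Keeping the keys in a second, numerically indexed array avoids depending on the unspecified iteration order of for (r in x), at the cost of a little extra memory.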
Upvotes: 4