Reputation: 1096
I have a big file (many columns) that generally looks like:
Gene,A,B,C
Gnai3,2,3,4
P53,5,6,7
H19,4,4,4
I want to sum every row of the data frame and add it as a new column as below:
Gene,A,B,C,total
Gnai3,2,3,4,9
P53,5,6,7,18
H19,4,4,4,12
I tried awk -F, '{sum=0; for(i=1; i<=NF; i++) sum += $i; print sum}'
but then I am not able to make a new column for the total counts.
Any help would be appreciated.
Upvotes: 0
Views: 1656
Reputation: 10123
Using only bash:
#!/bin/bash
while read -r row; do
sum=
if [[ $row =~ (,[0-9]+)+ ]]; then
numlist=${BASH_REMATCH[0]}
sum=,$((${numlist//,/+}))
fi
echo "$row$sum"
done < datafile
There are a few assumptions here about rows in the data file: Numeric fields to be summed up are non-negative integers and the first field is not a numeric field (it will not participate in the sum even if it is a numeric field). Also, the numeric fields are consecutive, that is, there is no a non numeric field between two numeric fields. And, the sum won't overflow.
Upvotes: 1
Reputation: 133458
Could you please try following.
awk '
BEGIN{
FS=OFS=","
}
FNR==1{
print $0,"total"
next
}
{
for(j=2;j<=NF;j++)
$(NF+1)+=$j
}
$1=$1
}
1
' Input_file
2nd solution: adding solution as per OP's comment to print first column and sum only.
BEGIN{
FS=OFS=","
}
FNR==1{
print $0,"total"
next
}
{
for(j=2;j<=NF;j++)
sum+=$j
}
print $1,sum
sum=""
}
' Input_file
Upvotes: 4
Reputation: 246764
Can use perl here:
perl -MList::Util=sum0 -F, -lane '
print $_, ",", ($. == 1 ? "total" : sum0( @F[1..$#F] ));
' file
Upvotes: 3
Reputation: 140960
To add a new column, just increment number of columns and assign the new column a value:
NF++; $NF=sum
do:
awk -v OFS=, -F, 'NR>1{sum=0; for(i=1; i<=NF; i++) sum += $i; NF++; $NF=sum } 1'
Upvotes: 2