Reputation: 325
I have 1-N files in this format:
file 1:
1 1
2 5
3 0
4 0
5 0
file 2:
1 5
2 1
3 0
4 0
5 1
I want to sum the second column across all files, line by line (keyed on the first column), so the output looks like this:
output:
1 6
2 6
3 0
4 0
5 1
Thanks a lot.
(Ideally, this would run automatically over all files that share the same name apart from a leading number, e.g. 1A.txt, 2A.txt, 3A.txt summed into one output and 1AD.txt, 2AD.txt, 3AD.txt into the next.)
Upvotes: 1
Views: 1875
Reputation: 17188
Pure Bash:
declare -a sum
for file in *A.txt; do
    # Column 1 is used as the array index, column 2 is added to the running total.
    while read -r a b; do
        ((sum[a]+=b))
    done < "$file"
done
for idx in "${!sum[@]}"; do # iterate over existing indices
    echo "$idx ${sum[$idx]}"
done
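To cover the grouping part of the question in pure Bash as well, here is a rough sketch, not a definitive implementation. It assumes Bash 4 or later (for `declare -A`), file names made of leading digits followed by the group name, and integer values. The scratch directory, the sample data for the AD group, and the `.sum` output names are made up for illustration.

```shell
#!/bin/bash
# Scratch directory with small sample files (hypothetical data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '1 1' '2 5' > 1A.txt
printf '%s\n' '1 5' '2 1' > 2A.txt
printf '%s\n' '1 2' '2 3' > 1AD.txt
printf '%s\n' '1 4' '2 4' > 2AD.txt

# Collect the distinct suffixes, i.e. the file name minus its leading digits.
declare -A seen
for file in [0-9]*.txt; do
    digits=${file%%[!0-9]*}        # leading run of digits, e.g. "1"
    suffix=${file#"$digits"}       # remainder, e.g. "A.txt"
    seen[$suffix]=1
done

for suffix in "${!seen[@]}"; do
    sum=()                         # start each group with an empty array
    for file in [0-9]*"$suffix"; do
        # Keep only names that are exactly <digits><suffix>.
        [[ ${file%%[!0-9]*}$suffix == "$file" ]] || continue
        while read -r a b; do
            ((sum[a] += b))
        done < "$file"
    done
    # Indexed arrays list their indices in ascending order.
    for idx in "${!sum[@]}"; do
        echo "$idx ${sum[idx]}"
    done > "${suffix%.txt}.sum"
done
```

This writes one file per group (here `A.sum` and `AD.sum`). Because the first column is used as an array index, it has to be a non-negative integer without leading zeros.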
Upvotes: 1
Reputation: 572
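#!/bin/bash
# Collect the distinct suffixes (the part of the name after the leading number).
# Only names starting with a digit are considered, so the *.sums.txt outputs of an earlier run are ignored.
suffixes=$(find . -name '[0-9]*.txt' | sed 's/.*[0-9][0-9]*\(.*\)\.txt/\1/' | sort -u)
for suffix in ${suffixes}; do
    # paste puts the files side by side, so the values end up in the even-numbered fields.
    paste *"${suffix}".txt | awk '{sum = 0; for (i = 2; i <= NF; i += 2) sum += $i;
        print $1" "sum}' > "${suffix}.sums.txt"
done
exit 0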
Upvotes: 1
Reputation: 785186
Something like this should work:
cat *A.txt | awk '{sums[$1] += $2;} END { for (i in sums) print i " " sums[i]; }'
cat *AD.txt | awk '{sums[$1] += $2;} END { for (i in sums) print i " " sums[i]; }'
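One caveat: awk's `for (i in sums)` visits the keys in an unspecified order, so the lines will not necessarily come out sorted by the first column. A minimal sketch that pipes the result through `sort -n`, assuming the keys are numeric. The scratch directory and the `result` variable are only there to make the example self-contained; the two files reproduce the sample data from the question.

```shell
#!/bin/bash
# Recreate the two sample files from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' '1 1' '2 5' '3 0' '4 0' '5 0' > 1A.txt
printf '%s\n' '1 5' '2 1' '3 0' '4 0' '5 1' > 2A.txt

# Sum column 2 per key, then sort numerically on column 1.
result=$(awk '{ sums[$1] += $2 } END { for (i in sums) print i, sums[i] }' *A.txt | sort -n)
echo "$result"
```

Passing the files to awk directly also makes the `cat` unnecessary.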
Upvotes: 3
Reputation: 36049
A quick summing solution can be done in awk:
{ sum[$1] += $2; }
END { for (i in sum) print i " " sum[i]; }
The easiest way to group your input files is to build a list of suffixes and then glob for each one:
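ls *.txt | sed -e 's/^[0-9]*//' | sort -u | while read -r suffix; do
    # sort -u keeps each suffix once; awk then sums column 2 per key over every matching file.
    awk '{ sum[$1] += $2; } END { for (i in sum) print i " " sum[i]; }' *"$suffix" > "${suffix}.sum"
done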
Upvotes: 2