Severian

Reputation: 325

summing second columns of all files in bash

I have N files (numbered 1 to N), all in this format:

file 1:

1 1
2 5
3 0
4 0
5 0

file 2:

1 5
2 1
3 0
4 0
5 1

As output, I want the sum of the second column across all files, keyed by the first column, so the result looks like this:

output:

1 6
2 6
3 0
4 0
5 1

Thanks a lot.

(Ideally, I would like to do this automatically for every group of files that share the same name but start with a different number, e.g. 1A.txt, 2A.txt, 3A.txt into one output and 1AD.txt, 2AD.txt, 3AD.txt into the next output.)

Upvotes: 1

Views: 1875

Answers (4)

Fritz G. Mehner

Reputation: 17188

Pure Bash:

declare -a sum
for file in *A.txt; do
  while read -r a b; do
    [[ $a ]] || continue          # skip blank lines
    ((sum[a]+=b))                 # first column is the array index
  done < "$file"
done

for idx in "${!sum[@]}"; do     # iterate over existing indices
  echo "$idx ${sum[$idx]}"
done

Upvotes: 1

Nick Atoms

Reputation: 572

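#!/bin/bash

# Collect the distinct suffixes (e.g. "A", "AD") of files named <number><suffix>.txt.
# Matching only names that start with a digit keeps the generated *.sums.txt
# files out of the list when the script is run again.
suffixes=$(find . -name '[0-9]*.txt' | sed 's/^.*\/[0-9][0-9]*\(.*\)\.txt$/\1/' | sort -u)

for suffix in ${suffixes}; do
  # paste joins the files line by line, so every even-numbered field is a value
  # column; this relies on all files listing the same keys in the same order.
  paste [0-9]*"${suffix}".txt | awk '{sum = 0; for (i = 2; i <= NF; i += 2) sum += $i;
                               print $1" "sum}' > "${suffix}.sums.txt"
done

exit 0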

Upvotes: 1

anubhava

Reputation: 785186

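Something like this should work (awk reads the files directly, and sort -n restores numeric key order, which for (i in sums) does not guarantee):

awk '{sums[$1] += $2;} END { for (i in sums) print i " " sums[i]; }' *A.txt | sort -n

awk '{sums[$1] += $2;} END { for (i in sums) print i " " sums[i]; }' *AD.txt | sort -n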
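
As a hedged sketch, the same awk idea can be wrapped in a small shell function so it can be reused for any file group. The function name sum_second_column is hypothetical, and the NF >= 2 guard (to skip blank or short lines) is an assumption about the input, not something stated in the question.

```shell
#!/bin/bash
# sum_second_column FILE...: hypothetical helper name, not from the answer above.
# Adds up column 2 per key in column 1 across all given files, then sorts
# by key because the order of awk's "for (key in sums)" is unspecified.
sum_second_column() {
  awk 'NF >= 2 { sums[$1] += $2 }
       END { for (key in sums) print key, sums[key] }' "$@" | sort -n
}
```

It could then be called as sum_second_column *A.txt > A.sum. With no file arguments, awk reads standard input instead.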

Upvotes: 3

thiton

Reputation: 36049

The summing itself is quick to do in awk:

{ sum[$1] += $2; }
END { for (i in sum) print i " " sum[i]; }

The easiest way to group your input files is to build a list of unique suffixes and then glob for each one:

ls *.txt | sed -e 's/^[0-9]*//' | sort -u | while read -r suffix; do
   awk '{ sum[$1] += $2; } END { for (i in sum) print i " " sum[i]; }' *"$suffix" > "${suffix}.sum"
done
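
A plain glob such as *D.txt also matches 1AD.txt, so groups can bleed into each other when one suffix is the tail of another. The following is a minimal sketch of one way around that, assuming file names of the form <digits><suffix>.txt. The helper names suffix_of and sum_by_suffix and the <suffix>.sum output naming are hypothetical choices for illustration.

```shell
#!/bin/bash
# Hypothetical helpers; the names are illustrative and not from the answer above.

# suffix_of PATH: print the file name without directory, leading digits and ".txt".
suffix_of() {
  local name
  name=${1##*/}
  name=${name%.txt}
  printf '%s\n' "${name#"${name%%[!0-9]*}"}"
}

# sum_by_suffix [DIR]: for each group of DIR/<digits><suffix>.txt files,
# write the per-key sums of column 2 to DIR/<suffix>.sum.
sum_by_suffix() {
  local dir file suffix
  dir=${1:-.}
  for file in "$dir"/[0-9]*.txt; do
    [ -e "$file" ] && suffix_of "$file"
  done | sort -u | while IFS= read -r suffix; do
    [ -n "$suffix" ] || continue     # ignore names that are digits only
    set --                           # collect this group's files in "$@"
    for file in "$dir"/[0-9]*"$suffix".txt; do
      # The glob alone would also put 1AD.txt into group "D", so compare exactly.
      [ "$(suffix_of "$file")" = "$suffix" ] && set -- "$@" "$file"
    done
    awk 'NF >= 2 { sums[$1] += $2 }
         END { for (key in sums) print key, sums[key] }' "$@" |
      sort -n > "$dir/$suffix.sum"
  done
}
```

Running sum_by_suffix . in the directory from the question would be expected to produce A.sum and AD.sum. Because the outputs do not end in .txt, running it a second time does not pick them up as inputs.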

Upvotes: 2
