Reputation: 136
I have a file (sorted) that looks like this:
alma-court, 1, 0, 8
alma-court, 4, 2, 24
atlantic-house, 99, 1, 340
diamond, 2, 2, 16
iminds-zuiderpoort, 0, 1, 0
north-plaza, 18, 3, 718
north-plaza, 90, 2, 19
I'd like to make a vertical total for each identical first column.
So for alma-court, i'd like to get the total of 1+4, 0+2 and 8+24. atlantic-house is already fine, as it's the only occurrence. And north-plaza should be 18+90, 3+2 and 718+19.
And printing everything like this:
alma-court, 5, 2, 32
atlantic-house, 99, 1, 340
...
Could someone give me the solution using Bash?
Thanks.
Upvotes: 0
Views: 165
Reputation: 5492
groups=( $( cut -d ',' -f 1 count.txt | sort -u ) )
for group in "${groups[@]}"
do
grep $group count.txt | awk '{ sum1+=$2; sum2+=$3; sum3+=$4;} END {print $1 " " sum1 ", " sum2 ", " sum3}'
done
anew@buddha:~/dev/so$ bash so.sh
alma-court, 5, 2, 32
atlantic-house, 99, 1, 340
diamond, 2, 2, 16
iminds-zuiderpoort, 0, 1, 0
north-plaza, 108, 5, 737
Upvotes: 0
Reputation: 4903
You can use this as the basis of a bash script
set -u
tr -d "," < data2 | while read name col1 col2 col3 ; do
echo name=$name col1=$col1 col2=$col2 col3=$col3
done
Where data2
is your sorted file.
It gives out the following and from there you should be able to detect changes in 'name' and do the math.
name=alma-court col1=1 col2=0 col3=8
name=alma-court col1=4 col2=2 col3=24
name=atlantic-house col1=99 col2=1 col3=340
name=diamond col1=2 col2=2 col3=16
name=iminds-zuiderpoort col1=0 col2=1 col3=0
name=north-plaza col1=18 col2=3 col3=718
name=north-plaza col1=90 col2=2 col3=19
These things are usually done better in Perl/Python/Awk.
use strict;
my %names;
while(<>) {
my @F = split(',');
my $name = shift @F;
foreach my $x (0..$#F) {
$names{$name}[$x] += $F[$x];
}
}
foreach my $key ( sort keys %names ){
print $key, " ", join(" ", @{$names{$key}}), "\n";
}
Using as perl myperl.pl < yourdata
gives
alma-court 5 2 32
atlantic-house 99 1 340
diamond 2 2 16
iminds-zuiderpoort 0 1 0
north-plaza 108 5 737
Upvotes: 2
Reputation: 754110
Use awk
(unless you know or want to learn Perl or Python):
awk '{ sum[$1,0] += $2; sum[$1,1] += $3; sum[$1,2] += $4; names[$1] = 1; }
END { for (name in names)
printf "%s %d, %d, %d\n", name, sum[name,0], sum[name,1], sum[name,2];
}' <<EOF
alma-court, 1, 0, 8
alma-court, 4, 2, 24
atlantic-house, 99, 1, 340
diamond, 2, 2, 16
iminds-zuiderpoort, 0, 1, 0
north-plaza, 18, 3, 718
north-plaza, 90, 2, 19
EOF
Output:
iminds-zuiderpoort, 0, 1, 0
alma-court, 5, 2, 32
north-plaza, 108, 5, 737
atlantic-house, 99, 1, 340
diamond, 2, 2, 16
If you want the names in a specific order, sort the output. Note that the name includes the trailing comma, so the print format doesn't add a comma after the name.
Pure bash
(4.x) implementation
{
declare -A sum
declare -A names
IFS=,
while read name v1 v2 v3
do
names[$name]="$name"
((sum[${name}1] += $v1))
((sum[${name}2] += $v2))
((sum[${name}3] += $v3))
done
for name in "${names[@]}"
do
printf "%s, %d, %d, %d\n" $name ${sum[${name}1]} ${sum[${name}2]} ${sum[${name}3]}
done
} <<EOF
alma-court, 1, 0, 8
alma-court, 4, 2, 24
atlantic-house, 99, 1, 340
diamond, 2, 2, 16
iminds-zuiderpoort, 0, 1, 0
north-plaza, 18, 3, 718
north-plaza, 90, 2, 19
EOF
Output:
diamond, 2, 2, 16
atlantic-house, 99, 1, 340
north-plaza, 108, 5, 737
alma-court, 5, 2, 32
iminds-zuiderpoort, 0, 1, 0
Fortunately, the output of the awk
and the bash
scripts are the same, give or take the sort order of the data.
The bash
script is using associative arrays, which are a feature of bash
4.x that is not present in bash
3.x.
Upvotes: 1