Reputation: 3
I am trying to loop through files in a directory to find an animal and its value. The command is supposed to only display the animal and total value. For example:
File1 has:
Monkey 11
Bear 4
File2 has:
Monkey 12
If I wanted the total value of monkeys then I would do:
for f in *; do
total=$(grep $animal $f | cut -d " " -f 2- | paste -sd+ | bc)
done
echo $animal $total
This would return the correct value of:
Monkey 23
However, if an animal appears only once, for example Bear, the variable total ends up empty and only the animal name is echoed:
Bear
Why is this the case and how do I fix it?
Note: I'm not allowed to use the find command.
Upvotes: 0
Views: 71
Reputation: 3985
$ head File*
==> File1 <==
Monkey 11
Bear 4
==> File2 <==
Monkey 12
==> File3 <==
Bear
Monkey
Using awk and bash array
#!/bin/bash
sumAnimals(){
awk '
{ if (NF == 1) a[$1]++; else a[$1] += $2 }
END{
for (i in a ) printf "[%s]=%d\n",i, a[i]
}
' File*
}
# storing all animals in bash array
declare -A animalsArr="( $(sumAnimals) )"
# show array content
declare -p animalsArr
# getting total from array
echo "Monkey: ${animalsArr[Monkey]}"
echo "Bear: ${animalsArr[Bear]}"
Output
declare -A animalsArr=([Bear]="5" [Monkey]="24" )
Monkey: 24
Bear: 5
Upvotes: 0
Reputation: 35461
Comments on OP's question about why the code behaves as it does:
total is reset on each pass through the loop, so ...
total will have the count from the 'last' file processed
for Bear the 'last' file processed is File2, and since File2 does not contain any entries for Bear we get total='', which is what's printed by the echo
if the Bear entry is moved from File1 to File2 then OP's code should print Bear 4 (File2 being the 'last' file processed in this case)
OP's current code generates the following:
# Monkey
Monkey 12 # from File2
# Bear
Bear # no match in File2
I'd probably opt for replacing the whole grep/cut/paste/bc pipeline (4x subprocesses) with a single awk call (1x subprocess), reporting 0 when there are no matches:
for animal in Monkey Bear Hippo
do
total=$(awk -v a="${animal}" '$1==a {sum+=$2} END {print sum+0}' *)
echo "${animal} ${total}"
done
This generates:
Monkey 23
Bear 4
Hippo 0
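As a variation on the loop above, a single awk pass can total every animal at once and hand the animal/sum pairs to a loop. This is only a minimal sketch: the helper name all_totals and the temporary sample files are assumptions made for illustration, not part of OP's setup.

```shell
#!/bin/bash
# Hypothetical sample data mirroring File1 and File2 from the question.
dir=$(mktemp -d)
printf 'Monkey 11\nBear 4\n' > "$dir/File1"
printf 'Monkey 12\n' > "$dir/File2"

# all_totals (hypothetical helper): one awk pass over all files given as
# arguments, keeping a running sum per animal name in the array "sum".
# Lines without a value field (NF < 2) are ignored.
all_totals() {
    awk 'NF >= 2 { sum[$1] += $2 }
         END { for (name in sum) print name, sum[name] }' "$@" | sort
}

# Feed the animal/sum pairs to a loop, one pair per iteration.
all_totals "$dir"/File* | while read -r animal total; do
    echo "${animal} ${total}"
done
# prints:
# Bear 4
# Monkey 23
```

Unlike the per-animal loop, this only reports animals that actually occur in the files, so an absent animal such as Hippo produces no line at all rather than a 0.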
NOTES:
the loop uses echo to print the count to stdout, hence the need for the total variable; otherwise we could eliminate the total variable and have awk print the animal/sum pair directly to stdout
a single awk call could process all of the animals at once; the objective being to have awk generate the entire set of animal/sum pairs that could then be fed to the looping construct; if this is the case, and OP has some issues implementing a single awk solution, a new question should be asked
Upvotes: 0
Reputation: 247210
With just bash:
declare -A animals=()
for f in *; do
while read -r animal value; do
(( animals[$animal] = ${animals[$animal]:-0} + value ))
done < "$f"
done
declare -p animals
outputs
declare -A animals=([Monkey]="23" [Bear]="4" )
With this approach, you have the totals for all the animals after processing each file exactly once.
Upvotes: 0
Reputation: 142005
Why is this the case
When a file has no matching line, grep outputs nothing, so nothing is propagated through the pipe and an empty string is assigned to total.
Because total is reset on every loop iteration (total=anything without referencing the previous value), it just has the value for the last file.
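The effect can be made visible by printing what the first two stages of the pipeline produce for each file. This is a minimal sketch that leaves out the paste and bc stages for brevity; the helper name show_matches and the temporary sample files are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical sample data mirroring File1 and File2 from the question.
dir=$(mktemp -d)
printf 'Monkey 11\nBear 4\n' > "$dir/File1"
printf 'Monkey 12\n' > "$dir/File2"

# show_matches (hypothetical helper): for each file, print the value that
# grep | cut extracts for the given animal, wrapped in brackets so an
# empty result is easy to spot.
show_matches() {
    local animal=$1 f matched
    shift
    for f in "$@"; do
        matched=$(grep "$animal" "$f" | cut -d " " -f 2-)
        printf '%s -> [%s]\n' "${f##*/}" "$matched"
    done
}

show_matches Bear "$dir"/File*
# prints:
# File1 -> [4]
# File2 -> []
```

File2 is processed last and yields nothing for Bear, and that empty result is what the plain assignment leaves in total.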
how do I fix it?
Do not try to do everything at once; do less in each step.
total=0
for f in *; do
count=$(grep "$animal" "$f" | cut -d " " -f 2-)
total=$((total + count)) # reuse total, reference previous value
done
echo "$animal" "$total"
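One caveat with the loop above: if a single file contains several matching lines, count holds several numbers and the arithmetic expansion fails. A minimal sketch that adds the values one at a time could look like the following; the helper name sum_animal, the use of grep -w (whole-word match) and cut -s (skip lines without a value), and the extra File3 are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical sample data: File1 and File2 from the question, plus a
# made-up File3 that lists the same animal twice.
dir=$(mktemp -d)
printf 'Monkey 11\nBear 4\n' > "$dir/File1"
printf 'Monkey 12\n' > "$dir/File2"
printf 'Monkey 1\nMonkey 2\n' > "$dir/File3"

# sum_animal (hypothetical helper): first argument is the animal, the
# remaining arguments are the files to scan.
sum_animal() {
    local animal=$1 total=0 f value
    shift
    for f in "$@"; do
        # grep -w matches whole words only; cut -s drops lines that have
        # no space-separated value. The unquoted substitution is split
        # into one word per matching line, so each value is added alone.
        for value in $(grep -w -- "$animal" "$f" | cut -s -d " " -f 2); do
            total=$((total + value))
        done
    done
    echo "$animal $total"
}

sum_animal Monkey "$dir"/File*   # prints: Monkey 26
sum_animal Bear "$dir"/File*     # prints: Bear 4
sum_animal Hippo "$dir"/File*    # prints: Hippo 0
```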
A programmer fluent in shell will most probably jump to AWK for such problems. Remember to check your scripts with shellcheck.
With what you were trying to do, you could do all files at once:
total=$(
{
echo 0 # to have at least nice 0 if animal is not found
grep "$animal" * |
cut -d " " -f 2-
} |
paste -sd+ |
bc
)
Upvotes: 0
Reputation: 17290
You could use this little awk instead of for, grep, cut, paste, and bc:
awk -v animal="Bear" '
$1 == animal { count += $2 }
END { print count + 0 }
' *
Upvotes: 1