Reputation: 3409
I have a file which contains several thousand numbers, each on its own line:
34
42
11
6
2
99
...
I'm looking to write a script which will print the sum of all numbers in the file. I've got a solution, but it's not very efficient. (It takes several minutes to run.) I'm looking for a more efficient solution. Any suggestions?
Upvotes: 269
Views: 231999
Reputation: 2807
UPDATE: benchmarking gnu-parallel with --pipe-part over a pre-made input file:
(parallel --pipe-part --argfile "${DT}/temptestpipepartinput.txt" | gpaste )
Exactly like the command above: 61.57s user 76.92s system 424% cpu 32.609 total
-j 2     27.883 total
-j 4     21.850 total
-j 6     21.221 total   <- minimum (didn't check 5 or 7)
-j 8     25.133 total
-j 10    30.734 total
-j 12    36.279 total
Using the pre-made file:
mawk 1.9.9.6 :: 6.953 secs using its own file I/O, and 7.128 secs piped in.
perl 5.36.1  :: 8.786 secs using its own file I/O, and 8.925 secs piped in.
python 3.11.5 :: here's the strange beast - apparently summing via int(_) instead of float(_) carries a 17.98% slowdown penalty:
8.468 secs   python3 -c 'import sys; print(int(sum((float(_) for _ in sys.stdin))))'
9.991 secs   python3 -c 'import sys; print(int(sum(( int(_) for _ in sys.stdin))))'
Side note: this set of integers created a file with perfect digit uniformity according to gnu-wc (lines, words, bytes):
99,999,999 888,888,888 888,888,888
A perfect chain of eight 9s for the row count, and a chain of nine 8s for the byte count. The digits-only count, after backing out the 99,999,999 \n(ewlines) from the byte count, is
788,888,889
In awk, printing a second column with the running sum as you go takes far less syntax than saving the total for the END block (a toy demonstration of the two variants follows the output below):
jot 20 61111111889 - 799973766543 |
mawk '$2=_+=$1'    # skips rows where the assigned value is zero (0)
gawk '($2=_+=$1)_' # no rows left behind
61111111889 61111111889
861084878432 922195990321
1661058644975 2583254635296
2461032411518 5044287046814
3261006178061 8305293224875
4060979944604 12366273169479
4860953711147 17227226880626
5660927477690 22888154358316
6460901244233 29349055602549
7260875010776 36609930613325
8060848777319 44670779390644
8860822543862 53531601934506
9660796310405 63192398244911
10460770076948 73653168321859
11260743843491 84913912165350
12060717610034 96974629775384
12860691376577 109835321151961
13660665143120 123495986295081
14460638909663 137956625204744
15260612676206 153217237880950
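The difference between the two variants only shows up on rows where the assigned value is zero; here is an illustrative toy check, using gawk for both so the concatenation form parses:
printf '0\n5\n' | gawk '$2=_+=$1'       # prints only "5 5" - the assignment evaluates to 0, which is falsy
printf '0\n5\n' | gawk '($2=_+=$1)_'    # prints "0 0" and "5 5" - concatenating _ yields a non-empty string, always truthy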
For all practical purposes, perl5, python3, and mawk2 are tied for speed summing up from 1 to 99,999,999 ::
echo '99999999' | mawk2 '$++NF = (__=+$++_)*++__/++_'
99999999 4999999950000000
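Worth noting: that one-liner doesn't iterate at all - it evaluates the closed form n(n+1)/2 for n = 99,999,999. Plain shell arithmetic (64-bit, as in bash) confirms the expected total:
echo $(( 99999999 * (99999999 + 1) / 2 ))    # 4999999950000000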
(All input digits were re-generated on the fly and piped in to eliminate any potential cache access advantage):
in0: 847MiB 0:00:10 [81.1MiB/s] [81.1MiB/s] [ <=> ]
1 4999999950000000
(python3 -c 'import sys; print(int(sum((float(_) for _ in sys.stdin))))')
19.14s user 0.55s system 188% cpu 10.473 total
gcat -b 0.00s user 0.00s system 0% cpu 10.473 total
in0: 847MiB 0:00:10 [81.0MiB/s] [81.0MiB/s] [ <=> ]
1 4999999950000000
(perl536 -nle '$sum += $_ } END { print $sum')
19.37s user 0.55s system 190% cpu 10.472 total
gcat -b 0.00s user 0.00s system 0% cpu 10.472 total
in0: 847MiB 0:00:10 [81.1MiB/s] [81.1MiB/s] [ <=>]
1 4999999950000000
(mawk1996 '{ _+=$__ } END { print _ }')
17.51s user 0.57s system 172% cpu 10.463 total
gcat -b 0.00s user 0.00s system 0% cpu 10.463 total
However, once you eliminate the pipe and hashing speed factors and ask each to generate and sum the sequence internally, perl5.36 is some 52% slower (3.015s vs 1.981s below):
( time (
mawk2 'BEGIN { for(___=_-=_=__=((_+=++_)+(_*=_+_))^_; ++_<__;)___+=_
print ___ }'
) | gcat -b ) | lgp3 ;
( time (
perl5 -e '$y = $x = 0; $z = 10**8; while(++$x < $z) { $y += $x } print $y'
) | gcat -b ) | lgp3 ;
1 4999999950000000
( mawk2 ; ) 1.97s user 0.01s system 99% cpu 1.981 total
gcat -b 0.00s user 0.00s system 0% cpu 1.979 total
( perl5 -e '$y = $x = 0; $z = 10**8; while(++$x < $z) { $y += $x } print $y'; 2.98s user 0.03s system 99% cpu 3.015 total
gcat -b 0.00s user 0.00s system 0% cpu 3.014 total
1 4999999950000000
As for gnu-parallel, it's more than half an order of magnitude slower: 36 concurrent jobs with 5,000,000 rows per job and a very generous 100 MB block-size upper cap, running on an M1 Max with 64 GB of RAM, and it still took nearly 53 seconds, compared to about 10.5 secs for the other three.
( time ( mawk2 'BEGIN { for(_-=_=__=((_+=++_)+(_*=_+_))^_; ++_ < __; ) print _ }' |
pvE0 |
parallel --block 100M -N 5000000 -j 36 --pipe "gpaste -sd+ - | bc" | gpaste -sd+ - | bc
) | gcat -b ) | lgp3 | lgp3 -1;
in0: 847MiB 0:00:47 [17.8MiB/s] [17.8MiB/s] [ <=> ]
1 4999999950000000
0.00s user 0.00s system 0% cpu 52.895 total
======================
Reference code for massively loop-unrolled summation (this variant handles 512 numbers per round of the while() loop):
( gawk -p- -be "${DT}/temptestpipepartinput.txt"; )
8.50s user 1.46s system 99% cpu 9.970 total
1 4999999950000000
2 # gawk profile, created Sat Oct 21 04:25:20 2023
3 # BEGIN rule(s)
4 BEGIN {
5 1 CONVFMT="%.250g"
6 1 FS=RS
7 1 RS="^$"
8 }
9 # END rule(s)
10 END {
11 1 print ______()
12 }
13 # Functions, listed alphabetically
14 1 function ______(_, __, ___)
15 {
16 1 ___=(__=_=_<_)+NF
17 196079 while (_<___)
18 __ += $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
19 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
20 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
21 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
22 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
23 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
24 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
25 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
26 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
27 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
28 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
29 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
30 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
31 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
32 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
33 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
34 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
35 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
36 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
37 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
38 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
39 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
40 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
41 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
42 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
43 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
44 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
45 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
46 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
47 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
48 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
49 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
50 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
51 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
52 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
53 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
54 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
55 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
56 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
57 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
58 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
59 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
60 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
61 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
62 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
63 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
64 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
65 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
66 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
67 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
68 + $++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_+$++_
69 + $++_+$++_
71 return __
73 }
Upvotes: 0
Reputation: 12749
C always wins for speed:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
    char *line = NULL;
    size_t len = 0;
    ssize_t read;
    double sum = 0.0;
    /* note the parentheses around the assignment: without them, read would
       get the result of the != comparison, not getline()'s byte count */
    while ((read = getline(&line, &len, stdin)) != -1) {
        sum += atof(line);  /* parse the leading number on each line */
    }
    free(line);
    printf("%f\n", sum);
    return 0;
}
Timing for 1M numbers (same machine/input as my python answer):
$ gcc sum.c -o sum && time ./sum < numbers
5003371677.000000
real 0m0.188s
user 0m0.180s
sys 0m0.000s
Upvotes: 5
Reputation: 34120
say sum lines
~$ raku -e '.say for 0..1000000' > test.in
~$ raku -e 'say sum lines' < test.in
500000500000
The way this works is that lines produces a sequence of strings, one per input line. sum takes that sequence, turns each line into a number, and adds them together. All that is left is for say to print out that value followed by a newline. (It could have been print or put, but say is more alliterative.)
Upvotes: 9
Reputation: 31
As long as the file contains only integers, I basically translate it into a bash math expression and execute it. This is similar to the 'bc' solution further above, but faster. Note that the zero at the end of the inner expression is needed to complete the dangling '+' left after the final line. I have tested it with 475,000 lines and it takes less than a second.
echo $(($(cat filename | tr '\n' '+')0))
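To see why the trailing zero matters: tr leaves a dangling + after the last number, and the 0 completes the expression. A quick illustration using an inline three-number stream instead of a file:
printf '3\n4\n5\n' | tr '\n' '+'                   # emits "3+4+5+" - an incomplete expression
echo $(( $(printf '3\n4\n5\n' | tr '\n' '+')0 ))   # evaluates 3+4+5+0 and prints 12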
Upvotes: 2
Reputation: 9893
I prefer to use GNU datamash for such tasks because it's more succinct and legible than perl or awk. For example
datamash sum 1 < myfile
where 1 denotes the first column of data.
Upvotes: 11
Reputation: 123448
None of the solutions thus far use paste. Here's one:
paste -sd+ filename | bc
If the file ends with an extra blank line, the resulting trailing + will incur a syntax error. Fix the error by removing the trailing +:
paste -sd+ filename | sed 's/+$//' | bc
As an example, calculate Σn where 1<=n<=100000:
$ seq 100000 | paste -sd+ | bc -l
5000050000
(For the curious, seq n prints the sequence of numbers from 1 to n, given a positive number n.)
Upvotes: 145
Reputation: 7253
Bash variant
raw=$(cat file)
echo $(( ${raw//$'\n'/+} ))
$ wc -l file
10000 file
$ time ./test
323390
real 0m3,096s
user 0m3,095s
sys 0m0,000s
What is happening here? We read the content of the file into the $raw variable, then build a math statement from that variable by replacing every newline with '+'.
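If the ${raw//.../+} expansion syntax is unfamiliar, here is a minimal standalone demo of the same substitution (the variable name is arbitrary):
raw=$'1\n2\n3'
echo "${raw//$'\n'/+}"         # prints 1+2+3
echo $(( ${raw//$'\n'/+} ))    # prints 6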
Upvotes: 2
Reputation: 1807
I couldn't just pass by... Here's my Haskell one-liner. It's actually quite readable:
sum <$> (read <$>) <$> lines <$> getContents
Unfortunately there's no ghci -e to just run it, so it needs a main function, print, and compilation.
main = (sum <$> (read <$>) <$> lines <$> getContents) >>= print
To clarify: we read the entire input (getContents), split it by lines, read each line as a number, and sum them. <$> is the fmap operator - we use it instead of the usual function application because all of this happens inside IO. read needs an additional fmap because it has to apply to each element of the list.
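That said, depending on your GHC version, ghc -e may be able to evaluate the IO expression directly and spare the compile step. An untested sketch, with an explicit annotation so read isn't ambiguous:
ghc -e 'getContents >>= print . sum . map (read :: String -> Integer) . lines'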
$ ghc sum.hs
[1 of 1] Compiling Main ( sum.hs, sum.o )
Linking sum ...
$ ./sum
1
2
4
^D
7
Here's a strange upgrade to make it work with floats:
main = ((0.0 + ) <$> sum <$> (read <$>) <$> lines <$> getContents) >>= print
$ ./sum
1.3
2.1
4.2
^D
7.6000000000000005
Upvotes: 5
Reputation: 211
GNU Parallel can presumably be used to improve many of the above answers by spreading the workload across multiple cores.
In the example below we send chunks of 500 numbers (--max-lines=500) to bc processes which are executed in parallel, 4 at a time (-j 4). The results are then aggregated by a final bc.
time parallel --max-lines=500 -j 4 --pipe "paste -sd+ - | bc" < random_numbers | paste -sd+ - | bc
The optimal choice of work size and number of parallel processes depends on the machine and the problem. Note that this solution only really shines when there is a large number of parallel processes with substantial work each; a sketch for finding the sweet spot follows.
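One way to find that optimum empirically is to sweep the job count and time each run - a sketch, reusing the random_numbers file and the 500-line chunking from above:
for j in 2 4 8 16; do
  echo "-j $j:"
  time parallel --max-lines=500 -j $j --pipe "paste -sd+ - | bc" < random_numbers | paste -sd+ - | bc
done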
Upvotes: 0
Reputation: 52334
One in tcl:
#!/usr/bin/env tclsh
set sum 0
while {[gets stdin num] >= 0} { incr sum $num }
puts $sum
Upvotes: 0
Reputation: 836
In shell, using awk, I have used the script below to do so:
#!/bin/bash
total=0
# sum the first column of every line; bc keeps arbitrary precision
for i in $( awk '{ print $1 }' <myfile> )
do
  total=$(echo "$total+$i" | bc)
done
echo "scale=2; $total" | bc
Upvotes: 0
Reputation: 1807
C++ "one-liner":
#include <iostream>
#include <iterator>
#include <numeric>
using namespace std;
int main() {
cout << accumulate(istream_iterator<int>(cin), istream_iterator<int>(), 0) << endl;
}
Upvotes: 3
Reputation: 12749
In Go:
package main
import (
"bufio"
"fmt"
"os"
"strconv"
)
func main() {
scanner := bufio.NewScanner(os.Stdin)
sum := int64(0)
for scanner.Scan() {
v, err := strconv.ParseInt(scanner.Text(), 10, 64)
if err != nil {
fmt.Fprintf(os.Stderr, "Not an integer: '%s'\n", scanner.Text())
os.Exit(1)
}
sum += v
}
fmt.Println(sum)
}
Upvotes: 1
Reputation: 1598
I've written an R script that takes a file name as an argument and sums the lines.
#! /usr/local/bin/R
file=commandArgs(trailingOnly=TRUE)[1]
sum(as.numeric(readLines(file)))
This can be sped up with the "data.table" or "vroom" package as follows:
#! /usr/local/bin/R
file=commandArgs(trailingOnly=TRUE)[1]
sum(data.table::fread(file))
#! /usr/local/bin/R
file=commandArgs(trailingOnly=TRUE)[1]
sum(vroom::vroom(file))
Same benchmarking data as @glenn jackman.
for ((i=0; i<1000000; i++)) ; do echo $RANDOM; done > random_numbers
In comparison to the R call above, running R 3.5.0 as a script is comparable to the other methods (on the same Linux Debian server).
$ time R -e 'sum(scan("random_numbers"))'
0.37s user 0.04s system 86% cpu 0.478 total
R script with readLines
$ time Rscript sum.R random_numbers
0.53s user 0.04s system 84% cpu 0.679 total
R script with data.table
$ time Rscript sum.R random_numbers
0.30s user 0.05s system 77% cpu 0.453 total
R script with vroom
$ time Rscript sum.R random_numbers
0.54s user 0.11s system 93% cpu 0.696 total
For reference, here are some other suggested methods run on the same hardware.
Python 2 (2.7.13)
$ time python2 -c "import sys; print sum((float(l) for l in sys.stdin))" < random_numbers
0.27s user 0.00s system 89% cpu 0.298 total
Python 3 (3.6.8)
$ time python3 -c "import sys; print(sum((float(l) for l in sys.stdin)))" < random_numbers
0.37s user 0.02s system 98% cpu 0.393 total
Ruby (2.3.3)
$ time ruby -e 'sum = 0; File.foreach(ARGV.shift) {|line| sum+=line.to_i}; puts sum' random_numbers
0.42s user 0.03s system 72% cpu 0.625 total
Perl (5.24.1)
$ time perl -nle '$sum += $_ } END { print $sum' random_numbers
0.24s user 0.01s system 99% cpu 0.249 total
Awk (4.1.4)
$ time awk '{ sum += $0 } END { print sum }' random_numbers
0.26s user 0.01s system 99% cpu 0.265 total
$ time awk '{ sum += $1 } END { print sum }' random_numbers
0.34s user 0.01s system 99% cpu 0.354 total
C (clang version 3.3; gcc (Debian 6.3.0-18) 6.3.0 )
$ gcc sum.c -o sum && time ./sum < random_numbers
0.10s user 0.00s system 96% cpu 0.108 total
Lua (5.3.5)
$ time lua -e 'sum=0; for line in io.lines() do sum=sum+line end; print(sum)' < random_numbers
0.30s user 0.01s system 98% cpu 0.312 total
tr (8.26) - must be timed in bash; the construct is not compatible with zsh
$time { { tr "\n" + < random_numbers ; echo 0; } | bc; }
real 0m0.494s
user 0m0.488s
sys 0m0.044s
sed (4.4) - must be timed in bash; the construct is not compatible with zsh
$ time { head -n 10000 random_numbers | sed ':a;N;s/\n/+/;ta' |bc; }
real 0m0.631s
user 0m0.628s
sys 0m0.008s
$ time { head -n 100000 random_numbers | sed ':a;N;s/\n/+/;ta' |bc; }
real 1m2.593s
user 1m2.588s
sys 0m0.012s
note: sed calls seem to work faster on systems with more memory available (note smaller datasets used for benchmarking sed)
Julia (0.5.0)
$ time julia -e 'print(sum(readdlm("random_numbers")))'
3.00s user 1.39s system 136% cpu 3.204 total
$ time julia -e 'print(sum(readtable("random_numbers")))'
0.63s user 0.96s system 248% cpu 0.638 total
Notice that as in R, file I/O methods have different performance.
Upvotes: 2
Reputation: 246744
Just for fun, let's benchmark it:
$ for ((i=0; i<1000000; i++)) ; do echo $RANDOM; done > random_numbers
$ time perl -nle '$sum += $_ } END { print $sum' random_numbers
16379866392
real 0m0.226s
user 0m0.219s
sys 0m0.002s
$ time awk '{ sum += $1 } END { print sum }' random_numbers
16379866392
real 0m0.311s
user 0m0.304s
sys 0m0.005s
$ time { { tr "\n" + < random_numbers ; echo 0; } | bc; }
16379866392
real 0m0.445s
user 0m0.438s
sys 0m0.024s
$ time { s=0;while read l; do s=$((s+$l));done<random_numbers;echo $s; }
16379866392
real 0m9.309s
user 0m8.404s
sys 0m0.887s
$ time { s=0;while read l; do ((s+=l));done<random_numbers;echo $s; }
16379866392
real 0m7.191s
user 0m6.402s
sys 0m0.776s
$ time { sed ':a;N;s/\n/+/;ta' random_numbers|bc; }
^C
real 4m53.413s
user 4m52.584s
sys 0m0.052s
I aborted the sed run after 5 minutes
I've been diving into Lua, and it is speedy:
$ time lua -e 'sum=0; for line in io.lines() do sum=sum+line end; print(sum)' < random_numbers
16388542582.0
real 0m0.362s
user 0m0.313s
sys 0m0.063s
and while I'm updating this, ruby:
$ time ruby -e 'sum = 0; File.foreach(ARGV.shift) {|line| sum+=line.to_i}; puts sum' random_numbers
16388542582
real 0m0.378s
user 0m0.297s
sys 0m0.078s
Heed Ed Morton's advice: using $1
$ time awk '{ sum += $1 } END { print sum }' random_numbers
16388542582
real 0m0.421s
user 0m0.359s
sys 0m0.063s
vs using $0
$ time awk '{ sum += $0 } END { print sum }' random_numbers
16388542582
real 0m0.302s
user 0m0.234s
sys 0m0.063s
Upvotes: 102
Reputation: 309
Is it not easier to replace all newlines with +, append a 0, and send it to the Ruby interpreter?
(sed -e "s/$/+/" file; echo 0)|irb
If you do not have irb, you can send it to bc, but then you have to remove all newlines except the last one (from the echo). It is better to use tr for this, unless you have a PhD in sed.
(sed -e "s/$/+/" file|tr -d "\n"; echo 0)|bc
Upvotes: 0
Reputation: 8863
Another option is to use jq:
$ seq 10|jq -s add
55
-s (--slurp) reads the input lines into an array.
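One caveat: -s holds the entire input array in memory. For huge files, a streaming variant built on inputs keeps memory flat (assuming jq 1.5 or newer):
$ seq 10|jq -n 'reduce inputs as $n (0; . + $n)'
55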
Upvotes: 38
Reputation: 10528
More succinct:
# Ruby
ruby -e 'puts open("random_numbers").map(&:to_i).reduce(:+)'
# Python
python -c 'print(sum(int(l) for l in open("random_numbers")))'
Upvotes: 5
Reputation: 4107
You can do it with Alacon, a command-line utility for the Alasql database.
It works with Node.js, so you need to install Node.js and then the Alasql package.
To calculate the sum from a TXT file you can use the following command:
> node alacon "SELECT VALUE SUM([0]) FROM TXT('mydata.txt')"
Upvotes: 0
Reputation: 21775
With Ruby:
ruby -e "puts File.read('file.txt').split.inject(0){|mem, obj| mem += obj.to_f}"
Upvotes: 1
Reputation: 12749
Here is a solution using python with a generator expression. Tested with a million numbers on my old cruddy laptop.
time python -c "import sys; print sum((float(l) for l in sys.stdin))" < file
real 0m0.619s
user 0m0.512s
sys 0m0.028s
Upvotes: 3
Reputation: 20280
Just for fun, let's do it with PDL, Perl's array math engine!
perl -MPDL -E 'say rcols(shift)->sum' datafile
rcols reads columns into a matrix (1D in this case) and sum (surprise) sums all the elements of the matrix.
Upvotes: 4
Reputation: 1849
cat nums | perl -ne '$sum += $_ } { print $sum'
(same as brian d foy's answer, without 'END')
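The trick: -n wraps the code in a while (<>) { ... } loop, so the unbalanced } { splices the print in after the loop rather than into an END block. You can inspect the generated structure the same way as in the Deparse answer further down (output omitted here, as it varies by Perl version):
perl -MO=Deparse -ne '$sum += $_ } { print $sum'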
Upvotes: 4
Reputation: 1549
Here's another:
open(my $fh, '<', 'a.txt') or die "Cannot open a.txt: $!";
my $sum = 0;
while (my $line = <$fh>) { chomp $line; $sum += $line; }
close($fh);
print "Sum = $sum\n";
Upvotes: 0
Reputation: 1216
Another one for fun:
sum=0;for i in $(cat file);do sum=$((sum+$i));done;echo $sum
or another bash-only way:
s=0;while read l; do s=$((s+$l));done<file;echo $s
But the awk solution is probably best, as it's the most compact.
Upvotes: 1
Reputation: 72312
Here's another one-liner
( echo 0 ; sed 's/$/ +/' foo ; echo p ) | dc
This assumes the numbers are integers. If you need decimals, try
( echo 0 2k ; sed 's/$/ +/' foo ; echo p ) | dc
Adjust 2 to the number of decimals needed.
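Since dc is a reverse-Polish calculator, the stream generated above is just an initial 0, each number followed by +, and a final p to print. A hand-rolled miniature of that stream:
echo '0 1 + 2 + 3 + p' | dc    # prints 6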
Upvotes: 7
Reputation: 132758
For a Perl one-liner, it's basically the same thing as the awk solution in Ayman Hourieh's answer:
% perl -nle '$sum += $_ } END { print $sum'
If you're curious what Perl one-liners do, you can deparse them:
% perl -MO=Deparse -nle '$sum += $_ } END { print $sum'
The result is a more verbose version of the program, in a form that no one would ever write on their own:
BEGIN { $/ = "\n"; $\ = "\n"; }
LINE: while (defined($_ = <ARGV>)) {
chomp $_;
$sum += $_;
}
sub END {
print $sum;
}
-e syntax OK
Just for giggles, I tried this with a file containing 1,000,000 numbers (in the range 0 - 9,999). On my Mac Pro, it returns virtually instantaneously. That's too bad, because I was hoping using mmap would be really fast, but it's just the same time:
use 5.010;
use File::Map qw(map_file);
map_file my $map, $ARGV[0];
$sum += $1 while $map =~ m/(\d+)/g;
say $sum;
Upvotes: 121
Reputation: 129373
I have not tested this, but it should work:
cat f | tr "\n" "+" | sed 's/+$/\n/' | bc
You might have to append a "\n" to the string before bc (e.g. via echo) in case bc can't handle a missing final newline...
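For what it's worth, a quick smoke test (GNU sed assumed, since \n in the replacement is a GNU extension):
printf '1\n2\n3\n' | tr "\n" "+" | sed 's/+$/\n/' | bc    # should print 6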
Upvotes: 0