Reputation: 17
I have the following input file:
-0.805813 0.874753 -0.776101 -0.749147 -0.636834 0.379035 -0.004061 -0.004061
-0.426119 -0.024801 -0.041989 -0.783686 0.361837 0.055206 0.368603 0.147965
-0.632526 -0.100358 0.847947 -0.690233 -0.996141 0.445275 1.086014 -1.097968
0.411383 0.411383 -0.734988 0.344954 2.577123 -0.372104 -0.923401 0.302907
0.302907 -1.424862 1.165900 -0.776100 -0.776100 -0.495400 0.182533 0.002356
0.002356 0.002356
I used awk to calculate the sum of these values in a sequential order (sum = -3.0000):
awk '{ for (i=1; i<=NF; i++) sum += $i } END { printf("%3.4f", sum) }' input.txt
Is it possible to use awk to skip a run of values counted back from the end of the file and calculate the sum of the remaining values? For instance:
-0.805813 0.874753 -0.776101 -0.749147 -0.636834 0.379035 -0.004061 -0.004061
-0.426119 -0.024801 -0.041989 -0.783686 0.361837 0.055206 0.368603 0.147965
-0.632526 -0.100358 0.847947 -0.690233 -0.996141 0.445275 1.086014 -1.097968
0.411383 0.411383 -0.734988 0.344954 2.577123 -0.372104 -0.923401 0.302907
0.302907 -1.424862 **1.165900 -0.776100 -0.776100 -0.495400 0.182533 0.002356
0.002356 0.002356**
where I want to skip the values between the stars (sum = -2.3079). The number of values that should be skipped may vary.
Thanks!
I already achieved this by using sed piped with awk:
sed '$d' input.txt | awk '{ for (i=1; i<=NF; i++) sum += $i } END { for (i=NF-5; i<=NF; i++) sum -= $i; print sum }'
However, a pure awk one-liner would be preferred.
Upvotes: 1
Views: 129
Reputation: 2895
This awk approach makes no assumptions about the ** ... ** sections, if any, or about whether lines end in \r\n or \n.
The full input is processed in one shot, which bypasses the need for a temporary storage array.
Even numbered fields, which would be where the skipped sections reside, get blanked out, then the leftovers are re-split into usable fields.
mawk 'BEGIN { FS = (_ = "[*]")_
RS = (__ = "") "^$" } END {
for (++_; _++ < NF; _++) $_ = __
_*= FS = "[ \11-\15]+"
$_ = $_
_ = ++NF
while(--_)__ += $_
printf("%.16g\n", __) }'
-2.307901
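For readers who find the one-shot version hard to follow, here is a more verbose sketch of the same idea (split on the ** markers, ignore the even-numbered pieces, sum what is left). It is not the code above: it buffers the input in a variable (text) instead of relying on RS="^$", so it should run in any POSIX awk. The function name sum_unstarred and the sample input are made up for illustration.

```shell
# Hypothetical helper: sums every number on stdin that is NOT inside a ** ... ** section.
sum_unstarred() {
  awk '
    { text = text $0 "\n" }                       # buffer the whole input
    END {
      nparts = split(text, part, /[*][*]/)        # pieces between ** markers
      for (p = 1; p <= nparts; p += 2) {          # odd pieces lie outside the markers
        nvals = split(part[p], val, /[ \t\r\n]+/)
        for (v = 1; v <= nvals; v++)
          sum += val[v]                           # empty pieces add 0
      }
      printf "%.4f\n", sum
    }'
}

# Sample data: 3, 4 and 5 sit between the markers, so only 1, 2 and 6 are summed.
result=$(printf '1 2 **3 4\n5** 6\n' | sum_unstarred)
echo "$result"
```

Unlike the mawk version, this does hold the whole file in memory twice (once as text, once as pieces), which is the price of portability here.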
Upvotes: 0
Reputation: 16819
Stripping down @markp-fuso's idea:
awk -v RS=' ' '
NF {
ndx = cnt++ % lastN
sum += circlist[ndx]
circlist[ndx] = +$0
}
END { printf "%3.4f", sum }
' lastN=8 input.txt
The reason his array initialization and comparisons are not needed is that awk guarantees the value of uninitialized variables, including array elements: they are the empty string, which is 0 in a numeric context.
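A quick way to convince yourself of this, as a minimal sketch (the array name is reused from the code above only for familiarity):

```shell
result=$(awk 'BEGIN {
  # circlist has never been assigned, so circlist[0] is uninitialized here
  sum += circlist[0]                                   # adds 0
  print sum, (circlist[0] == 0), (circlist[0] == "")   # equal to both 0 and ""
}')
echo "$result"
```

The two comparisons print 1 because an uninitialized value compares equal to both 0 and the empty string.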
Splitting input on space (RS=' ') instead of newline and then checking the record has a field (the default behaviour of FS will split on the remaining whitespace), is more compact than his for loop to read each field, but requires that there is at least one actual space character between each number.
Your example lines begin with a leading space; if they did not, my code would fail silently by discarding the first element on each line (it would become $2 but +$0 is parsed as just the value of $1). If your awk supports using regex as RS (which a future standard may allow, and many popular versions already support), this problem can be fixed by using RS='[[:space:]]+'. (Or by using the original for loop to iterate over the fields.)
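As a sketch of the regex-RS variant, assuming an awk that accepts a regular expression as RS (gawk and mawk do). I use '[ \t\n]+' here rather than '[[:space:]]+' only because some older mawk builds lack POSIX character classes; the sample data and lastN=3 are made up so the result is easy to check by hand.

```shell
# Sample lines have no leading space, which is the case that breaks RS=" ".
result=$(printf '1 2 3\n4 5 6\n7 8 9 10\n' |
  awk -v RS='[ \t\n]+' -v lastN=3 '
    NF {
      ndx = cnt++ % lastN      # slot about to be overwritten
      sum += circlist[ndx]     # value leaving the window (0 while still filling)
      circlist[ndx] = +$0      # remember the current number
    }
    END { printf "%3.4f\n", sum }')
echo "$result"
```

The last three numbers (8, 9, 10) never leave the circular list, so only 1 through 7 are summed.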
Upvotes: 1
Reputation: 3985
Using GNU AWK
$ awk -v RS='\\s' '!/^$/' file |
awk -v n=8 '{sum[NR]=sum[NR-1]+$1} END{print sum[NR-n]}'
-2.3079
Upvotes: 1
Reputation: 204456
Assuming you want to skip the last N numbers in the input, then using any awk:
$ awk -v n=8 '
{ for (i=1; i<=NF; i++) vals[++c]=$i }
END { for (i=1; i<=c-n; i++) sum+=vals[i]; printf "%3.4f", sum }
' file
-2.3079
or if you wanted to skip all values on the last line plus the 6 values at the end of the line before that:
$ awk -v n=6 '
{ for (i=1; i<=NF; i++) vals[++c]=$i }
END { for (i=1; i<=c-(n+NF); i++) sum+=vals[i]; printf "%3.4f", sum }
' file
-2.3079
Upvotes: 2
Reputation: 35256
General approach:
One awk idea:
awk -v lastN=HOW_MANY_TO_IGNORE '
BEGIN { for (i=0;i<lastN;i++) circlist[i]="X" } # initialize circular list
{ for (i=1;i<=NF;i++) {
cnt++ # increment count of numbers seen so far
ndx=cnt%lastN # calculate modulo index
sum+=(circlist[ndx] != "X" ? circlist[ndx] : 0) # add previous entry from circlist[] ?
circlist[ndx]=$i # add current value to circlist[]
}
}
END { printf("%3.4f", sum) }
' input.txt
NOTES:
lastN is assumed to be a positive integer; otherwise OP can add logic to validate the value of lastN
For OP's 2nd set of data we use -v lastN=8 which generates:
-2.3079
To verify the result we can make note of the fact that the 1st number to be ignored (1.165900) only occurs once in the data set so we can hardcode this into OP's current code:
$ awk '{for (i=1;i<=NF;i++) if ($i == 1.165900) exit; else sum += $i} END {printf("%3.4f", sum)}' input.txt
-2.3079
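Regarding the note about validating lastN, one possible sketch follows. The wrapper name sum_skip_last, the error message, and the bad flag are my own additions for illustration; the loop is a trimmed form of the circular list above that relies on uninitialized array elements evaluating to 0.

```shell
# Hypothetical wrapper: sum all numbers on stdin except the last N ($1).
sum_skip_last() {
  awk -v lastN="$1" '
    BEGIN {
      # reject anything that is not a plain positive integer (also avoids cnt % 0)
      if (lastN !~ /^[0-9]+$/ || lastN + 0 < 1) {
        print "lastN must be a positive integer" | "cat 1>&2"
        bad = 1
        exit 1
      }
    }
    { for (i = 1; i <= NF; i++) {
        ndx = cnt++ % lastN
        sum += circlist[ndx]     # value about to be overwritten
        circlist[ndx] = $i
      }
    }
    END { if (!bad) printf "%3.4f\n", sum }   # END may still run after exit in BEGIN
  '
}

result=$(printf '1 2 3\n4 5 6\n7 8 9 10\n' | sum_skip_last 3)
echo "$result"
```

Calling it with 0, a negative number, or a non-numeric argument prints the message to stderr and returns a non-zero status without printing a sum.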
Upvotes: 2