Reputation: 33327
I want to divide two numbers in awk, using integer division, i.e. truncating the result. For example:
k = 3 / 2
print k
should print 1
According to the manual,
Division; because all numbers in awk are floating-point numbers, the result is not rounded to an integer
Is there any workaround to get an integer value?
The reason is that I want to get the middle element of an array with integer indexes [0 to num-1].
Upvotes: 39
Views: 42782
Reputation: 13551
In simple cases, you can safely use int(), which truncates toward zero [1]:
awk 'BEGIN { print int(3 / 2) }' # prints 1
gawk 'BEGIN { print int(-3 / 2) }' # prints -1; not guaranteed in POSIX awk
Keep in mind that awk always uses double-precision floating-point numbers [2] and floating-point arithmetic [3], though. The only way to get integers and integer arithmetic is to use strings and either roll your own integer arithmetic (see another answer) or call external tools, e.g. the standard expr utility:
awk 'BEGIN { "expr 3 / 2" | getline result; print result; }' # prints 1
This is really awkward, long, slow, … but safe and portable.
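If expr is called more than once from the same awk program, the command pipe should be closed after each read; otherwise a repeated, identical command string returns nothing the second time. Below is a minimal sketch of that idea. The names expr_div and idiv are made up for illustration, and the inputs are assumed to be plain integers (they are passed to the shell unescaped).

```sh
# Hypothetical helper: integer division through the external expr utility.
# Assumes both arguments are plain integers; nothing is escaped for the shell.
expr_div() {
    awk -v a="$1" -v b="$2" '
    function idiv(x, y,    cmd, result) {
        cmd = "expr " x " / " y   # build the command line for expr
        cmd | getline result      # read the single line that expr prints
        close(cmd)                # close the pipe so the same command can run again
        return result
    }
    BEGIN { print idiv(a, b) }'
}

expr_div 7 2    # prints 3
```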
[1] In POSIX awk, truncation toward zero is guaranteed only for positive arguments: "int(x): Return the argument truncated to an integer. Truncation shall be toward 0 when x>0." GNU awk (gawk) truncates toward zero even for negative numbers: "int(x): Return the nearest integer to x, located between x and zero and truncated toward zero. For example, int(3) is 3, int(3.9) is 3, int(-3.9) is -3, and int(-3) is -3 as well."
[2] Numeric expressions are specified as double-precision floats in "Expressions in awk" in POSIX.
[3] "All arithmetic shall follow the semantics of floating-point arithmetic as specified by the ISO C standard (see Concepts Derived from the ISO C Standard)." (POSIX awk: Arithmetic functions)
If you choose to use floats, you should know about their quirks and be ready to spot them and avoid related bugs. Several scary examples:
Unrepresentable numbers:
awk 'BEGIN { x = 0.875; y = 0.425; printf("%0.17g, %0.17g\n", x, y) }'
# prints 0.875, 0.42499999999999999
Accumulation of round-off errors:
awk 'BEGIN{s=0; for(i=1;i<=100000;i++)s+=0.3; printf("%.10f, %d\n",s,int(s))}'
# prints 29999.9999999506, 29999
Round-off errors ruin comparisons:
awk 'BEGIN { print (0.1 + 12.2 == 12.3) }' # prints 0
Precision decreases with magnitude, causing infinite loops:
awk 'BEGIN { for (i=10^16; i<10^16+5; i++) printf("%d\n", i) }'
# prints 10000000000000000 infinitely many times
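Two common ways to sidestep the quirks above are to compare with a tolerance instead of ==, and to accumulate in integer units and divide only once at the end. The following is only a sketch: float_demo and approx_eq are hypothetical names, and the tolerance 1e-9 is an arbitrary choice that has to be adapted to the magnitudes involved.

```sh
# Hypothetical demo of two workarounds for floating-point quirks in awk.
float_demo() {
    awk '
    # Tolerance-based comparison; eps is an assumption chosen by the caller.
    function approx_eq(a, b, eps,    d) {
        d = a - b
        if (d < 0) d = -d
        return d <= eps
    }
    BEGIN {
        print approx_eq(0.1 + 12.2, 12.3, 1e-9)   # prints 1

        # Accumulate in tenths (exact integers), then divide once.
        n = 0
        for (i = 1; i <= 100000; i++) n += 3
        print n / 10                              # prints 30000
    }'
}

float_demo
```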
Read more on how floats work:
Stack Overflow tags floating-point wiki
Wikipedia article Floating point
GNU awk arbitrary precision arithmetic – contains both info on the specific implementation and general knowledge
Upvotes: 16
Reputation: 2805
For what it's worth, I have these 3 functions in my personal library, one each for truncated division (the default awk approach), floored division, and Euclidean division. The 3 functions are cross-dependent, to avoid re-inventing the wheel 3 times. Inputs can be in any format - integer, floating point, or numeric strings.
Both dividend and divisor are pre-truncated before any division occurs.
With gawk -M (bigint via GMP), these functions offer UNLIMITED division precision without needing to set the PREC parameter. Without GMP, the precision offered is the standard 53 bits of double-precision FP underlying all of awk.
function divmod_trunc(___, _, __) {
return \
(__ = _ = int(_)) == (_ = !!_) \
? (+(__ = "=%_/=") < _ \
? ((!_)__) ___ \
: ERRNO = (_ = "NAN")__ (! (___ = +int(substr(___,
_ = (__ = _)^(_ < _), -(_++) + _^_^_) ".")) \
? __ : substr("-INF", _ - (___ < !_)))) \
: (_ = (___ = \
int(___)) % __) "=%_/=" (___ - _) /__
}
function divmod_floor(__, _, ___) {
return \
((__ = int(__)) < !!__) == ((_ = int(_)) < !!_) || !_ \
? divmod_trunc(__, _) \
: (__ - _ * (___ = (\
__ - (___ = __ % _)) / _ - !!_)) ("=%_/=")___
}
function divmod_euclid(__, _, ___) {
return \
(___ = (_ = int(_)) == !!_ ||
-(__ = int(__)) < __) || __*_ < !_ \
? (___ ? divmod_trunc(__, _) \
: divmod_floor(__, _)) \
: ((___ = (__ - (\
__ %= _)) / _)^!_ * __ - _) "=%_/=" (++___)
}
Since awk lacks tuples as a return type, these functions emulate that effect by returning both remainder and quotient at once as a "string-connected tuple", in the format
REMAINDER=%_/=QUOTIENT
All zeros are treated as unsigned. Division by zero returns unsigned NAN as the remainder, and one of NAN, INF, or -INF as its quotient.
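To use the two parts separately, the returned string has to be split on the =%_/= separator. A minimal sketch, assuming the format described above; split_divmod is a hypothetical wrapper name, and index()/substr() are used instead of split() so the separator is never interpreted as a regular expression.

```sh
# Hypothetical helper: split a "REMAINDER=%_/=QUOTIENT" string into its parts.
split_divmod() {
    awk -v pair="$1" 'BEGIN {
        sep = "=%_/="
        pos = index(pair, sep)                  # position of the separator
        rem = substr(pair, 1, pos - 1)          # text before the separator
        quo = substr(pair, pos + length(sep))   # text after the separator
        print "remainder=" rem, "quotient=" quo
    }'
}

split_divmod "2701014=%_/=19989475"   # prints remainder=2701014 quotient=19989475
```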
This pair of primes will showcase their differences:
468888899996789 / 23456789
TRUNC :: 2701014=%_/=19989475
FLOOR :: 2701014=%_/=19989475
ECLID :: 2701014=%_/=19989475
-468888899996789 / 23456789
TRUNC :: -2701014=%_/=-19989475
FLOOR :: 20755775=%_/=-19989476
ECLID :: 20755775=%_/=-19989476
468888899996789 / -23456789
TRUNC :: 2701014=%_/=-19989475
FLOOR :: -20755775=%_/=-19989476
ECLID :: 2701014=%_/=-19989475
-468888899996789 / -23456789
TRUNC :: -2701014=%_/=19989475
FLOOR :: -2701014=%_/=19989475
ECLID :: 20755775=%_/=19989476
These functions are fully POSIX-compliant and work on all awks. No numbers are hard-coded at all in these functions, since all numeric constants and offsets required by the functions are generated on the fly as part of input cleansing.
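For readers who only want to see what the three conventions compute, here is a much plainer sketch. It is not the code above: trunc_q, floor_q, euclid_q and the wrapper divmod_demo are names invented for this illustration. It assumes integer inputs within double precision and a nonzero divisor, and relies on awk's % following C fmod semantics (the remainder takes the sign of the dividend).

```sh
# Hypothetical, simplified illustration of the three division conventions.
# Assumes integer n and nonzero integer d; no NAN/INF handling.
divmod_demo() {
    awk -v n="$1" -v d="$2" '
    # Truncated quotient: rounds toward zero.
    function trunc_q(n, d) { return (n - n % d) / d }
    # Floored quotient: rounds toward negative infinity.
    function floor_q(n, d,    q) {
        q = trunc_q(n, d)
        if (n % d != 0 && (n < 0) != (d < 0)) q--
        return q
    }
    # Euclidean quotient: chosen so the remainder is never negative.
    function euclid_q(n, d,    q) {
        q = trunc_q(n, d)
        if (n % d < 0) q += (d > 0 ? -1 : 1)
        return q
    }
    BEGIN {
        printf "trunc q=%d r=%d\n",  trunc_q(n, d),  n - trunc_q(n, d) * d
        printf "floor q=%d r=%d\n",  floor_q(n, d),  n - floor_q(n, d) * d
        printf "euclid q=%d r=%d\n", euclid_q(n, d), n - euclid_q(n, d) * d
    }'
}

divmod_demo -7 2
# trunc q=-3 r=-1
# floor q=-4 r=1
# euclid q=-4 r=1
```

The three results only differ when the remainder is nonzero and at least one operand is negative, which matches the pattern in the output listed above.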
Upvotes: 1
Reputation: 185073
Use the int function to get the integer part of the result, truncated toward 0. This produces the nearest integer to the result, located between the result and 0. For example, int(3/2) is 1, int(-3/2) is -1.
Source: The AWK Manual - Numeric Functions
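Applied to the use case in the question (the middle element of an array indexed 0 to num-1), this could look like the sketch below. The wrapper name middle_element and the sample data are made up for illustration; for an even number of elements, int(num / 2) picks the upper of the two middle positions.

```sh
# Hypothetical helper: print the middle index and element of a 0-based array.
middle_element() {
    awk -v list="$1" 'BEGIN {
        num = split(list, tmp, " ")                      # split() fills tmp[1..num]
        for (i = 1; i <= num; i++) arr[i - 1] = tmp[i]   # re-index to 0..num-1
        mid = int(num / 2)                               # integer division by truncation
        print mid, arr[mid]
    }'
}

middle_element "a b c d e"   # prints 2 c
```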
Upvotes: 58
Reputation: 61
Safe and quick awk integer division (truncating toward zero, for integer n and nonzero integer d) can be done with:
q=(n-n%d)/d
Upvotes: 6