Reputation: 4518
I was wondering about the differences between positive and negative zero in different numeric types.
I understand the IEEE-754 for floating point arithmetic and bit representation in double precision so the following didn't come as a surprise
double posz = 0.0;
double negz = -0.0;
System.out.println(Long.toBinaryString(Double.doubleToLongBits(posz)));
System.out.println(Long.toBinaryString(Double.doubleToLongBits(negz)));
// output
>>> 0
>>> 1000000000000000000000000000000000000000000000000000000000000000
What did surprise me, and showed me that I'm clueless about the bit representation of the long type in Java, is that even if I shift right (unsigned >>>), the binary representation of both positive and negative zero is the same:
long posz = 0L;
long negz = -0L;
for (int i = 63; i >= 0; i--) {
    System.out.print((posz >>> i) & 1);
}
System.out.println();
for (int i = 63; i >= 0; i--) {
    System.out.print((negz >>> i) & 1);
}
// output
>>> 0000000000000000000000000000000000000000000000000000000000000000
>>> 0000000000000000000000000000000000000000000000000000000000000000
So I am wondering what Java does at the bit-representation level when I write the following:
long posz = 0L;
long negz = -0L;
Does the compiler understand that they are both zero and disregard the sign (and so assign 0 to the sign bit), or is there other magic here?
Upvotes: 0
Views: 536
Reputation: 103728
or is there other magic here?
Yes. 2's complement.
2's complement is a bit magical. It accomplishes 2 major objectives. Before getting into that, let's first stew on the notion of negative zero for a moment.
Negative zero isn't actually a thing. Ask any mathematician "Hey, so, what's up with negative zero?" and they'll just look at you in befuddlement. It's not a thing. Mathematically, 0 and -0 are utterly identical. Not just 'nearly identical', but 100%, fully, in all possible ways, identical. We don't generally want our numbers to be capable of representing both 5.0 and 5.00 as distinct values, as those two are entirely, 100%, identical. If you don't think that a value system ought to waste bits trying to differentiate between 5.0 and 5.00, then it's equally bizarro to want the ability to represent -0.0 and +0.0 as distinct entities.
So, wanting -0 in the first place is kinda weird. None of the numeric primitives (long, int, short, byte, and I guess char, which is technically numeric too) can represent this number. Instead, long z = -0 boils down to:
- is a unary operator. Just like 2+5 makes the system calculate the binary operation of "addition" on elements 2 and 5, -x makes the system calculate the unary operation of "negation" on element x. Applying the negation operation to 0 produces 0. It's no different from writing, say, int x = 5 + 0;. That +0 part doesn't do anything. The - in front of -0 doesn't do anything. In contrast to -0.0, where it does do something (gets you negative zero, the double value, instead of positive zero).
That result is then assigned to z (so, just 0 then).
There is no way to tell if that minus is there. They both result in ALL ZERO bits, and hence, there is no way for the computer to tell if you initialized that variable with the expression -0 or with +0. Again in contrast to double where, as you noticed, one bit is different.
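To make that contrast concrete, here is a minimal sketch (the class and method names are made up for illustration) that compares the raw bits of the two long zeros and of the two double zeros:

```java
class NegativeZeroDemo {
    // Hypothetical helper: true when two doubles have identical raw bit patterns.
    static boolean sameDoubleBits(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }

    public static void main(String[] args) {
        long posz = 0L;
        long negz = -0L; // unary minus applied to 0 just yields 0

        // For long, there is only one zero: same value, same bits.
        System.out.println(posz == negz);        // true
        System.out.println((posz ^ negz) == 0L); // true: no bit differs

        double dpos = 0.0;
        double dneg = -0.0;
        // == follows IEEE-754 and treats the two zeros as equal...
        System.out.println(dpos == dneg);                // true
        // ...but the stored bit patterns differ (the sign bit).
        System.out.println(sameDoubleBits(dpos, dneg));  // false
    }
}
```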
Why does double have it then?
Let's stew a bit on the notion of doubles and IEEE-754 math.
A double takes 64 bits. From basic pure mathematical principles then, a double is as incapable of representing more than 2^64 different possible values as you are of breaking the speed of light or making 1+1=3.
And yet, a double aims to represent all numbers. There are way more numbers between 0 and 1 than 2^64 options (in fact, an infinite amount of numbers exist between 0 and 1), and that's just 0 to 1.
So, how doubles actually work is different. A few less than 2^64 numbers are chosen from the entire number line. Let's call these the blessed numbers.
The blessed numbers are not equally distributed. The closer you are to 0, the more blessed numbers exist. In other words, the distance between 2 blessed numbers increases as you move away from 0. For example, if you start from, say, 1e100 (a 1 with a hundred zeroes) and want to find the next blessed number, it's quite a ways. The gap is in fact larger than 1.0: 1e100+1 is in fact 1e100 again, because the way double math works is that after every single mathematical operation you do to them, the end result is rounded to the nearest blessed number.
Let's try it!
double d = 1e100;
System.out.println(d);
System.out.println(d + 1);
// prints: 1.0E100
// 1.0E100
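As a rough way to see how the gaps grow, the standard Math.ulp method reports the distance from a double to the next representable value of larger magnitude. The sketch below (class and method names are illustrative only) compares the gap at 1.0 with the gap at 1e100:

```java
class GapDemo {
    // Hypothetical helper: distance from d to the next representable double.
    static double gapAt(double d) {
        return Math.ulp(d);
    }

    public static void main(String[] args) {
        // Near 1.0 the gap is tiny; near 1e100 it is astronomically large.
        System.out.println(gapAt(1.0));
        System.out.println(gapAt(1e100));

        // The gap at 1e100 is far bigger than 1, so adding 1 changes nothing.
        System.out.println(gapAt(1e100) > 1.0);   // true
        System.out.println(1e100 + 1 == 1e100);   // true

        // Near 1.0 the gap is far smaller than 1, so adding 1 is visible.
        System.out.println(1.0 + 1 == 1.0);       // false
    }
}
```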
But that means double values don't actually represent a single number! What any given double represents is in fact this concept:
An unknown number whose value lies between [D - 𝛿, D + 𝛿], where D is the blessed number that is closest to the unknown number this value represents, and 𝛿 is half of the distance between D and the next nearest blessed number on either side.
Given that usually 𝛿 is incredibly small, this is 'good enough'. But this weirdness does explain why you really, really do not want any business at all with double if accuracy is important (such as with currencies. Don't store those in doubles, ever!)
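A small sketch of why this bites with money. The alternative shown, counting whole cents in a long, is just one common approach and an assumption on my part, not something the answer prescribes; the names are made up:

```java
class MoneyDemo {
    // Sum of two decimal-looking amounts, done in double arithmetic.
    static double addAsDoubles(double a, double b) {
        return a + b;
    }

    // Hypothetical alternative: amounts held as whole cents in a long.
    static long addAsCents(long aCents, long bCents) {
        return aCents + bCents;
    }

    public static void main(String[] args) {
        // 0.1 and 0.2 are not blessed numbers, so the sum is rounded.
        System.out.println(addAsDoubles(0.1, 0.2));        // 0.30000000000000004
        System.out.println(addAsDoubles(0.1, 0.2) == 0.3); // false

        // Whole cents are exact integers, so nothing is rounded.
        System.out.println(addAsCents(10, 20) == 30);      // true
    }
}
```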
Given that, what does -0.0 represent? Not actually just 0. It represents, specifically: An unknown number whose value lies between [-𝛿, 0] where 0 is real zero (and thus has no sign), and 𝛿 is Double.MIN_VALUE: the smallest non-zero positive number representable with a double.
That's why -0.0 and +0.0 both exist: They are in fact different concepts. Rarely relevant, but sometimes it is. In contrast to e.g. long where 5 just means 5 and not "between 4.5 and 5.5", because longs fundamentally don't recognize that fractional parts exist in the first place. Given that 5 just means 5, then 0 just means 0, and there is no such thing as negative zero in the first place.
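One place where the difference is observable: a negative result that is too small to represent collapses to -0.0 rather than +0.0, and dividing by the two zeros gives opposite infinities. A minimal sketch, with an illustrative helper name:

```java
class SignedZeroDemo {
    // Hypothetical helper: true only for the negative-zero bit pattern.
    static boolean isNegativeZero(double d) {
        return Double.doubleToLongBits(d) == Double.doubleToLongBits(-0.0);
    }

    public static void main(String[] args) {
        // -1e-400 is far below the smallest representable magnitude,
        // so the product is rounded to zero, but the sign is kept.
        double tinyNegative = -1e-200 * 1e-200;
        System.out.println(isNegativeZero(tinyNegative)); // true
        System.out.println(tinyNegative == 0.0);          // true: == ignores the sign of zero

        // The sign of zero shows up again when you divide by it.
        System.out.println(1.0 / 0.0);   // Infinity
        System.out.println(1.0 / -0.0);  // -Infinity
    }
}
```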
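2's complement is a cool system. It has two neat properties: there is only one zero (negating 0 gives 0 again), and +, -, ++, and -- work on the bits in exactly the same way whether you think of the number as signed or unsigned. It DOES matter for greater than, less than, and divide.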
2's complement works like this: To negate a number, take all bits and flip them (i.e. do a NOT operation on the bits). Then, add 1.
Let's try it!
int x = 5;
int y = -x;
for (int i = 31; i >= 0; i--) {
    System.out.print((x >>> i) & 1);
}
System.out.println();
for (int i = 31; i >= 0; i--) {
    System.out.print((y >>> i) & 1);
}
System.out.println();
// prints 00000000000000000000000000000101
// 11111111111111111111111111111011
As we can see, the 'flip all bits and add 1' algorithm was applied.
2s complement is, of course, reversible: If you do 'flip all bits and add 1' twice in a row you get the same number out.
Now let's try -0. 0 is 32 zero bits; flip them all, then add 1:
00000000000000000000000000000000
11111111111111111111111111111111 // flip all
100000000000000000000000000000000 // add 1
00000000000000000000000000000000 // that 1 fell off
Because ints can only store 32 bits, that final '1' falls off the end, and we're left with zero again.
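The 'flip all bits and add 1' rule can be checked directly in Java, where ~ is the bitwise NOT operator. A minimal sketch (the method name is made up for illustration):

```java
class FlipAndAddOne {
    // Negation spelled out by hand: flip all bits, then add 1.
    static int negate(int x) {
        return ~x + 1;
    }

    public static void main(String[] args) {
        // Matches the built-in unary minus.
        System.out.println(negate(5) == -5);         // true
        // Applying it twice gets the original number back.
        System.out.println(negate(negate(5)) == 5);  // true
        // For zero, the carry falls off the end and we get zero again.
        System.out.println(negate(0) == 0);          // true
    }
}
```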
Now let's go with bytes (a bit smaller) and try to add, say, 200 and 50 together.
11001000 // 200 in binary
00110010 // 50 in binary
-------- +
11111010 // 250 in binary.
Now let's instead go: Oh wait, whoops, that was an error, actually these numbers are in 2s complement. That wasn't 200, nono. 11001000 is a bit sequence that actually means (let's apply the 'flip all bits, add 1' scheme: 00110111, then 00111000, which is 56) -56. So the operation was meant to represent '-56 + 50'. Which is -6. -6 in binary is (write out 6, flip bits, add 1):
00000110
11111001
11111010
Hey now, look at that, nothing changed! It's the same result! So, when the computer does x + y, where x and y are numbers, the computer does not care. Whether x is "an unsigned number" or "a signed with 2s complement number", the operation is identical.
That's why 2s complement is applied. It makes math MUCH faster. The CPU doesn't have to futz about with branching out to deal with sign bits.
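The byte example above can be replayed in code. Java's byte is always rendered as signed, so this sketch does the sum in int and masks to the low 8 bits to imitate an 8-bit adder; the helper name is hypothetical:

```java
class SignAgnosticAdd {
    // Adds two values and keeps only the low 8 bits, like an 8-bit adder would.
    static int add8(int a, int b) {
        return (a + b) & 0xFF;
    }

    public static void main(String[] args) {
        int readAsUnsigned = add8(200, 50); // 200 + 50
        int readAsSigned = add8(-56, 50);   // same bits for the first operand, read as -56

        // Both additions produce the same 8-bit pattern.
        System.out.println(Integer.toBinaryString(readAsUnsigned)); // 11111010
        System.out.println(Integer.toBinaryString(readAsSigned));   // 11111010

        // Only the interpretation differs: 250 unsigned, -6 signed.
        System.out.println(readAsUnsigned);        // 250
        System.out.println((byte) readAsUnsigned); // -6
    }
}
```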
In this sense it is more correct to say that in Java, int, long, char, byte and short are neither signed nor unsigned, they just are. At least for the purposes of +, -, ++, and --. No, the idea that int is signed is fundamentally a property of e.g. System.out.println(int): that method chooses to render the bit sequence 11111111111111111111111111111111 as "-1" instead of as 4294967295.
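The Integer class (since Java 8) has static methods that read the same 32 bits as an unsigned number, which makes the "it's only an interpretation" point visible, and also shows that comparison and division do depend on the interpretation. A short sketch with an illustrative wrapper name:

```java
class UnsignedViewDemo {
    // Hypothetical wrapper: renders the 32 bits of x as an unsigned decimal.
    static String asUnsigned(int x) {
        return Integer.toUnsignedString(x);
    }

    public static void main(String[] args) {
        int allOnes = -1; // 32 one-bits

        // Same bits, two renderings.
        System.out.println(allOnes);             // -1
        System.out.println(asUnsigned(allOnes)); // 4294967295

        // Comparison depends on the interpretation...
        System.out.println(allOnes > 1);                             // false
        System.out.println(Integer.compareUnsigned(allOnes, 1) > 0); // true

        // ...and so does division.
        System.out.println(allOnes / 2);                        // 0
        System.out.println(Integer.divideUnsigned(allOnes, 2)); // 2147483647
    }
}
```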
Upvotes: 3
Reputation: 198461
long has no such thing as negative zero. Only float and double have a different representation of positive and negative zero.
Upvotes: 1