Tags: java, long-integer, bit-representation, negative-zero

Reputation: 4518

Long Representation vs Double representation of positive and negative zero in java

I was wondering about the differences between positive and negative zero in different numeric types.
I understand IEEE-754 floating point arithmetic and the bit representation of double precision values, so the following didn't come as a surprise:

double posz = 0.0;
double negz = -0.0;
System.out.println(Long.toBinaryString(Double.doubleToLongBits(posz)));
System.out.println(Long.toBinaryString(Double.doubleToLongBits(negz)));
// output
>>> 0
>>> 1000000000000000000000000000000000000000000000000000000000000000

What did surprise me, and showed me that I'm clueless about the bit representation of the long type in Java, is that even if I shift right (unsigned >>>), the binary representation of positive and negative zero is the same:

long posz = 0L;
long negz = -0L;
for (int i = 63; i >= 0; i--) {
    System.out.print((posz >>> i) & 1);
}
System.out.println();
for (int i = 63; i >= 0; i--) {
    System.out.print((negz >>> i) & 1);
}
// output
>>> 0000000000000000000000000000000000000000000000000000000000000000
>>> 0000000000000000000000000000000000000000000000000000000000000000

So I am wondering what Java does, in terms of bit representation, when I write the following:

long posz = 0L;
long negz = -0L;

Does the compiler understand that they are both zero and disregard the sign (and so assign 0 to the sign bit), or is there other magic here?

Upvotes: 0

Answers (2)

rzwitserloot

Reputation: 103728

or is there other magic here?

Yes. 2's complement.

2's complement is a bit magical. It accomplishes 2 major objectives. Before getting into that, let's first stew on the notion of negative zero for a moment.

Negative zero is kinda weird. Why does it exist at all?

Negative zero isn't actually a thing. Ask any mathematician "Hey, so, what's up with negative zero?" and they'll just look at you in befuddlement. It's not a thing. Mathematically, 0 and -0 are utterly identical. Not just 'nearly identical', but 100%, fully, in all possible ways, identical. We don't generally want our numbers to be capable of representing both 5.0 as well as 5.00 - as those two are entirely, 100%, identical. If you don't think that a value system ought to waste bits trying to differentiate between 5.0 and 5.00, then it's equally bizarro to want the ability to represent -0.0 and +0.0 as distinct entities.

So, wanting -0 in the first place is kinda weird. None of the integer primitives (long, int, short, byte, and I guess char, which is technically numeric too) can represent this number. Instead, long z = -0 boils down to:

Take the constant "0".
Apply the 'negate' operation to this number (- is a unary operator. Just like 2+5 makes the system calculate the binary operation of "addition" on elements 2 and 5, -x makes the system calculate the unary operation of "negation" on element x). Applying the negation operation to 0 produces 0. It's no different from writing, say, int x = 5 + 0;. That +0 part doesn't do anything. The - in front of -0 doesn't do anything either. This is in contrast to -0.0, where it does do something (it gets you negative zero, the double value, instead of positive zero).
Store this result in z (so, just 0 then).

There is no way to tell if that minus is there. They both result in ALL ZERO bits, and hence, there is no way for the computer to tell if you initialized that variable with the expression -0 or with +0. Again, this is in contrast to double where, as you noticed, one bit is different.
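A minimal sketch of that claim, for anyone who wants to check it directly. The class name IntegerZeroDemo and the helper bits are made up for this illustration; everything else is standard library.

```java
class IntegerZeroDemo {
    // Hypothetical helper: renders all 64 bits of a long, including leading zeros.
    static String bits(long v) {
        StringBuilder sb = new StringBuilder(64);
        for (int i = 63; i >= 0; i--) {
            sb.append((v >>> i) & 1L);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long posz = 0L;
        long negz = -0L; // unary minus applied to the constant 0

        System.out.println(posz == negz);                  // true
        System.out.println(bits(posz).equals(bits(negz))); // true

        // The two double zeros compare equal with ==, but their bits differ.
        System.out.println(0.0 == -0.0);                   // true
        System.out.println(Double.doubleToLongBits(0.0)
                == Double.doubleToLongBits(-0.0));         // false
    }
}
```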

So why does `double` have it then?

Let's stew a bit on the notion of doubles and IEEE-754 math.

A double takes 64 bits. From basic pure mathematical principles, then, a double is as incapable of representing more than 2^64 different possible values as you are of breaking the speed of light or making 1+1=3.

And yet, a double aims to represent all numbers. There are way more numbers between 0 and 1 than 2^64 options (in fact, an infinite amount of numbers exist between 0 and 1), and that's just 0 to 1.

So, how doubles actually work is different. A few less than 2^64 numbers are chosen from the entire number line. Let's call these the blessed numbers.

The blessed numbers are not equally distributed. The closer you are to 0, the more blessed numbers exist. In other words, the distance between 2 adjacent blessed numbers increases as you move away from 0. For example, if you start at, say, 1e100 (a 1 followed by a hundred zeroes) and want to find the next blessed number, it's quite a ways off. The gap is in fact far larger than 1.0! 1e100+1 is just 1e100 again, because the way double math works is that after every single mathematical operation you do, the end result is rounded to the nearest blessed number.

Let's try it!

double d = 1e100;
System.out.println(d);
System.out.println(d + 1);

// prints: 1.0E100
//         1.0E100
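To see how big the gap actually is, Math.ulp reports the distance from a double to the next double of larger magnitude. A small sketch follows; the class name UlpDemo and the method gapAt are made up for this illustration.

```java
class UlpDemo {
    // Hypothetical helper: distance from x to the next double of larger magnitude.
    static double gapAt(double x) {
        return Math.ulp(x);
    }

    public static void main(String[] args) {
        System.out.println(gapAt(1.0));         // a tiny gap near 1
        System.out.println(gapAt(1e100));       // an enormous gap near 1e100
        System.out.println(gapAt(1e100) > 1.0); // true
        System.out.println(1e100 + 1 == 1e100); // true: the sum rounds back
    }
}
```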

But that means double values don't actually represent a single number! What any given double represents is in fact this concept:

An unknown number whose value lies in [D - 𝛿, D + 𝛿], where D is the blessed number closest to the unknown number, and 𝛿 is half of the distance between D and the nearest blessed number on either side.

Given that usually 𝛿 is incredibly small, this is 'good enough'. But this weirdness does explain why you really, really do not want any business at all with double if accuracy is important (such as with currencies. Don't store those in doubles, ever!)
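A quick sketch of why currencies and doubles don't mix, with java.math.BigDecimal as one common alternative (the answer itself doesn't prescribe a replacement, so treat that choice as an assumption). The class and method names are made up for this illustration.

```java
import java.math.BigDecimal;

class MoneyDemo {
    // 0.1, 0.2 and 0.3 are not blessed numbers, so each is rounded on the way in.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // BigDecimal built from strings keeps the decimal digits exactly.
    static BigDecimal decimalSum() {
        return new BigDecimal("0.10").add(new BigDecimal("0.20"));
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2);          // 0.30000000000000004
        System.out.println(doubleSumIsExact()); // false
        System.out.println(decimalSum());       // 0.30
    }
}
```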

Given that, what does -0.0 represent? Not actually just 0. It represents, specifically: an unknown number whose value lies in [-𝛿, 0], where 0 is real zero (and thus has no sign), and 𝛿 is Double.MIN_VALUE: the smallest non-zero positive number representable with a double.

That's why -0.0 and +0.0 both exist: They are in fact different concepts. Rarely relevant, but sometimes it is. In contrast to e.g. long where 5 just means 5 and not "between 4.5 and 5.5", because longs fundamentally don't recognize that fractional parts exist in the first place. Given that 5 just means 5, then 0 just means 0, and there is no such thing as negative zero in the first place.
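Here is a sketch of a few places where the sign of a double zero is observable, including a negative result too small to represent, which rounds to -0.0. The class name SignedZeroDemo and the method underflow are made up for this illustration.

```java
class SignedZeroDemo {
    // A negative product far below Double.MIN_VALUE in magnitude rounds to -0.0.
    static double underflow() {
        return -1e-200 * 1e-200;
    }

    public static void main(String[] args) {
        double negz = -0.0;

        System.out.println(0.0 == negz);                   // true
        System.out.println(1.0 / 0.0);                     // Infinity
        System.out.println(1.0 / negz);                    // -Infinity
        System.out.println(Double.compare(0.0, negz) > 0); // true
        System.out.println(underflow());                   // -0.0
    }
}
```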

Now we get to 2's complement

2's complement is a cool system. It has two neat properties:

It only has the one zero.
It does not matter if you treat the bit sequence as signed-by-way-of-2s-complement or as unsigned, for the purposes of these operations: addition, subtraction, increment, decrement, zero-check. The modifications you make to the bits to implement those operations are identical.

It DOES matter for greater than, less than, and divide.
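Java exposes the unsigned reading of the same bits through static helpers on Integer, which makes the difference easy to see. A small sketch (the class name SignednessMattersDemo is made up for this illustration):

```java
class SignednessMattersDemo {
    public static void main(String[] args) {
        int a = -1; // all 32 bits set
        int b = 1;

        // Comparison depends on how the bits are read.
        System.out.println(a < b);                          // true  (signed)
        System.out.println(Integer.compareUnsigned(a, b) > 0); // true (unsigned: a is huge)

        // So does division.
        System.out.println(a / 2);                          // 0
        System.out.println(Integer.divideUnsigned(a, 2));   // 2147483647

        // Addition produces the same bits under either reading.
        System.out.println(a + b);                          // 0
    }
}
```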

2's complement works like this: To negate a number, take all bits and flip them (i.e. do a NOT operation on the bits). Then, add 1.

Let's try it!

int x = 5;
int y = -x;
for (int i = 31; i >= 0; i--) {
    System.out.print((x >>> i) & 1);
}
System.out.println();
for (int i = 31; i >= 0; i--) {
    System.out.print((y >>> i) & 1);
}
System.out.println();

// prints 00000000000000000000000000000101
//        11111111111111111111111111111011

As we can see, the 'flip all bits and add 1' algorithm was applied.

2s complement is, of course, reversible: If you do 'flip all bits and add 1' twice in a row you get the same number out.
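The rule is short enough to write by hand and compare against Java's own unary minus. A sketch, with the class name NegateDemo and the method negate made up for this illustration:

```java
class NegateDemo {
    // 'Flip all bits, add 1', written out explicitly.
    static int negate(int x) {
        return ~x + 1;
    }

    public static void main(String[] args) {
        System.out.println(negate(5) == -5);                   // true
        System.out.println(negate(negate(5)) == 5);            // true: reversible
        System.out.println(Integer.toBinaryString(negate(5))); // 11111111111111111111111111111011

        // One edge case: the most negative int has no positive counterpart,
        // so negating it yields the same value again.
        System.out.println(negate(Integer.MIN_VALUE) == Integer.MIN_VALUE); // true
    }
}
```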

Now let's try -0. 0 is 32 0 bits, then flip them all, then add 1:

 00000000000000000000000000000000
 11111111111111111111111111111111 // flip all
100000000000000000000000000000000 // add 1
 00000000000000000000000000000000 // that 1 fell off

and because ints can only store 32 bits, that final '1' falls off of the end. And we're left with zero again.
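One way to watch that 1 fall off is to do the addition in a long, where there is room for the 33rd bit, and then narrow back to int. This is only a sketch; the class name ZeroNegationDemo and the method widened are made up for this illustration.

```java
class ZeroNegationDemo {
    // Flip the 32 bits of zero, then add 1 in a 64-bit long so the carry survives.
    static long widened() {
        long flipped = ~0 & 0xFFFFFFFFL; // 32 one-bits, held in a long
        return flipped + 1;              // a 1 followed by 32 zero bits
    }

    public static void main(String[] args) {
        System.out.println(Long.toBinaryString(widened())); // 100000000000000000000000000000000
        System.out.println((int) widened());                // 0: narrowing drops the top bit
        System.out.println(~0 + 1);                         // 0: same thing in plain int math
    }
}
```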

Now let's go with bytes (a bit smaller) and try to add, say, 200 and 50 together.

11001000 // 200 in binary
00110010 // 50 in binary
-------- +
11111010 // 250 in binary.

Now let's instead go: oh wait, whoops, that was an error, these numbers are actually in 2s complement. That wasn't 200, no no. 11001000 is a bit sequence that actually means -56 (apply the 'flip all bits, add 1' scheme and you get 00111000, which is 56). So the operation was meant to represent '-56 + 50', which is -6. -6 in binary is (write out 6, flip bits, add 1):

00000110
11111001
11111010

hey now, look at that, nothing changed! It's the same result! So, when the computer does x + y, where x and y are numbers, the computer does not care. Whether x is "an unsigned number" or "a signed with 2s complement number", the operation is identical.

That's why 2s complement is applied. It makes math MUCH faster. The CPU doesn't have to futz about with branching out to deal with sign bits.
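The byte example can be reproduced in Java. Java's byte is always read as signed, so Byte.toUnsignedInt is used here to get the unsigned reading of the same 8 bits. The class name ByteAddDemo and the method add are made up for this illustration.

```java
class ByteAddDemo {
    // Adds two bytes and keeps only the low 8 bits of the result.
    static byte add(byte x, byte y) {
        return (byte) (x + y);
    }

    public static void main(String[] args) {
        byte x = (byte) 200; // bits 11001000, which Java reads as -56
        byte y = 50;         // bits 00110010
        byte sum = add(x, y);

        System.out.println(x);                         // -56
        System.out.println(sum);                       // -6  (signed reading)
        System.out.println(Byte.toUnsignedInt(sum));   // 250 (unsigned reading)
        System.out.println(
            Integer.toBinaryString(Byte.toUnsignedInt(sum))); // 11111010
    }
}
```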

In this sense it is more correct to say that in Java, int, long, char, byte and short are neither signed nor unsigned, they just are. At least for the purposes of +, -, ++, and --. The idea that int is signed is fundamentally a property of e.g. System.out.println(int): that method chooses to render the bit sequence 11111111111111111111111111111111 as "-1" instead of as 4294967295.
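The rendering choice is easy to demonstrate, because Integer also offers an unsigned rendering of the same bits. With 32 one-bits, the unsigned reading is 2^32 - 1 = 4294967295. The class name RenderDemo and the method unsignedText are made up for this illustration.

```java
class RenderDemo {
    // Renders the 32 bits of an int as if they were an unsigned number.
    static String unsignedText(int bits) {
        return Integer.toUnsignedString(bits);
    }

    public static void main(String[] args) {
        int allOnes = 0xFFFFFFFF; // 32 one-bits

        System.out.println(allOnes);                         // -1
        System.out.println(unsignedText(allOnes));           // 4294967295
        System.out.println(Integer.toUnsignedLong(allOnes)); // 4294967295
    }
}
```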

Upvotes: 3

Louis Wasserman

Reputation: 198461

long has no such thing as negative zero. Only float and double have a different representation of positive and negative zero.

Upvotes: 1
