rghome

Reputation: 8819

Is Java char signed or unsigned for arithmetic?

Java char is a 16-bit data type, but is it signed or unsigned when arithmetic is performed on it?

Can you use it as an unsigned 16-bit integer in arithmetic?

For example, is the following correct?

char c1;
char c2;

int i = c1 << 16 | c2;

Or is it necessary to strip the sign-extended bits off c2 first?

(I am sure the answer to this is elsewhere, but it doesn't seem to be picked up by obvious searches.)

Upvotes: 5

Views: 3220

Answers (2)

T.J. Crowder

Reputation: 1074048

char is unsigned. From JLS §4.2.1:

For char, from '\u0000' to '\uffff' inclusive, that is, from 0 to 65535

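...but note that when you use any of the various mathematical operations on them (including bitwise operations), they're widened to another type based on the type of the other operand, and that other type may well be signed. (Shift operations promote each operand separately rather than as a pair, but a char operand still becomes int.) For binary operators, the rules are: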

  1. Widening primitive conversion (§5.1.2) is applied to convert either or both operands as specified by the following rules:

    • If either operand is of type double, the other is converted to double.

    • Otherwise, if either operand is of type float, the other is converted to float.

    • Otherwise, if either operand is of type long, the other is converted to long.

    • Otherwise, both operands are converted to type int.

For instance, char + char is int, so:

public class Example {
    public static void main(String[] args) {
        char a = 1;
        char b = 2;

        char c = a + b;          // error: incompatible types: possible lossy conversion from int to char
        System.out.println(c);
    }
}
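As a minimal sketch of how the example above can be made to compile (the class name `CharSum` and the helper `add` are hypothetical, not from the JLS), either keep the result as an int or narrow it back with an explicit cast. Note that the cast wraps modulo 65536:

```java
public class CharSum {
    // Hypothetical helper: the sum is computed as int, then narrowed explicitly.
    public static char add(char a, char b) {
        return (char) (a + b);   // cast required because a + b has type int
    }

    public static void main(String[] args) {
        char a = 1;
        char b = 2;

        int sum = a + b;               // fine: the result type is already int
        char c = add(a, b);            // narrowed back to char

        System.out.println(sum);       // prints 3
        System.out.println((int) c);   // prints 3

        // The narrowing cast keeps only the low 16 bits: 65535 + 1 wraps to 0.
        System.out.println((int) add('\uffff', (char) 1));   // prints 0
    }
}
```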

Re bit-extension, the section on widening primitive conversion (§5.1.2) says:

A widening conversion of a char to an integral type T zero-extends the representation of the char value to fill the wider format.

So char 0xFFFF becomes int 0x0000FFFF, not 0xFFFFFFFF.
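Applied to the expression in the question, this means `c1 << 16 | c2` needs no masking. The sketch below illustrates that, with a short holding the same bit pattern for contrast; the class name `CharPacking` and the method `pack` are made up for the example:

```java
public class CharPacking {
    // Hypothetical helper: hi goes in the upper 16 bits, lo in the lower 16.
    public static int pack(char hi, char lo) {
        return hi << 16 | lo;   // both chars are zero-extended to int first
    }

    public static void main(String[] args) {
        char c = '\uffff';
        short s = (short) 0xFFFF;   // same 16 bits, but short is signed

        System.out.println(Integer.toHexString(c));   // prints ffff
        System.out.println(Integer.toHexString(s));   // prints ffffffff

        System.out.println(Integer.toHexString(pack('\u1234', '\uabcd')));   // prints 1234abcd

        // A short would need its sign-extended bits masked off before packing.
        System.out.println(Integer.toHexString(s & 0xFFFF));   // prints ffff
    }
}
```

The resulting int is itself signed, so packing two chars of 0xFFFF yields -1; the bit pattern is still the intended one.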

Upvotes: 9

Federico klez Culloca

Reputation: 27119

From the specification:

For char, from '\u0000' to '\uffff' inclusive, that is, from 0 to 65535

Since char is 16 bits wide and its range of 0 to 65535 uses all 16 bits for the magnitude, there is no sign bit, so it is unsigned.

Upvotes: 1
