More permanent stuff at http://www.dusbabek.org/~garyd

11 October 2008

Sign extension in Java

I was talking with a programmer recently about a bit-twiddling problem. The conversation brought to mind things I learned a long time ago when I was new to Java.

One of the very unfun things about Java is that it has no unsigned byte type (char is the only unsigned integral type). This means that if the hi-bit of a byte is on, it gets sign-extended when the byte is promoted to an int. This affects 8-bit values that would be in the range 128 to 255 (0x80 to 0xff) if they were read as unsigned.

This leads to some puzzling problems when you're byte munging in Java for the first time. Consider this:


System.out.println(Integer.toBinaryString(128));
> 10000000

System.out.println(Integer.toBinaryString(-128));
> 11111111111111111111111110000000

byte b = (byte) 0x80;  // the cast is required: 0x80 (128) does not fit in a signed byte
System.out.println(Integer.toBinaryString(b));
> 11111111111111111111111110000000

System.out.println(Integer.toBinaryString(0x80));
> 10000000

System.out.println(0x80 == b);
> false

Huh?

At first it seems silly that 0x80 != b when b was explicitly set to 0x80. The problem here is that Java promotes b to an int so it can be compared with 0x80 (which is already an int, even though your mind wants to treat it as 8 bits). The process of promoting b (which is a negative number as far as a signed byte goes) extends the 1 in the hi-bit through the upper 24 bits, so the comparison is really 0x00000080 against 0xffffff80.
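
Here is a minimal sketch that makes the promotion visible by doing it in a separate step. The class name PromotionDemo and the helper widen are made up for illustration; the only thing it relies on is Java's implicit byte-to-int widening.

```java
class PromotionDemo {
    // Hypothetical helper: returns the int Java produces when it widens a byte.
    static int widen(byte b) {
        return b; // implicit widening conversion, which sign-extends
    }

    public static void main(String[] args) {
        byte b = (byte) 0x80;
        int promoted = widen(b);

        System.out.println(promoted);                      // -128
        System.out.println(Integer.toHexString(promoted)); // ffffff80
        System.out.println(promoted == 0x80);              // false
    }
}
```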

Another way to explain this is to say that casting 0x80 to a byte converts it from a positive 4-byte integer to a negative 1-byte integer. Casting 0x80 to a byte solves the problem:

System.out.println((byte)0x80 == b);
> true

The Java libraries get around this problem by passing single bytes around as ints (look at InputStream and OutputStream to get a feeling for this). In that case, it is probably more correct to mask off the sign-extended bits before comparing:

System.out.println(0x80 == (0x000000ff & b));
> true
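
A rough sketch of the same mask wrapped in a helper, next to what the stream classes hand back. UnsignedByteDemo and toUnsigned are names I made up for this example. Byte.toUnsignedInt is a real JDK method, but it only exists from Java 8 onward, so treat that line as an assumption about your JDK version.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class UnsignedByteDemo {
    // Hypothetical helper: interpret a byte as an unsigned value in 0..255.
    static int toUnsigned(byte b) {
        return b & 0xff; // the mask clears the 24 sign-extended bits
    }

    public static void main(String[] args) throws IOException {
        byte b = (byte) 0x80;

        System.out.println(toUnsigned(b));         // 128
        System.out.println(Byte.toUnsignedInt(b)); // 128 (Java 8 and later)

        // InputStream.read() returns each byte as an int in 0..255,
        // and uses -1 to signal end of stream.
        InputStream in = new ByteArrayInputStream(new byte[] { b });
        System.out.println(in.read()); // 128
        System.out.println(in.read()); // -1
    }
}
```

Returning an int is what lets read() use -1 as an end-of-stream marker without colliding with a real 0xff byte.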

One more thing to be aware of is that Java has an unsigned right shift operator, >>>, that always shifts in a zero regardless of sign:

System.out.println(Integer.toBinaryString(b>>1));
> 11111111111111111111111111000000

System.out.println(Integer.toBinaryString(b>>>1));
> 1111111111111111111111111000000

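Be aware that this will byte you (har!) if you assume you're shifting an 8-bit value, though. The byte is promoted to a 32-bit int before the shift, so the zero goes into bit 31 rather than bit 7, and bit 7 of the result is still a sign-extended 1.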
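
To see the bite, here is a small sketch that compares shifting the byte directly with masking it first. ByteShiftDemo, naiveShift and maskedShift are illustrative names, not anything from the JDK.

```java
class ByteShiftDemo {
    // Shifts the sign-extended int, then truncates back to a byte.
    static byte naiveShift(byte b) {
        return (byte) (b >>> 1);
    }

    // Masks to 8 bits first, so the shift sees 0x00000080 instead of 0xffffff80.
    static byte maskedShift(byte b) {
        return (byte) ((b & 0xff) >>> 1);
    }

    public static void main(String[] args) {
        byte b = (byte) 0x80;

        System.out.println(naiveShift(b));  // -64 (0xc0): the hi-bit is still set
        System.out.println(maskedShift(b)); // 64  (0x40): what an 8-bit shift would give
    }
}
```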

P.S. By popular demand (my wife), my next post will be non-technical.

1 comment:

jeremy said...

A bad pun even from Gary! ("this will byte you")