Last post we looked at positional and non-positional representation of numbers. We covered binary, octal, decimal and hexadecimal notation, and how to convert between them.
In this post we’re going to look at the representation of integral numbers in computer memory. Integral numbers are whole numbers without any decimal point or fractional component. Their representation is a lot simpler and easier to understand than that of floating point (fractional) numbers, which we will investigate in the next post.
Integral Numbers
Java is a strongly typed language, and as such we need to know about the different data types we can use. Java uses a split type system where we have primitive (also called fundamental or raw) data types and object reference data types.
In this post, we’ll look at the primitive integral data types in Java, being byte
, short
, int
, long
and char
. As I mentioned earlier, the primitive floating point types float
and double
have radically different internal bit representations and behaviours.
I use the term integral when I’m talking about the entire group of whole numbers. When I use a term like int
, I’m specially talking about the int
data type and nothing else.
Positive Numbers
Integral numbers are represented as binary numbers (0
’s and 1
’s). Each of the primitive integral data types have different sizes in memory, and can therefore support different value ranges.
Type | Bytes | Minimum Value | Maximum Value |
---|---|---|---|
byte |
1 | -128 | 127 |
short |
2 | -32,768 | 32,767 |
int |
4 | -2,147,483,648 | 2,147,483, 647 |
long |
8 | -9,223,372,036,854,775,808L | 9,223,372,036,854,775,807L |
char |
2 | 0 | 65,535 |
It’s important to note that:
-
The ranges and sizes of the integral data types in Java are always the same from one JVM to another. They do not depend on the physical machine on which the code is running. This is different in languages like C when the sizes depend on the machine architecture and compiler used.
-
All the integral data types are signed. Signed numbers can store both positive and negative values.
-
The fundamental character data type in Java is
char
. This represents characters in the original Unicode encoding scheme. It defined characters as fixed-width two byte characters, allowing 65,536 characters. -
Even though
char
is a integral data type, it is an unsigned number. Unsigned types can only store non-negative values. -
A signed integer can get an overflow error (or wrap around to a negative number) if the value exceeds the maximum limit. An unsigned number wraps around to 0 if the value exceeds the maximum limit.
Representation of Positive Numbers
So how do we represent positive numbers? It’s very easy using positional notation (see the previous article). Let’s work with the example of a byte
. It’s to easier to visualise because we’re only working with 8 bits, each having a value of 0 or 1.
For simplicity’s sake, let’s assume all the bits from right to left represent a higher power of two, from 2 to the power of 0, to 2 to the power of 7. The leftmost bit is called the least significant bit (LSB). It represents 2 to the power of zero, which is 1. The following bits represent 2, 4, 8, 16, 32, 64 and 128 respectively. The rightmost bit is called the most significant bit (MSB).
If all the bits are zero, the value will obviously be zero, and if all the bits are one, the value will obviously be 255. This gives us 256 distinct values that we can represent in a byte.
It’s all just simple multiplication and addition. Dead easy! But what about negative numbers?
Representation of Negative Numbers
Let’s take a simplistic approach. Let’s use the MSB to represent the sign – this would be the sign bit. If the sign bit is a zero, the number is positive. If the sign bit is a one, the number is negative.
Sounds okay to start. Let’s look at an example. If the first seven bits (starting from the right) are all ones, and the MSB is zero, the value will be +127
. If the MSB changes to 1 (i.e. all ones), the value will be -127
. All fine so far.
Let’s look at another example. If all the bits are zero, including the sign bit, the value will be +0
. If the MSB changes to a one, the value will be -0
. But that gives us two representations of zero, a positive zero and a negative zero. Not so good!
And if we check how many numbers we can represent, it only gives us 254 distinct values with two representations of zero. Definitely not good!
Two’s Complement
There are a variety of ways to handle negative numbers:
-
Signed magnitude uses the leftmost bit to represent the sign. For a positive number, the sign bit is
0
, and for the negative number the sign bit is1
. That is what we saw earlier. -
One’s complement uses the complement of the positive number. This means that, in binary, the
0
s are changed to1
s and the1
s are changed to0
s. Technically, this is the bitwise NOT of the binary digits. The sign bit will automatically be0
for positive numbers and1
for negative numbers. -
Two’s complement uses the same complement of the positive number, and then
1
is added to the least significant digit. Although this sounds odd, it is actually the method most commonly used in current computing devices.
In two’s complement, there is only one zero, represented as 00000000
. Negating a number (whether negative or positive) is done by inverting all the bits and then adding one to that result.
The following table shows the bit representation of byte
values. The two’s complement column is signed (Java only allows this with raw data types). The unsigned column is there for illustration (allowed in C, C++ and C#).
Binary | Two’s complement | Unsigned value |
---|---|---|
00000000 | 0 | 0 |
00000001 | 1 | 1 |
… | … | … |
01111110 | 126 | 126 |
01111111 | 127 | 127 |
10000000 | −128 | 128 |
10000001 | −127 | 129 |
10000010 | −126 | 130 |
… | … | … |
11111110 | −2 | 254 |
11111111 | −1 | 255 |
You can notice that in the middle of the table, if you add one to the maximum value of a byte
, it “wraps” around to the largest negative number.
You can try it with this snippet of code:
byte b = Byte.MAX_VALUE;
System.out.println(b);
b = (byte)(b+1);
System.out.println(b);
int i = Integer.MAX_VALUE;
System.out.println(i);
i = i+1;
System.out.println(i);
Class Wrappers
As mentioned earlier, Java has a split data type system. It has the fundamental data types we’ve just looked at, and it has a number of class wrappers that we use to encapsulate (wrap) a single field of a fundamental data type in an object. These classes provide methods for converting the wrapped data field to and from a String
, as well as other useful constants and methods for dealing with that particular data type.
The class wrappers are very similarly named, being Byte
, Short
, Integer
, Long
and Character
. Note the uppercase spelling! They represent classes. Please pronounce them as they are spelled. Don’t pronounce int
as integer, or char
as character! You’ll only confuse yourself and anyone you’re explaining your code to.
Constants include BYTES
, SIZE
, TYPE
, MIN_VALUE
and MAX_VALUE
(the values were seen in the first table).
The following example shows how to use these constants:
/**
* A simple program to print the names, sizes and ranges of the
* fundamental data types using the appropriate class wrappers.
* We're using printf() here for nicely formatted output.
*
* We can use \u2502 instead of | (pipe sign) for vertical lines.
*
* Run with -Djava.locale.providers=SPI to get thousands separators.
*
* Note the type cast to int when printing characters. Without it,
* the runtime system will look up the characters associated with the
* positions 0 and 65535, and attempt to print them.
* 0 is the NUL character (https://www.unicode.org/charts/PDF/U0000.pdf)
* 65535 is undefined (https://www.unicode.org/charts/PDF/UFF00.pdf).
*
*/
public class WrapperTest {
public static void main(String args[]) {
final String title = "| %-7s | %4s | %27s | %26s |%n";
final String fmt = "| %-7s | %4d | %,27d |%,27d |%n";
System.out.println();
System.out.printf(title, "TYPE", "SIZE", "MINIMUM", "MAXIMUM");
System.out.printf(fmt, Byte .TYPE, Byte .SIZE,
Byte .MIN_VALUE, Byte .MAX_VALUE );
System.out.printf(fmt, Short .TYPE, Short .SIZE,
Short .MIN_VALUE, Short .MAX_VALUE );
System.out.printf(fmt, Integer.TYPE, Integer.SIZE,
Integer.MIN_VALUE, Integer.MAX_VALUE );
System.out.printf(fmt, Long .TYPE, Long .SIZE,
Long .MIN_VALUE, Long .MAX_VALUE );
System.out.printf(fmt, Character.TYPE, Character.SIZE,
(int)Character.MIN_VALUE,
(int)Character.MAX_VALUE );
}
} // end of class
Further Reading and Signing Off
For more on signed number representations, see the Wikipedia page on the topic.
For more on two’s complement, see the Wikipedia page.
You can type in numbers and see the two’s complement bit representations at https://www.allmath.com/twos-complement.php.
Was this useful? Please share your comments, and as always, stay safe and keep learning!