Endianness: little-endian vs big-endian

I’m going to discuss a concept that we as Java programmers don’t usually deal with on a day-to-day basis, if at all. However, we do need to know about it because the JVM uses it behind the scenes all the time.

This is the concept of endianness. What is it, and how does it affect us?

Little-Endian vs Big-Endian

You may have heard the terms little-endian and big-endian before. These terms describe the ordering of the bytes that make up a multi-byte value when it is stored in memory on a specific CPU architecture. Some platforms use big-endian order internally (e.g., IBM System/390 mainframes and older PowerPC-based Macs); some use little-endian order (e.g., Intel x86).

Anything stored in computer memory can be accessed through the address where it is stored. It feels more natural to store numbers in memory with the least significant byte stored at the lower memory address, and the most significant byte being stored at the higher memory address. This is little-endian ordering.

A big-endian system is the opposite of a little-endian system. A big-endian system reverses the order of the bytes. It stores the most significant byte of a word at the smallest memory address and the least significant byte at the largest.

So we can think of endianness as writing data either “left-to-right” or “right-to-left”.

Big-endian is the dominant ordering in networking protocols, where it is referred to as network byte order: the most significant byte is transmitted first. Little-endian is the dominant ordering for processor architectures and their memory (x86, most ARM implementations, etc.). File formats can use either ordering. Some file formats use a mixture of both, or contain a byte order mark (BOM) to indicate the endianness of the file.
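To make the byte order mark a little more concrete, here is a minimal sketch using UTF-16 text, which is one common place a BOM appears. The class name BomDemo and the helper decode are made up for this illustration. It relies on Java's standard UTF_16 charset, which reads a leading BOM to decide the byte order when decoding.

```java
import java.nio.charset.StandardCharsets;

class BomDemo {

    // Decodes UTF-16 bytes. The UTF_16 charset uses a leading BOM,
    // if present, to choose the byte order, and does not return it as text.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // The letter "A" (U+0041) preceded by a BOM, in both byte orders.
        byte[] bigEndian    = { (byte) 0xFE, (byte) 0xFF, 0x00, 0x41 };
        byte[] littleEndian = { (byte) 0xFF, (byte) 0xFE, 0x41, 0x00 };

        System.out.println(decode(bigEndian));    // A
        System.out.println(decode(littleEndian)); // A
    }
}
```

Both arrays decode to the same text because the first two bytes tell the decoder how to interpret the rest.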

Internally, any specific computer could work equally well no matter what endianness it uses. Its hardware would consistently use the same endianness to both load and store data. This allows us to normally ignore the endianness of the computer we’re working on.

There are other byte orderings. They are generically called middle-endian or mixed-endian. However, I won’t cover them here as they aren’t particularly relevant to the JVM.

Endianness Example

Let’s say that we store a 32-bit (four byte) int on two machines using different endianness. Let’s assign a hexadecimal value of 0x12345678 to this int, as follows:

int value = 0x12345678;  // using hex for ease of understanding

In both cases, the int is split over four bytes with the values of 0x12, 0x34, 0x56, and 0x78. The bytes are stored in four sequential locations in memory, starting with the address a (lowest address), then a+1, a+2, and a+3 (highest address).

The difference between big-endian and little-endian is the order in which the four bytes are stored in memory:

Address         a     a+1   a+2   a+3
Little-endian   78    56    34    12
Big-endian      12    34    56    78

A little-endian machine will store the integer with the least-significant byte (0x78) at address a, and the most-significant byte (0x12) at address a+3. Big-endian does the opposite.
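We can reproduce this layout from Java without needing two machines. The following is only a sketch: the class EndianLayoutDemo and its helpers layout and hex are names invented for this example. It uses a java.nio.ByteBuffer with an explicit byte order to stand in for memory, with index 0 playing the role of address a.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class EndianLayoutDemo {

    // Writes the int into a 4-byte buffer using the given byte order
    // and returns the bytes in increasing index ("address") order.
    static byte[] layout(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Formats bytes as two-digit hex values separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int value = 0x12345678;
        System.out.println("Little-endian: " + hex(layout(value, ByteOrder.LITTLE_ENDIAN))); // 78 56 34 12
        System.out.println("Big-endian:    " + hex(layout(value, ByteOrder.BIG_ENDIAN)));    // 12 34 56 78
    }
}
```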

Endianness in Java

Java binary files are stored in big-endian order (i.e., network order). This applies to class files and to binary data written with standard classes such as DataOutputStream. This means that if we use only Java, these files are formatted the same way on all platforms: Windows, macOS, Linux, etc. We can exchange binary data between Java applications without worrying about endianness. The JVM translates between the Java big-endian form and whatever the native CPU is using.

Chapter 4, “The class File Format”, of the Java Virtual Machine Specification states that “Multibyte data items are always stored in big-endian order, where the high bytes come first.”
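The same big-endian convention shows up when we write binary data ourselves with java.io.DataOutputStream, which writes an int high byte first. Here is a small sketch of that; the class DataStreamDemo and its write helper are hypothetical names for illustration, and the data goes to an in-memory stream rather than a real file.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class DataStreamDemo {

    // Serialises an int with DataOutputStream and returns the raw bytes.
    static byte[] write(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : write(0x12345678)) {
            System.out.printf("%02x ", b);
        }
        System.out.println(); // 12 34 56 78, whatever the native CPU order is
    }
}
```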

A machine can read its own data perfectly. Problems arise when one computer stores data and a different type of computer tries to read it. Any difference in endianness can become an issue when transferring data between two machines. We can run into problems when we exchange data files with a non-Java program that uses little-endian order. Similarly, a memory dump looks different depending on the endianness of the machine that produced it. In these cases, we must be aware of the endianness of the data, and handle it appropriately.
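As a rough illustration of what "handling it appropriately" can look like, the sketch below reads four bytes that a little-endian program might have written for 0x12345678. The class ForeignDataDemo, the helper readInt, and the sample byte array are all assumptions made for this example. Reading with the wrong order gives a scrambled value; telling the ByteBuffer the correct order, or reversing the bytes afterwards with Integer.reverseBytes, recovers the original.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ForeignDataDemo {

    // Reads a 32-bit int from the first four bytes using the given order.
    static int readInt(byte[] data, ByteOrder order) {
        return ByteBuffer.wrap(data).order(order).getInt();
    }

    public static void main(String[] args) {
        // 0x12345678 as a little-endian program would have written it.
        byte[] fromLittleEndianProgram = { 0x78, 0x56, 0x34, 0x12 };

        int wrong = readInt(fromLittleEndianProgram, ByteOrder.BIG_ENDIAN);
        int right = readInt(fromLittleEndianProgram, ByteOrder.LITTLE_ENDIAN);

        System.out.printf("Read as big-endian:    0x%08x%n", wrong); // 0x78563412
        System.out.printf("Read as little-endian: 0x%08x%n", right); // 0x12345678

        // Integer.reverseBytes swaps the byte order of an int we have already read.
        System.out.printf("After reverseBytes:    0x%08x%n", Integer.reverseBytes(wrong)); // 0x12345678
    }
}
```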

Accessing Endianness with Java

We can use the method java.nio.ByteOrder.nativeOrder() to get the endianness used by our specific CPU. Here’s a snippet of code to do it:

import java.nio.ByteOrder;

ByteOrder byteOrder = ByteOrder.nativeOrder();
System.out.println(byteOrder);   // BIG_ENDIAN or LITTLE_ENDIAN

It prints LITTLE_ENDIAN when run on an Intel CPU.

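We can use it with Java NIO ByteBuffers. A newly created ByteBuffer always starts in big-endian order, regardless of the platform, and we can change that with its order() method. If we choose the native hardware ordering for a ByteBuffer, we may get better performance, particularly with direct buffers. Native code libraries are usually more efficient with these buffers.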
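Here is a minimal sketch of setting up such a buffer. The class NativeOrderBufferDemo and the helper nativeBuffer are names made up for this example, and the capacity of 16 bytes is arbitrary. Whether it is actually faster depends on the workload, so treat it as a starting point for measurement rather than a guarantee.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class NativeOrderBufferDemo {

    // Allocates a direct buffer and switches it to the platform's byte order.
    static ByteBuffer nativeBuffer(int capacity) {
        return ByteBuffer.allocateDirect(capacity).order(ByteOrder.nativeOrder());
    }

    public static void main(String[] args) {
        ByteBuffer plain = ByteBuffer.allocateDirect(16);
        System.out.println(plain.order()); // BIG_ENDIAN, the initial order of every new buffer

        ByteBuffer tuned = nativeBuffer(16);
        System.out.println(tuned.order() == ByteOrder.nativeOrder()); // true
    }
}
```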

Summary and Further Reading

To summarise:

  • Java hides the internal endianness from us, and gives us consistent results on all platforms.

  • When we exchange data files between platforms with different endianness, not being aware of, and dealing with, endianness can lead to incompatibility problems.

Just for interest, the terms big-endian and little-endian come from the 1726 book “Gulliver’s Travels”, where two groups of Lilliputians argue over whether to break the shell of a boiled egg at the little end or the big end. People haven’t changed much…

A very easy to read article is here.

A more comprehensive page is here.

Wikipedia has a highly detailed page on endianness.

Was this interesting? Please share your comments on the blog post, and as always, stay safe and keep learning!
